Re: [I] Snowflake Iceberg Partitioned data read issue [iceberg]

via GitHub Wed, 03 Jan 2024 15:51:06 -0800


amogh-jahagirdar commented on issue #9404:
URL: https://github.com/apache/iceberg/issues/9404#issuecomment-1876122199


   I recommend continuing to reach out to Snowflake about any issues you are 
encountering with the Iceberg integration, but the Spark behavior in the 
reported issue does seem really odd to me from an Iceberg perspective.
   
   Ultimately, in Iceberg the source of truth for partitioning is the partition 
spec for the table.  The advantage of decoupling logical partitioning from 
the physical organization of files is that it allows for safely and correctly 
evolving the partitioning as your data/query patterns change. Hive-style 
partitioning in the path is irrelevant for Iceberg in terms of partition 
pruning and other planning-related operations.
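
   To illustrate the point about paths, here is a toy sketch in plain Python. 
It is not Iceberg's API: the `DataFile` class, the `plan_files` helper, and the 
bucket paths are all hypothetical names made up for this example. It only 
models the idea that pruning looks at the partition values recorded in table 
metadata and never parses the file path.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    """Hypothetical stand-in for a data file entry tracked in table metadata."""
    path: str        # physical location; the directory layout is arbitrary
    partition: dict  # partition values recorded in metadata per the partition spec


def plan_files(files, predicate):
    """Keep files whose recorded partition values satisfy the predicate.

    The path is never inspected, so a Hive-style segment in it has no effect.
    """
    return [f for f in files if predicate(f.partition)]


files = [
    DataFile("s3://bucket/tbl/data/00000-a.parquet", {"region": "us"}),
    # The path segment says "eu", but the metadata says "us"; metadata wins.
    DataFile("s3://bucket/tbl/data/region=eu/00001-b.parquet", {"region": "us"}),
    DataFile("s3://bucket/tbl/data/00002-c.parquet", {"region": "eu"}),
]

selected = plan_files(files, lambda p: p["region"] == "us")
for f in selected:
    print(f.path)
```

   In this sketch the file under `region=eu/` is still selected for the 
`region = 'us'` predicate, because only the recorded partition value is 
consulted.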
   
   You mentioned: ```We try to read the same data by using Apace Spark Iceberg 
and it is working ```
   
   When you say "it's working", are you querying with partition predicates and 
seeing pruning of partitions? I highly doubt that would be happening (because 
of the source of truth mentioned in the previous point).  Could you share your 
Spark configs (redacting any data that should be hidden)? 
   
   But as mentioned before, for any vendor-related integrations with Iceberg, I 
recommend reaching out to the vendor. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


