I am working on a Spark project and came across an interesting convention that Spark uses (it was already known from Hive):
https://spark.apache.org/docs/2.3.0/sql-programming-guide.html#partition-discovery

In Spark, if I partition a dataset, the partition columns do not exist in the Parquet schema, and hence not in the final data files; the partition information has to be extracted from the directory path.
This does not work well when I pass a list of files to Spark instead of a directory path.
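For illustration, the path-based convention means partition values can only be recovered by parsing `key=value` directory segments. A minimal sketch of that extraction (not Spark's actual implementation, and the path layout below is hypothetical):

```python
def partition_values(path: str) -> dict:
    """Extract Hive-style key=value partition columns from a file path.

    Sketch only: assumes a layout like
    /data/events/year=2024/month=01/part-0000.parquet
    """
    parts = {}
    for segment in path.split("/"):
        if "=" in segment:
            key, _, value = segment.partition("=")
            parts[key] = value
    return parts

print(partition_values("/data/events/year=2024/month=01/part-0000.parquet"))
# {'year': '2024', 'month': '01'}
```

This also shows why passing bare file names breaks the scheme: if the paths handed to the reader lack the `key=value` segments, the partition values are simply unrecoverable.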

What is the behaviour in Iceberg? Does it store partition columns in the final Parquet file, or is the behaviour the same as in Spark, where partition columns are only part of the metadata and not the actual file?

(P.S. I am aware of Iceberg's metadata implementation, but I need some pointers to find out whether partition columns are stored in the file vs. the metadata.)

--
Thanks
