Thank you for the reply and the great explanation!

On Thu, Apr 18, 2019 at 8:54 AM Ryan Blue <rb...@netflix.com> wrote:
> Iceberg stores all table columns in the underlying data files. It does not
> store derived partition values in the data files. If you're partitioning by
> date(ts), it won't store that date ordinal. If you're partitioning by
> identity(date_col), it will store date_col.
>
> When reading data, values from the manifest are used for identity
> partition data to avoid extra work materializing the same value for every
> row.
>
> On Thu, Apr 18, 2019 at 8:47 AM suds <sudssf2...@gmail.com> wrote:
>
>> I am working on a Spark project and came across an interesting convention
>> that Spark uses (it was already known from Hive).
>>
>> https://spark.apache.org/docs/2.3.0/sql-programming-guide.html#partition-discovery
>>
>> In Spark, if I partition a dataset, the partition columns do not exist in
>> the Parquet schema and hence not in the final data file. Partition
>> information has to be extracted from the path.
>> This does not work well when I pass a list of files to Spark instead of a
>> path.
>>
>> What is the behaviour in Iceberg? Does it store partition columns in the
>> final Parquet file, or is the behaviour the same as in Spark, where
>> partition columns are only part of the metadata and not of the actual file?
>>
>> (P.S. I am aware of the Iceberg metadata implementation, but I need some
>> pointers to find out whether partition columns are stored in the file or
>> in the metadata.)
>>
>> --
>> Thanks
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
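
The rule described in the quoted reply can be illustrated with a toy Python model. This is only a sketch and not Iceberg's API: `write_row`, `day_ordinal`, and the layout of the partition spec are hypothetical names made up for illustration, and the "date" transform is assumed to mean whole days since 1970-01-01.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def day_ordinal(ts):
    """Whole days between 1970-01-01 (UTC) and ts (assumed meaning of date(ts))."""
    return (ts - EPOCH).days


def write_row(row, partition_spec):
    """Toy model of one write (hypothetical helper, not an Iceberg call).

    Returns (file_columns, partition_values):
      file_columns      what ends up in the data file: every table column
      partition_values  what is tracked in metadata for the file
    """
    # All table columns go into the data file, including identity sources.
    file_columns = dict(row)

    partition_values = {}
    for field, (transform, source) in partition_spec.items():
        if transform == "identity":
            # Same value as the source column, which is also in the file.
            partition_values[field] = row[source]
        elif transform == "date":
            # Derived value: kept in metadata only, never added to the file.
            partition_values[field] = day_ordinal(row[source])
        else:
            raise ValueError(f"unsupported transform: {transform}")
    return file_columns, partition_values


row = {
    "id": 1,
    "ts": datetime(2019, 4, 18, 8, 54, tzinfo=timezone.utc),
    "date_col": "2019-04-18",
}
spec = {
    "ts_date": ("date", "ts"),               # derived partition value
    "date_col": ("identity", "date_col"),    # identity partition value
}
file_columns, partition_values = write_row(row, spec)
```

In this model `file_columns` contains `id`, `ts`, and `date_col` but no `ts_date`, while `partition_values` holds both the derived ordinal and the identity value. That matches the reply: a reader given only a list of files still sees `date_col` in each file, but would have to recompute anything derived from `ts`.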