Thank you for the reply and the great explanation!

On Thu, Apr 18, 2019 at 8:54 AM Ryan Blue <rb...@netflix.com> wrote:

> Iceberg stores all table columns in the underlying data files. It does not
> store derived partition values in the data files. If you're partitioning by
> date(ts), it won't store that date ordinal. If you're partitioning by
> identity(date_col), it will store date_col.
>
> When reading data, values from the manifest are used for identity
> partition data to avoid extra work materializing the same value for every
> row.
>
> On Thu, Apr 18, 2019 at 8:47 AM suds <sudssf2...@gmail.com> wrote:
>
>> I am working on a Spark project and came across an interesting convention
>> that Spark uses (it was already known in Hive).
>>
>> https://spark.apache.org/docs/2.3.0/sql-programming-guide.html#partition-discovery
>>
>> In Spark, if I partition a dataset, the partition columns do not exist in
>> the Parquet schema and hence not in the final data file. Partition
>> information has to be extracted from the path.
>> This does not work well when I pass a list of files to Spark instead of a
>> path.
>>
>> What is the behaviour in Iceberg? Does it store partition columns in the
>> final Parquet file, or is the behaviour the same as Spark, where partition
>> columns are only part of the metadata and not the actual file?
>>
>> (P.S. I am aware of the Iceberg metadata implementation, but I need some
>> pointers to find out whether partition columns are stored in the file or
>> in the metadata.)
>>
>> --
>> Thanks
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
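
For anyone finding this thread later, here is a toy sketch in plain Python of the distinction Ryan describes above. It does not use the Iceberg API: write_file, read_identity_column, and day_ordinal are hypothetical names, and the dicts and lists only stand in for a Parquet data file and a manifest entry.

```python
from datetime import date, datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def day_ordinal(ts):
    """Hypothetical stand-in for date(ts): whole days since 1970-01-01 UTC."""
    return (ts - EPOCH).days


def identity(value):
    """Hypothetical stand-in for identity(col): the value itself."""
    return value


def write_file(rows, source_column, transform):
    """Model writing one data file whose rows all share a partition value.

    Returns (data_file_rows, manifest_partition_value).
    """
    values = {transform(row[source_column]) for row in rows}
    if len(values) != 1:
        raise ValueError("rows in one data file must share one partition value")
    # Every table column goes into the data file unchanged. The derived
    # partition value is NOT added as an extra column.
    data_file_rows = [dict(row) for row in rows]
    # The partition value is recorded once, in the manifest entry.
    return data_file_rows, values.pop()


def read_identity_column(data_file_rows, column, manifest_value):
    """Model a reader filling an identity-partitioned column from the
    manifest value instead of materializing it from the file for every row."""
    return [{**row, column: manifest_value} for row in data_file_rows]


ts = datetime(2019, 4, 18, 8, 54, tzinfo=timezone.utc)
rows = [
    {"id": 1, "ts": ts, "date_col": date(2019, 4, 18)},
    {"id": 2, "ts": ts, "date_col": date(2019, 4, 18)},
]

# Partition by date(ts): the file keeps ts, the ordinal lives only in the manifest.
by_ts_rows, by_ts_partition = write_file(rows, "ts", day_ordinal)

# Partition by identity(date_col): date_col is still stored in the file.
by_date_rows, by_date_partition = write_file(rows, "date_col", identity)
read_back = read_identity_column(by_date_rows, "date_col", by_date_partition)
```

In both cases the data file holds exactly the table columns (id, ts, date_col), so a file handed to a reader on its own still carries date_col, unlike the path-based layout described in the original question.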
