Re: DataFrame API: how to partition by a "virtual" column, or by a nested column?

2016-10-13 Thread Samy Dindane
This partially answers the question: http://stackoverflow.com/a/35449563/604041

On 10/04/2016 03:10 PM, Samy Dindane wrote:

Hi,

I have the following schema:

-root
 |-timestamp
 |-date
   |-year
   |-month
   |-day
 |-some_column
 |-some_other_column

I'd like to achieve either of these:

1) Us

DataFrame API: how to partition by a "virtual" column, or by a nested column?

2016-10-04 Thread Samy Dindane
Hi,

I have the following schema:

-root
 |-timestamp
 |-date
   |-year
   |-month
   |-day
 |-some_column
 |-some_other_column

I'd like to achieve either of these:

1) Use the timestamp field to partition by year, month and day. This looks weird though, as Spark wouldn't magically know how to lo
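
For readers of this thread, here is a minimal plain-Python sketch of the idea behind option 1: derive year, month and day from the timestamp field and use them as partition keys. It is not Spark code and not taken from the thread. It assumes the timestamp is epoch seconds interpreted in UTC, and the `partition_path` helper and the sample rows are hypothetical. In Spark itself, the analogous step would probably be to add such columns with `year`, `month` and `dayofmonth` from `pyspark.sql.functions` and then pass them to `partitionBy` on the writer, which lays files out in `column=value` directories like the ones built here.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_path(ts_seconds):
    """Build a Hive-style partition directory from an epoch timestamp.

    Assumption: the timestamp is in seconds and is interpreted in UTC.
    """
    dt = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return f"year={dt.year}/month={dt.month}/day={dt.day}"


# Hypothetical sample rows that mimic the flat part of the schema.
rows = [
    {"timestamp": 1475593800, "some_column": "a"},  # 2016-10-04 15:10:00 UTC
    {"timestamp": 1476316800, "some_column": "b"},  # 2016-10-13 00:00:00 UTC
]

# Group rows by the "virtual" partition derived from the timestamp.
partitions = defaultdict(list)
for row in rows:
    partitions[partition_path(row["timestamp"])].append(row)

for path in sorted(partitions):
    print(path, len(partitions[path]))
```

The derived values exist only as partition keys here, which is the sense in which the columns are "virtual": they are computed from timestamp rather than stored alongside it.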