Hello Saurabh,

> What config options should we set,
> - if we are always going to read old data written from Spark 2.4 using
>   Spark 3.0

You should set *spark.sql.legacy.parquet.datetimeRebaseModeInRead* to
*LEGACY* when you read old data.
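
As a rough illustration, assuming the option is applied cluster-wide through
spark-defaults.conf (that placement is my assumption; it can also be passed
with --conf on spark-submit or set on the session), the entry would look
something like this:

```
# spark-defaults.conf (sketch; only the read-side option discussed above)
spark.sql.legacy.parquet.datetimeRebaseModeInRead  LEGACY
```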

You see this exception because Spark 3.0 cannot determine which writer
produced the Parquet files or which calendar was used when they were saved.
Starting with version 2.4.6, Spark saves metadata to Parquet files, so
Spark 3.0 can infer the mode automatically for such files.
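
To see why the calendar matters, here is a small standalone Python sketch (not
Spark code). It assumes that an old date was labelled in the Julian calendar,
which is what the hybrid calendar of Spark 2.4 uses for dates before October
1582, and shows which date the same physical day gets in the Proleptic
Gregorian calendar. The function names are made up for this illustration.

```python
from datetime import date  # Python's date type is proleptic Gregorian


def julian_calendar_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date expressed in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


# Constant difference between a Julian Day Number and date.toordinal()
JDN_MINUS_ORDINAL = 1721425


def julian_label_as_proleptic_gregorian(year: int, month: int, day: int) -> date:
    """Same physical day, relabelled in the proleptic Gregorian calendar."""
    jdn = julian_calendar_to_jdn(year, month, day)
    return date.fromordinal(jdn - JDN_MINUS_ORDINAL)


# A date as a Julian-calendar writer would have labelled it (hypothetical value)
legacy_label = (1500, 1, 1)
reinterpreted = julian_label_as_proleptic_gregorian(*legacy_label)
shift_days = (reinterpreted - date(*legacy_label)).days

print(reinterpreted, shift_days)  # 1500-01-10 9
```

So a day stored as 1500-01-01 under the old calendar would show up as
1500-01-10 if the stored value were read as-is under the new one. As far as I
understand it, the LEGACY mode rebases such values on read so that the original
label is preserved.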

Maxim Gekk

Software Engineer

Databricks, Inc.


On Thu, Nov 19, 2020 at 8:10 PM Saurabh Gulati
<saurabh.gul...@fedex.com.invalid> wrote:

> Hello,
> First of all, Thanks to you guys for maintaining and improving Spark.
>
> We just updated to Spark 3.0.1 and are facing some issues with the new
> Proleptic Gregorian calendar.
>
> We have data from different sources in our platform and we saw there were
> some *date/timestamp* columns that go back to years before 1500.
>
> According to this
> <https://www.waitingforcode.com/apache-spark-sql/whats-new-apache-spark-3-proleptic-calendar-date-time-management/read>
> post, data written with spark 2.4 and read with 3.0 should result in some
> difference in *dates/timestamps* but we are not able to replicate this
> issue. We only encounter an exception that suggests setting the
> *spark.sql.legacy.parquet.datetimeRebaseModeInRead/Write* config options
> to make it work.
>
> So, our main concern is:
>
>    - How can we test/replicate this behavior? Since it's not very clear
>    to us, and we don't see any docs for this change, we can't decide with
>    certainty which parameters to set and why.
>    - What config options should we set,
>       - if we are always going to read old data written from Spark 2.4
>       using Spark 3.0
>       - if we will always be writing newer data with Spark 3.0.
>
> We couldn't make a deterministic/informed choice so it's a better idea to
> ask the community what scenarios will be impacted and what will still work
> fine.
>
> Thanks
> Saurabh
>
>
>
