[ https://issues.apache.org/jira/browse/SPARK-28874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-28874:
---------------------------------
    Description: 
PySpark date_format adds one year to dates in the last days of the year:

Example:
{code:python}
from pyspark.sql.functions import date_format, lit

spark.range(1).select(date_format(lit("2010-12-26"), "YYYY-MM-dd")).show()
{code}
{code}
+-----------------------------------+
|date_format(2010-12-26, YYYY-MM-dd)|
+-----------------------------------+
|                         2011-12-26|
+-----------------------------------+
{code}

  was:
PySpark date_format adds one year to dates in the last days of the year:

Example:
{code:python}
from datetime import datetime
from dateutil.relativedelta import relativedelta
import pandas as pd
from pyspark.sql.functions import date_format, col
from pyspark.sql.types import *

start_date = datetime(2010,1,1)
end_date = datetime(2055,1,1)
indx_ts = pd.date_range(start_date.strftime('%m/%d/%Y'), end_date.strftime('%m/%d/%Y'), freq='D')
data_date = [ {"d":datetime.utcfromtimestamp(x.tolist()/1e9)} for x in indx_ts.values ]
df_p = spark.createDataFrame(data_date,StructType([StructField('d', DateType(), True)]))
df_string = df_p.withColumn("date_string" ,date_format(col("d"), "YYYY-MM-dd"))
df_string.filter("d!=date_string").show(1000)
{code}
{code}
+----------+-----------+
|         d|date_string|
+----------+-----------+
|2010-12-26| 2011-12-26|
|2010-12-27| 2011-12-27|
|2010-12-28| 2011-12-28|
|2010-12-29| 2011-12-29|
|2010-12-30| 2011-12-30|
|2010-12-31| 2011-12-31|
|2012-12-30| 2013-12-30|
|2012-12-31| 2013-12-31|
|2013-12-29| 2014-12-29|
|2013-12-30| 2014-12-30|
|2013-12-31| 2014-12-31|
|2014-12-28| 2015-12-28|
|2014-12-29| 2015-12-29|
|2014-12-30| 2015-12-30|
|2014-12-31| 2015-12-31|
|2015-12-27| 2016-12-27|
|2015-12-28| 2016-12-28|
|2015-12-29| 2016-12-29|
|2015-12-30| 2016-12-30|
|2015-12-31| 2016-12-31|
|2017-12-31| 2018-12-31|
|2018-12-30| 2019-12-30|
|2018-12-31| 2019-12-31|
|2019-12-29| 2020-12-29|
|2019-12-30| 2020-12-30|
|2019-12-31| 2020-12-31|
|2020-12-27| 2021-12-27|
|2020-12-28| 2021-12-28|
|2020-12-29| 2021-12-29|
|2020-12-30| 2021-12-30|
|2020-12-31| 2021-12-31|
|2021-12-26| 2022-12-26|
|2021-12-27| 2022-12-27|
|2021-12-28| 2022-12-28|
|2021-12-29| 2022-12-29|
|2021-12-30| 2022-12-30|
|2021-12-31| 2022-12-31|
|2023-12-31| 2024-12-31|
|2024-12-29| 2025-12-29|
|2024-12-30| 2025-12-30|
|2024-12-31| 2025-12-31|
|2025-12-28| 2026-12-28|
|2025-12-29| 2026-12-29|
|2025-12-30| 2026-12-30|
|2025-12-31| 2026-12-31|
|2026-12-27| 2027-12-27|
|2026-12-28| 2027-12-28|
|2026-12-29| 2027-12-29|
|2026-12-30| 2027-12-30|
|2026-12-31| 2027-12-31|
|2027-12-26| 2028-12-26|
|2027-12-27| 2028-12-27|
|2027-12-28| 2028-12-28|
|2027-12-29| 2028-12-29|
|2027-12-30| 2028-12-30|
|2027-12-31| 2028-12-31|
|2028-12-31| 2029-12-31|
|2029-12-30| 2030-12-30|
|2029-12-31| 2030-12-31|
|2030-12-29| 2031-12-29|
|2030-12-30| 2031-12-30|
|2030-12-31| 2031-12-31|
|2031-12-28| 2032-12-28|
|2031-12-29| 2032-12-29|
|2031-12-30| 2032-12-30|
|2031-12-31| 2032-12-31|
|2032-12-26| 2033-12-26|
|2032-12-27| 2033-12-27|
|2032-12-28| 2033-12-28|
|2032-12-29| 2033-12-29|
|2032-12-30| 2033-12-30|
|2032-12-31| 2033-12-31|
|2034-12-31| 2035-12-31|
|2035-12-30| 2036-12-30|
|2035-12-31| 2036-12-31|
|2036-12-28| 2037-12-28|
|2036-12-29| 2037-12-29|
|2036-12-30| 2037-12-30|
|2036-12-31| 2037-12-31|
|2037-12-27| 2038-12-27|
|2037-12-28| 2038-12-28|
|2037-12-29| 2038-12-29|
|2037-12-30| 2038-12-30|
|2037-12-31| 2038-12-31|
|2038-12-26| 2039-12-26|
|2038-12-27| 2039-12-27|
|2038-12-28| 2039-12-28|
|2038-12-29| 2039-12-29|
|2038-12-30| 2039-12-30|
|2038-12-31| 2039-12-31|
|2040-12-30| 2041-12-30|
|2040-12-31| 2041-12-31|
|2041-12-29| 2042-12-29|
|2041-12-30| 2042-12-30|
|2041-12-31| 2042-12-31|
|2042-12-28| 2043-12-28|
|2042-12-29| 2043-12-29|
|2042-12-30| 2043-12-30|
|2042-12-31| 2043-12-31|
|2043-12-27| 2044-12-27|
|2043-12-28| 2044-12-28|
|2043-12-29| 2044-12-29|
|2043-12-30| 2044-12-30|
|2043-12-31| 2044-12-31|
|2045-12-31| 2046-12-31|
|2046-12-30| 2047-12-30|
|2046-12-31| 2047-12-31|
|2047-12-29| 2048-12-29|
|2047-12-30| 2048-12-30|
|2047-12-31| 2048-12-31|
|2048-12-27| 2049-12-27|
|2048-12-28| 2049-12-28|
|2048-12-29| 2049-12-29|
|2048-12-30| 2049-12-30|
|2048-12-31| 2049-12-31|
|2049-12-26| 2050-12-26|
|2049-12-27| 2050-12-27|
|2049-12-28| 2050-12-28|
|2049-12-29| 2050-12-29|
|2049-12-30| 2050-12-30|
|2049-12-31| 2050-12-31|
|2051-12-31| 2052-12-31|
|2052-12-29| 2053-12-29|
|2052-12-30| 2053-12-30|
|2052-12-31| 2053-12-31|
|2053-12-28| 2054-12-28|
|2053-12-29| 2054-12-29|
|2053-12-30| 2054-12-30|
|2053-12-31| 2054-12-31|
|2054-12-27| 2055-12-27|
|2054-12-28| 2055-12-28|
|2054-12-29| 2055-12-29|
|2054-12-30| 2055-12-30|
|2054-12-31| 2055-12-31|
+----------+-----------+
{code}


> Pyspark bug in date_format
> --------------------------
>
>                 Key: SPARK-28874
>                 URL: https://issues.apache.org/jira/browse/SPARK-28874
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.1.0, 2.3.0
>            Reporter: Luis
>            Priority: Major
>
> PySpark date_format adds one year to dates in the last days of the year:
> Example:
> {code:python}
> from pyspark.sql.functions import date_format, lit
>
> spark.range(1).select(date_format(lit("2010-12-26"), "YYYY-MM-dd")).show()
> {code}
> {code}
> +-----------------------------------+
> |date_format(2010-12-26, YYYY-MM-dd)|
> +-----------------------------------+
> |                         2011-12-26|
> +-----------------------------------+
> {code}
>



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
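
A possible explanation for the output reported in this issue, offered as an assumption rather than something the report itself states: in the java.text.SimpleDateFormat pattern syntax that Spark 2.x date_format follows, "YYYY" means the week-based year, while "yyyy" means the calendar year. The sketch below is plain Python with no Spark dependency. It assumes a US-style week rule (weeks start on Sunday, and the week containing January 1 counts as week 1 of the new year); the helper name week_year_us is hypothetical.

```python
from datetime import date, timedelta


def week_year_us(d: date) -> int:
    """Week-based year, assuming Sunday-start weeks where the week
    containing January 1 is week 1 of that year (an assumed US-style rule)."""
    # date.weekday(): Monday=0 ... Saturday=5, Sunday=6.
    # The week-based year is the calendar year of the Saturday ending d's week.
    days_to_saturday = (5 - d.weekday()) % 7
    return (d + timedelta(days=days_to_saturday)).year


for d in (date(2010, 12, 25), date(2010, 12, 26), date(2011, 12, 31)):
    print(d, "calendar year:", d.year, "week-based year:", week_year_us(d))
```

Under that assumption, 2010-12-26 falls in the same week as 2011-01-01 and so gets week-based year 2011, which matches the reported output. If this is the cause, using "yyyy-MM-dd" as the pattern should give the calendar year.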