[ 
https://issues.apache.org/jira/browse/SPARK-38571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Appel updated SPARK-38571:
--------------------------------
    Description: 
In Spark 2 we could use the date_format function with either the W or the F 
pattern letter to compute the week of month from a date.  These compute two 
different things: W yields values from 1 to 6, while F yields values from 1 to 5.

Sample code and expected output:
from datetime import date

from pyspark.sql import functions as F

# Assumes an existing `spark` session (e.g. the PySpark shell).
df1 = spark.createDataFrame(
    [
        (1, date(2014, 3, 7)),
        (2, date(2014, 3, 8)),
        (3, date(2014, 3, 30)),
        (4, date(2014, 3, 31)),
        (5, date(2015, 3, 7)),
        (6, date(2015, 3, 8)),
        (7, date(2015, 3, 30)),
        (8, date(2015, 3, 31)),
    ],
    schema="a long, b date",
)
df1 = df1.withColumn("WEEKOFMONTH1-6", F.date_format(F.col("b"), "W"))
df1 = df1.withColumn("WEEKOFMONTH1-5", F.date_format(F.col("b"), "F"))
df1.show()



+---+----------+--------------+--------------+
|  a|         b|WEEKOFMONTH1-6|WEEKOFMONTH1-5|
+---+----------+--------------+--------------+
|  1|2014-03-07|             2|             1|
|  2|2014-03-08|             2|             2|
|  3|2014-03-30|             6|             5|
|  4|2014-03-31|             6|             5|
|  5|2015-03-07|             1|             1|
|  6|2015-03-08|             2|             2|
|  7|2015-03-30|             5|             5|
|  8|2015-03-31|             5|             5|
+---+----------+--------------+--------------+

 

With spark.sql.legacy.timeParserPolicy set to EXCEPTION by default, Spark 3 
throws an error for this code:
Caused by: java.lang.IllegalArgumentException: All week-based patterns are 
unsupported since Spark 3.0, detected: W, Please use the SQL function EXTRACT 
instead
 
However, the EXTRACT function offers nothing that extracts the week of month 
with values from 1 to 6.
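
For example, the closest field EXTRACT appears to offer is WEEK, which returns 
the week of the year (ISO numbering), not the week of the month. A quick sketch 
to illustrate, reusing one of the dates above (the expected value assumes ISO 
week-of-year numbering):

```python
# extract(WEEK ...) gives the week of the year, not the week of the month:
spark.sql("SELECT extract(WEEK FROM DATE'2014-03-07') AS w").show()
# expected: w = 10 (ISO week of year), whereas the week of month is 2
```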
 
The Spark 3 documentation mentions that Spark defines its own datetime 
patterns, listed at 
[https://spark.apache.org/docs/3.2.1/sql-ref-datetime-pattern.html], which are 
implemented via DateTimeFormatter under the hood: 
[https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html]
 
The DateTimeFormatter page lists both W and F for week of month:
 W   week-of-month   number   4
 F   week-of-month   number   3
 
However, only F is implemented in the Spark datetime pattern reference.
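
For instance, reusing df1 from above, F still formats under the default policy 
while W raises the exception quoted earlier (a sketch of the observed behavior):

```python
# "F" (values 1-5) works under the default EXCEPTION policy:
df1.withColumn("WEEKOFMONTH1-5", F.date_format(F.col("b"), "F")).show()

# "W" (values 1-6) raises java.lang.IllegalArgumentException, as quoted above:
# df1.withColumn("WEEKOFMONTH1-6", F.date_format(F.col("b"), "W")).show()
```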
 
Is there another way to compute the week of month with values from 1 to 6 
using only the Spark 3 built-ins?  Currently we have to set 
spark.sql.legacy.timeParserPolicy to LEGACY in order to run this.  One possible 
workaround is sketched below.
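
The sketch rebuilds the 1-to-6 value from built-ins that still exist in Spark 3. 
It is an assumption, not an official API: it presumes weeks start on Sunday, 
which matches the legacy output shown above; Monday-start locales would need a 
different offset.

```python
from pyspark.sql import functions as F

# Day of week of the first day of b's month (1 = Sunday ... 7 = Saturday).
first_dow = F.dayofweek(F.trunc(F.col("b"), "month"))
# Sunday-start week of month (1-6): shift the day of month by the offset of
# the month's first day, then count whole weeks.
week_of_month = F.floor((F.dayofmonth(F.col("b")) + first_dow - 2) / 7) + 1
df1 = df1.withColumn("WEEKOFMONTH1-6", week_of_month)
```

On the sample dates this reproduces the legacy W column (for example, 6 for 
2014-03-30 and 5 for 2015-03-30).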
 
Thank you,
 
Kevin


> Week of month from a date is missing in spark3 for return values of 1 to 6
> --------------------------------------------------------------------------
>
>                 Key: SPARK-38571
>                 URL: https://issues.apache.org/jira/browse/SPARK-38571
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.2
>            Reporter: Kevin Appel
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
