[ 
https://issues.apache.org/jira/browse/SPARK-39107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555111#comment-17555111
 ] 

Thomas Graves commented on SPARK-39107:
---------------------------------------

[~srowen] I think this actually went into 3.1.4, not 3.1.3. Could you 
confirm before I update the Fix Version/s field? 

> Silent change in regexp_replace's handling of empty strings
> -----------------------------------------------------------
>
>                 Key: SPARK-39107
>                 URL: https://issues.apache.org/jira/browse/SPARK-39107
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Willi Raschkowski
>            Assignee: Lorenzo Martini
>            Priority: Major
>              Labels: correctness, release-notes
>             Fix For: 3.1.3, 3.3.0, 3.2.2
>
>
> Hi, we just upgraded from 3.0.2 to 3.1.2 and noticed a silent behavior change 
> that a) seems incorrect, and b) is undocumented in the [migration 
> guide|https://spark.apache.org/docs/latest/sql-migration-guide.html]:
> {code:title=3.0.2}
> scala> val df = spark.sql("SELECT '' AS col")
> df: org.apache.spark.sql.DataFrame = [col: string]
> scala> df.withColumn("replaced", regexp_replace(col("col"), "^$", "<empty>")).show
> +---+--------+
> |col|replaced|
> +---+--------+
> |   | <empty>|
> +---+--------+
> {code}
> {code:title=3.1.2}
> scala> val df = spark.sql("SELECT '' AS col")
> df: org.apache.spark.sql.DataFrame = [col: string]
> scala> df.withColumn("replaced", regexp_replace(col("col"), "^$", "<empty>")).show
> +---+--------+
> |col|replaced|
> +---+--------+
> |   |        |
> +---+--------+
> {code}
> Note that the regular expression {{^$}} should match the empty string, but it 
> does not in version 3.1. For comparison, this is the Java behavior:
> {code}
> scala> "".replaceAll("^$", "<empty>");
> res1: String = <empty>
> {code}
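
The Java behavior quoted above can be checked outside Spark with plain java.util.regex. The following is a minimal sketch that assumes only the JDK; the names EmptyMatchCheck and markEmpty are hypothetical and are not part of Spark or of the fix for this issue.

```scala
import java.util.regex.Pattern

// Plain JVM regex, no Spark involved: shows what "^$" does on an empty input.
// EmptyMatchCheck and markEmpty are illustrative names, not Spark APIs.
object EmptyMatchCheck {
  private val emptyLine: Pattern = Pattern.compile("^$")

  // Replace every match of "^$" with a marker, as String.replaceAll would.
  def markEmpty(s: String): String =
    emptyLine.matcher(s).replaceAll("<empty>")

  def main(args: Array[String]): Unit = {
    println(markEmpty(""))    // the empty string matches "^$"
    println(markEmpty("abc")) // a non-empty string is left unchanged
  }
}
```

If the 3.0.2 output in the report is taken as the expected one, regexp_replace should agree with markEmpty on the empty string.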



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

