[ https://issues.apache.org/jira/browse/SPARK-39107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555111#comment-17555111 ]
Thomas Graves commented on SPARK-39107:
---------------------------------------

[~srowen] I think this actually went into 3.1.4, not 3.1.3. Could you confirm before I update the Fixed versions?

> Silent change in regexp_replace's handling of empty strings
> -----------------------------------------------------------
>
>                 Key: SPARK-39107
>                 URL: https://issues.apache.org/jira/browse/SPARK-39107
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Willi Raschkowski
>            Assignee: Lorenzo Martini
>            Priority: Major
>              Labels: correctness, release-notes
>             Fix For: 3.1.3, 3.3.0, 3.2.2
>
> Hi, we just upgraded from 3.0.2 to 3.1.2 and noticed a silent behavior change
> that a) seems incorrect, and b) is undocumented in the [migration
> guide|https://spark.apache.org/docs/latest/sql-migration-guide.html]:
> {code:title=3.0.2}
> scala> val df = spark.sql("SELECT '' AS col")
> df: org.apache.spark.sql.DataFrame = [col: string]
>
> scala> df.withColumn("replaced", regexp_replace(col("col"), "^$", "<empty>")).show
> +---+--------+
> |col|replaced|
> +---+--------+
> |   | <empty>|
> +---+--------+
> {code}
> {code:title=3.1.2}
> scala> val df = spark.sql("SELECT '' AS col")
> df: org.apache.spark.sql.DataFrame = [col: string]
>
> scala> df.withColumn("replaced", regexp_replace(col("col"), "^$", "<empty>")).show
> +---+--------+
> |col|replaced|
> +---+--------+
> |   |        |
> +---+--------+
> {code}
> Note that the regular expression {{^$}} should match the empty string, but
> doesn't in version 3.1. E.g., this is the Java behavior:
> {code}
> scala> "".replaceAll("^$", "<empty>");
> res1: String = <empty>
> {code}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
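For context, the Java regex behavior the description cites at the end can be reproduced outside Spark. A minimal standalone Scala sketch (plain `java.util.regex` via `String.replaceAll`, not Spark's own `regexp_replace` implementation), showing that `^$` does match the empty string under the JDK engine:

```scala
object EmptyMatchDemo {
  def main(args: Array[String]): Unit = {
    // Java's regex engine matches ^$ against the empty string once,
    // so the replacement is applied -- the behavior Spark 3.0.2 showed.
    val empty = "".replaceAll("^$", "<empty>")
    println(empty) // prints "<empty>"

    // A non-empty string has no position where ^$ matches,
    // so it is returned unchanged.
    val nonEmpty = "abc".replaceAll("^$", "<empty>")
    println(nonEmpty) // prints "abc"
  }
}
```

This is why the 3.1.x output above (empty string left unreplaced) diverges from what the underlying JDK regex semantics would give, which is the crux of the correctness report.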