[ https://issues.apache.org/jira/browse/SPARK-22436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237446#comment-16237446 ]
Andreas Maier commented on SPARK-22436:
---------------------------------------

Python UDFs are very slow, aren't they? I believe a native Spark function would be much faster. In fact, this behavior was already available through trim() before SPARK-17299.

> New function strip() to remove all whitespace from string
> ---------------------------------------------------------
>
>                 Key: SPARK-22436
>                 URL: https://issues.apache.org/jira/browse/SPARK-22436
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Andreas Maier
>            Priority: Minor
>              Labels: features
>
> Since ticket SPARK-17299, the [trim()
> function|https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.functions.trim]
> no longer removes every whitespace character from the beginning and end of a
> string, only spaces. This is correct with regard to the SQL standard, but it
> opens a gap in functionality.
> My suggestion is to add to the Spark API, in analogy to Python's standard
> library, the functions lstrip(), rstrip() and strip(), which would remove all
> whitespace characters from the beginning, the end, or both ends of a string,
> respectively.
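To make the gap concrete, here is a minimal pure-Python sketch of the two behaviors discussed in the issue. It is not Spark code: the helper names trim_spaces and strip_whitespace are hypothetical and only illustrate the difference between removing spaces and removing all whitespace.

```python
import re

# Hypothetical helpers for illustration only; neither is part of the Spark API.

def trim_spaces(s: str) -> str:
    """Remove only the space character (U+0020) from both ends,
    which is how the issue describes trim() since SPARK-17299."""
    return s.strip(" ")

def strip_whitespace(s: str) -> str:
    """Remove every leading and trailing whitespace character
    (spaces, tabs, newlines, ...), as the proposed strip() would."""
    return re.sub(r"^\s+|\s+$", "", s)

sample = "\t  hello world \n"
print(repr(trim_spaces(sample)))       # tab and newline block the space-only trim
print(repr(strip_whitespace(sample)))  # all surrounding whitespace is removed
```

The same regular expression could presumably be passed to the native pyspark.sql.functions.regexp_replace as a workaround that avoids a Python UDF. That is an assumption on my side and has not been checked against a cluster.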