[jira] [Comment Edited] (SPARK-23291) SparkR : substr : In SparkR dataframe , starting and ending position arguments in "substr" is giving wrong result when the position is greater than 1

Felix Cheung (JIRA) Sun, 06 May 2018 15:36:35 -0700

    [ https://issues.apache.org/jira/browse/SPARK-23291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465307#comment-16465307 ]


Felix Cheung edited comment on SPARK-23291 at 5/6/18 10:35 PM:
---------------------------------------------------------------

actually, I'm not sure we should backport this to an x.x.1 release.

yes, the behavior "was unexpected" but it has been around for the last 3 years, 
if I recall, since the very beginning.

either users don't care since it has never been reported, or (most likely) 
users have adapted to the behavior, in which case we will break existing jobs in 
a patch release.

anyway, it's just my 2c.


was (Author: felixcheung):
actually, I'm not sure we should backport this to an x.x.1 release.

yes, the behavior "was unexpected" but it has been around for the last 3 years, 
if I recall.

either users don't care since it has never been reported, or users have adapted 
to the behavior, in which case we will break existing jobs in a patch release.

anyway, it's just my 2c.

> SparkR : substr : In SparkR dataframe , starting and ending position 
> arguments in "substr" is giving wrong result  when the position is greater 
> than 1
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23291
>                 URL: https://issues.apache.org/jira/browse/SPARK-23291
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 2.1.2, 2.2.0, 2.2.1, 2.3.0
>            Reporter: Narendra
>            Assignee: Liang-Chi Hsieh
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Defect Description :
> -----------------------------
> For example, an input string "2017-12-01" is read into a SparkR dataframe 
> "df" with column name "col1".
>  The target is to create a new column named "col2" with the value "12", 
> which is inside the string. "12" can be extracted with "starting position" 
> "6" and "ending position" "7"
>  (the position of the first character is considered to be "1").
> But the code that currently needs to be written is:
>  
>  df <- withColumn(df, "col2", substr(df$col1, 7, 8))
> Observe that the first argument to the "substr" API, which indicates the 
> 'starting position', is given as "7".
>  Also observe that the second argument to the "substr" API, which indicates 
> the 'ending position', is given as "8",
> i.e. the number that has to be passed for each position is 
> the "actual position + 1".
> Expected behavior :
> ----------------------------
> The code that should need to be written is:
>  
>  df <- withColumn(df, "col2", substr(df$col1, 6, 7))
> Note :
> -----------
>  This defect is observed only when the starting position is greater than 
> 1.
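
For anyone skimming the thread, the off-by-one described above can be modeled without a Spark session. The sketch below is plain Python, not SparkR, and none of it is taken from the Spark source: the function names (sql_substring, substr_old, substr_fixed) are hypothetical, and it assumes that the old wrapper passed "start - 1" to a 1-based (position, length) substring in which position 0 behaves like position 1. Under that assumption it reproduces the reported behavior, including why only start positions greater than 1 are affected, and why a job already written with the "+1" workaround would change its output once the mapping is corrected.

```python
def sql_substring(s: str, pos: int, length: int) -> str:
    """Assumed model of a 1-based (pos, length) substring.

    Position 0 is treated like position 1; negative positions are not modeled.
    """
    start = max(pos - 1, 0)
    return s[start:start + length]


def substr_old(s: str, start: int, stop: int) -> str:
    """Assumed pre-fix mapping: start is shifted down by one before delegating."""
    return sql_substring(s, start - 1, stop - start + 1)


def substr_fixed(s: str, start: int, stop: int) -> str:
    """Mapping where start and stop are plain 1-based inclusive positions."""
    return sql_substring(s, start, stop - start + 1)


value = "2017-12-01"

# start == 1: both mappings agree, so the defect stays hidden.
print(substr_old(value, 1, 4), substr_fixed(value, 1, 4))

# start > 1: the old mapping needs "actual position + 1" to return "12".
print(repr(substr_old(value, 6, 7)))    # natural arguments, old mapping
print(repr(substr_old(value, 7, 8)))    # workaround arguments, old mapping

# After the fix, the natural arguments work and the workaround no longer does.
print(repr(substr_fixed(value, 6, 7)))
print(repr(substr_fixed(value, 7, 8)))
```

The last two calls are the backport concern in miniature: code that was written against the old behavior with (7, 8) returns a different string under the corrected mapping.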



