[ https://issues.apache.org/jira/browse/SPARK-44840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756574#comment-17756574 ]

Serge Rielau edited comment on SPARK-44840 at 8/20/23 7:41 PM:
---------------------------------------------------------------

[~srowen] There is no standard as such. However, there are multiple reasons not to be compatible with Snowflake:
1. Precedent: SUBSTR('Hello', 1, 1) => 'H', SUBSTR('Hello', -1, 1) => 'o' (not 'l').
2. Array access has been a mixed bag for us (some 0-based, some 1-based), but we have tried to move towards 1-based as well. E.g., element_at() is 1-based, and we use -1 (!) to get the last element.
3. Snowflake had no choice but to use -1 for the second-last element because 1 is their second element. Because they are 0-based, they are unable to use array_insert() to append an element (short of passing the (length - 1) as a parameter).
So the proposal is objectively more powerful.


was (Author: JIRAUSER288374):
    [~srowen] There is no standard as such. However, there are multiple reasons not to be compatible with Snowflake:
1. Precedent: SUBSTR('Hello', 1, 1) => 'H', SUBSTR('Hello', -1, 1) => 'o' (not 'l').
2. Array access has been a mixed bag for us (some 0-based, some 1-based), but we have tried to move towards 1-based as well. E.g., element_at() is 1-based, and we use -1 (!) to get the last element.
3. Snowflake had no choice but to use 1 for the second-last element because 1 is their second element. Because they are 0-based, they are unable to use array_insert() to append an element (short of passing the (length - 1) as a parameter).
So the proposal is objectively more powerful.

> array_insert() gives wrong results for negative index
> -----------------------------------------------------
>
>                 Key: SPARK-44840
>                 URL: https://issues.apache.org/jira/browse/SPARK-44840
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.4.0
>            Reporter: Serge Rielau
>            Assignee: Max Gekk
>            Priority: Major
>
> Unlike in Snowflake, we decided that array_insert() is 1-based.
> This means 1 is the first element in an array and -1 is the last.
> This matches the behavior of functions such as substr() and element_at().
>
> {code:java}
> > SELECT array_insert(array('a', 'b', 'c'), 1, 'z');
> ["z","a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 0, 'z');
> Error
> > SELECT array_insert(array('a', 'b', 'c'), -1, 'z');
> ["a","b","c","z"]
> > SELECT array_insert(array('a', 'b', 'c'), 5, 'z');
> ["a","b","c",NULL,"z"]
> > SELECT array_insert(array('a', 'b', 'c'), -5, 'z');
> ["z",NULL,"a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 2, cast(NULL AS STRING));
> ["a",NULL,"b","c"]
> {code}
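
For readers who want to experiment with the semantics listed in the description without a Spark session, the rules can be modeled in a few lines of Python. This is only a sketch inferred from the six examples above, not Spark's implementation: the name array_insert_1_based is hypothetical, None stands in for SQL NULL, and raising ValueError for index 0 is an assumption (the description only says "Error").

```python
from typing import Any, List


def array_insert_1_based(arr: List[Any], pos: int, elem: Any) -> List[Any]:
    """Model of the 1-based array_insert() semantics described in the issue.

    Hypothetical helper for illustration; None plays the role of NULL.
    """
    if pos == 0:
        # Assumption: the "Error" case is modeled as a ValueError.
        raise ValueError("index 0 is invalid: array_insert() is 1-based")

    n = len(arr)
    if pos > 0:
        idx = pos - 1  # 0-based slot where the new element must land
        if idx <= n:
            return arr[:idx] + [elem] + arr[idx:]
        # Position lies past the end: pad the gap with NULLs.
        return arr + [None] * (idx - n) + [elem]

    # Negative position: the new element must end up k-th from the end
    # of the *result*, so -1 appends.
    k = -pos
    if k <= n + 1:
        idx = n - k + 1
        return arr[:idx] + [elem] + arr[idx:]
    # Position lies before the start: pad the gap with NULLs.
    return [elem] + [None] * (k - n - 1) + arr


if __name__ == "__main__":
    base = ["a", "b", "c"]
    for position in (1, -1, 5, -5):
        print(position, array_insert_1_based(base, position, "z"))
```

Under this model, -1 appends to the array, which is the property point 3 of the comment argues a 0-based definition cannot offer with a negative index.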