[ https://issues.apache.org/jira/browse/SPARK-26879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jash Gala updated SPARK-26879:
------------------------------
    Description:
In the Spark SQL functions definitions, `inline` uses col1, col2, etc. (i.e. 1-indexed columns), while `stack` uses col0, col1, col2, etc. (i.e. 0-indexed columns).

{code:title=spark-shell|borderStyle=solid}
scala> spark.sql("SELECT stack(2, 1, 2, 3)").show
+----+----+
|col0|col1|
+----+----+
|   1|   2|
|   3|null|
+----+----+

scala> spark.sql("SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')))").show
+----+----+
|col1|col2|
+----+----+
|   1|   a|
|   2|   b|
+----+----+
{code}

This feels like an issue with consistency. As discussed on [PR #23748|https://github.com/apache/spark/pull/23748], it might be a good idea to standardize this to something specific (like zero-based indexing) for these and other similar functions.

  was:
In the Spark SQL functions definitions, `inline` uses col1, col2, etc. (i.e. 1-indexed columns), while `stack` uses col0, col1, col2, etc. (i.e. 0-indexed columns).

{code:title=spark-shell|borderStyle=solid}
scala> spark.sql("SELECT stack(2, 1, 2, 3)").show
| col0 | col1 |
| 1 | 2 |
| 3 | null |

scala> spark.sql("SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')))").show
| col1 | col2 |
| 1 | a |
| 2 | b |
{code}

This feels like an issue with consistency. As discussed on [PR #23748|https://github.com/apache/spark/pull/23748], it might be a good idea to standardize this to something specific (like zero-based indexing) for these and other similar functions.

> Inconsistency in default column names for functions like inline and stack
> -------------------------------------------------------------------------
>
>                 Key: SPARK-26879
>                 URL: https://issues.apache.org/jira/browse/SPARK-26879
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Jash Gala
>            Priority: Minor