[ https://issues.apache.org/jira/browse/SPARK-37788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Davies updated SPARK-37788:
----------------------------------
    Description:
PySpark has mostly migrated to accepting both Column objects and string names of columns ("ColumnOrName") in its functions module. A small number of functions still seem to need updating: some need the input string names converted into the Column type, and others need only an annotation change to indicate that the function already supports column string names. Below are the functions I've seen:

* F.overlay: Annotation only
* F.least: Annotation only
* F.slice: Needs a conversion
* F.array_repeat: Needs a conversion

See here for additional context: [https://github.com/apache/spark/pull/35032#issuecomment-1003033776]

I'm happy to make a quick PR fixing these, if there is no reason for these functions being handled as a special case.

  was:
PySpark has mostly migrated to accepting both Column objects and string names of columns ("ColumnOrName") in its functions module. A small number of functions still seem to need updating: some need the input string names converted into the Column type, and others need only an annotation change to indicate that the function already supports column string names. Below are the functions I've seen:

* F.overlay: Annotation only
* F.least: Annotation only
* F.slice: Needs a conversion
* F.array_repeat: Needs a conversion

See here for additional context: [https://github.com/apache/spark/pull/35032#issuecomment-1003033776]

> ColumnOrName vs Column in PySpark Functions module
> --------------------------------------------------
>
>                 Key: SPARK-37788
>                 URL: https://issues.apache.org/jira/browse/SPARK-37788
>             Project: Spark
>          Issue Type: Question
>          Components: PySpark
>    Affects Versions: 3.2.0
>            Reporter: Daniel Davies
>            Priority: Minor
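To make the two categories in the description concrete, here is a minimal, PySpark-free sketch of what "needs a conversion" could mean. Everything in it is an assumption for illustration: `Column` and `col` are stand-ins for the real `pyspark.sql.Column` and `F.col`, and `to_column`, `repeat_strict` and `repeat_flexible` are hypothetical names, not functions from the Spark code base or from the linked PR.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Column:
    """Stand-in for pyspark.sql.Column; it only stores an expression string."""
    expr: str


# Alias mirroring the "ColumnOrName" name used in the issue.
ColumnOrName = Union[Column, str]


def col(name: str) -> Column:
    """Stand-in for F.col: wrap a column name in a Column."""
    return Column(name)


def to_column(value: ColumnOrName) -> Column:
    """Hypothetical helper: pass a Column through, convert a name to a Column."""
    return value if isinstance(value, Column) else col(value)


def repeat_strict(c: Column, count: int) -> Column:
    """'Needs a conversion': the body assumes a Column, so a plain str fails."""
    return Column(f"repeat({c.expr}, {count})")


def repeat_flexible(c: ColumnOrName, count: int) -> Column:
    """After the fix: the input is converted first, so both forms are accepted."""
    c = to_column(c)
    return Column(f"repeat({c.expr}, {count})")


# Both spellings build the same expression once the conversion is in place.
same = repeat_flexible("a", 2) == repeat_flexible(col("a"), 2)
```

The "annotation only" case would correspond to a function whose body already behaves like `repeat_flexible` but whose signature is still annotated as `Column`; there, only the type hint would change.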