[ https://issues.apache.org/jira/browse/SPARK-24574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon reassigned SPARK-24574:
------------------------------------

    Assignee: Chongguang LIU

improve array_contains function of the sql component to deal with Column type
------------------------------------------------------------------------------

                 Key: SPARK-24574
                 URL: https://issues.apache.org/jira/browse/SPARK-24574
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Chongguang LIU
            Assignee: Chongguang LIU
            Priority: Major
             Fix For: 2.4.0

Hello all,

I ran into a use case in a project with Spark SQL and want to share some thoughts with you about the function array_contains.

Say I have a DataFrame containing two columns: column A of type "array of string" and column B of type "string". I want to determine whether the value of column B is contained in the value of column A, without using a UDF of course.

The function array_contains came to mind naturally:

    def array_contains(column: Column, value: Any): Column = withExpr {
      ArrayContains(column.expr, Literal(value))
    }

However, the function wraps the value in a Literal, so passing column B yields a runtime exception: RuntimeException("Unsupported literal type " + v.getClass + " " + v).

After discussing it with my friends, we found a solution that avoids a UDF:

    new Column(ArrayContains(col("ColumnA").expr, col("ColumnB").expr))

Building on this solution, I think the function itself can be made a little more powerful, like this:

    def array_contains(column: Column, value: Any): Column = withExpr {
      value match {
        case c: Column => ArrayContains(column.expr, c.expr)
        case _ => ArrayContains(column.expr, Literal(value))
      }
    }

It pattern matches on value to detect whether it is a Column. If so, it uses the column's .expr; otherwise it works as it used to.
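
For readers who want to try this, here is a minimal, self-contained sketch of the workaround on a toy DataFrame. The column names, sample data, object name, and local SparkSession are illustrative assumptions, not part of the original report:

    import org.apache.spark.sql.{Column, SparkSession}
    import org.apache.spark.sql.catalyst.expressions.ArrayContains
    import org.apache.spark.sql.functions.col

    object ArrayContainsColumnDemo {
      def main(args: Array[String]): Unit = {
        // Local session for the demo (an assumption; any SparkSession works).
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("array-contains-column-demo")
          .getOrCreate()
        import spark.implicits._

        // ColumnA: array of strings; ColumnB: the string to look for.
        val df = Seq(
          (Seq("a", "b", "c"), "b"),  // "b" is contained in the array
          (Seq("x", "y"), "z")        // "z" is not
        ).toDF("ColumnA", "ColumnB")

        // The workaround from the report: build the ArrayContains expression
        // directly from both columns' expressions, without a Literal or a UDF.
        val bInA = new Column(ArrayContains(col("ColumnA").expr, col("ColumnB").expr))

        df.withColumn("B_in_A", bInA).show()
        // Expected: true for the first row, false for the second.

        spark.stop()
      }
    }

With the proposed overload in place, the expression above could simply be written as array_contains(col("ColumnA"), col("ColumnB")).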