[ https://issues.apache.org/jira/browse/SPARK-40363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-40363.
----------------------------------
    Resolution: Won't Fix

> Add SQL misc function to assert/check column value
> --------------------------------------------------
>
>                 Key: SPARK-40363
>                 URL: https://issues.apache.org/jira/browse/SPARK-40363
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Rafal Wojdyla
>            Priority: Major
>
> A SQL function that asserts a condition on a column and:
>  * fails when the condition is not met
>  * returns the original value otherwise
> Related: SPARK-32793
> But {{assert_true}} and {{raise_error}} do not really cut it. With
> {{assert_true}} you have to actually collect the empty column, and the check
> might not happen if you drop the assertion column, which you will likely do
> since it's empty. A function that returns some value as part of the check
> (in most cases the checked column itself) would be handy.
> I'm working with pyspark, so here's a Python implementation:
> {code:python}
> from typing import Callable, Optional, Union, overload
>
> from pyspark.sql import Column
> from pyspark.sql import functions as F
>
>
> def str_to_col(col: Union[str, Column]) -> Column:
>     # Helper not shown in the original report; assumed to wrap a
>     # column name with F.col and pass Column objects through.
>     return F.col(col) if isinstance(col, str) else col
>
>
> @overload
> def assert_col_condition(
>     col: Union[str, Column],
>     cond: Callable[[Column], Column],
>     error_msg: Optional[str] = None,
> ) -> Column:
>     """Asserts a condition on a column; iff it holds, returns the original
>     value under `col`."""
>     ...
>
>
> @overload
> def assert_col_condition(
>     col: Union[str, Column], cond: Column, error_msg: Optional[str] = None
> ) -> Column:
>     """Asserts a condition on a column; iff it holds, returns the original
>     value under `col`."""
>     ...
>
>
> def assert_col_condition(
>     col: Union[str, Column],
>     cond: Union[Column, Callable[[Column], Column]],
>     error_msg: Optional[str] = None,
> ) -> Column:
>     col = str_to_col(col)
>     # A callable condition is applied to the column to build the predicate.
>     if not isinstance(cond, Column):
>         cond = cond(col)
>     # Raise for rows where the predicate is false; otherwise pass the value through.
>     return F.when(
>         ~cond, F.raise_error(error_msg or f"Assertion failed: {cond}")
>     ).otherwise(col)
> {code}
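
The motivation in the report (a check that lives in its own column may never run once that column is dropped) can be illustrated without Spark. The following is only a plain-Python analogy, using generators as a stand-in for lazy evaluation; `assert_value` and `checked_column` are hypothetical names and are not part of Spark or of the reporter's code.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def assert_value(
    value: T, cond: Callable[[T], bool], error_msg: Optional[str] = None
) -> T:
    """Return `value` unchanged if `cond(value)` holds, otherwise raise."""
    if not cond(value):
        raise ValueError(error_msg or f"Assertion failed for value: {value!r}")
    return value


def checked_column(
    values: Iterable[T], cond: Callable[[T], bool], error_msg: Optional[str] = None
) -> Iterator[T]:
    """Lazily yield each value, checking it as it is consumed."""
    return (assert_value(v, cond, error_msg) for v in values)


ages = [31, 27, -4]

# A check kept apart from the data: if nothing ever consumes it,
# the condition is never evaluated and the bad value goes unnoticed.
unused_check = checked_column(ages, lambda a: a >= 0)

# A pass-through check: consuming the data itself forces the check.
passed = list(checked_column([31, 27], lambda a: a >= 0))
```

Because the checked values are the data, a consumer cannot read them without also running the condition, which is the property the proposed function is after.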