Sandeep Singh created SPARK-41904:
-------------------------------------

             Summary: Fix `nth_value` functions output
                 Key: SPARK-41904
                 URL: https://issues.apache.org/jira/browse/SPARK-41904
             Project: Spark
          Issue Type: Sub-task
          Components: Connect
    Affects Versions: 3.4.0
            Reporter: Sandeep Singh

{code:java}
from pyspark.sql.functions import flatten, struct, transform

df = self.spark.sql("SELECT array(1, 2, 3) as numbers, array('a', 'b', 'c') as letters")

actual = df.select(
    flatten(
        transform(
            "numbers",
            lambda number: transform(
                "letters", lambda letter: struct(number.alias("n"), letter.alias("l"))
            ),
        )
    )
).first()[0]

expected = [
    (1, "a"),
    (1, "b"),
    (1, "c"),
    (2, "a"),
    (2, "b"),
    (2, "c"),
    (3, "a"),
    (3, "b"),
    (3, "c"),
]

self.assertEquals(actual, expected){code}

{code:java}
Traceback (most recent call last):
  File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/tests/test_functions.py", line 809, in test_nested_higher_order_function
    self.assertEquals(actual, expected)
AssertionError: Lists differ: [{'n': 'a', 'l': 'a'}, {'n': 'b', 'l': 'b'[151 chars]'c'}] != [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), ([43 chars]'c')]

First differing element 0:
{'n': 'a', 'l': 'a'}
(1, 'a')

- [{'l': 'a', 'n': 'a'},
-  {'l': 'b', 'n': 'b'},
-  {'l': 'c', 'n': 'c'},
-  {'l': 'a', 'n': 'a'},
-  {'l': 'b', 'n': 'b'},
-  {'l': 'c', 'n': 'c'},
-  {'l': 'a', 'n': 'a'},
-  {'l': 'b', 'n': 'b'},
-  {'l': 'c', 'n': 'c'}]
+ [(1, 'a'),
+  (1, 'b'),
+  (1, 'c'),
+  (2, 'a'),
+  (2, 'b'),
+  (2, 'c'),
+  (3, 'a'),
+  (3, 'b'),
+  (3, 'c')]
{code}
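
For reference, the result the test expects can be modeled without a Spark session. The sketch below is plain Python, not the Connect code path. The `observed` list only mimics the shape shown in the traceback, where the "n" field carries the letter instead of the number. Reading this as the outer lambda variable being resolved to the inner one is an assumption; the report does not confirm the cause. All names in the sketch are illustrative.

```python
# Plain-Python model of the nested transform in the failing test.
numbers = [1, 2, 3]
letters = ["a", "b", "c"]

# What the test expects: every (number, letter) pair, iterating
# numbers in the outer loop and letters in the inner loop.
expected = [(n, l) for n in numbers for l in letters]

# Shape seen in the traceback (mimicked, not produced by Spark):
# structs arrive as dicts, and "n" repeats the letter value.
observed = [{"n": l, "l": l} for _ in numbers for l in letters]

# Rows whose "n" field disagrees with the expected number.
mismatched = [row for row, (n, _) in zip(observed, expected) if row["n"] != n]

print(len(expected), len(mismatched))
```

In this model every row is mismatched in the "n" field while the "l" field is always correct, which matches the diff in the traceback.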