[ https://issues.apache.org/jira/browse/SPARK-42965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-42965.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 46636
[https://github.com/apache/spark/pull/46636]

> metadata mismatch for StructField when running some tests.
> ----------------------------------------------------------
>
>                 Key: SPARK-42965
>                 URL: https://issues.apache.org/jira/browse/SPARK-42965
>             Project: Spark
>          Issue Type: Improvement
>          Components: Connect, Pandas API on Spark
>    Affects Versions: 3.5.0
>            Reporter: Haejoon Lee
>            Priority: Major
>             Fix For: 4.0.0
>
>
> For some reason, the metadata of `StructField` differs in a few tests
> when using Spark Connect, although the function under test works properly.
> For example, running
> `python/run-tests --testnames 'pyspark.pandas.tests.connect.data_type_ops.test_parity_binary_ops BinaryOpsParityTests.test_add'`
> fails with `AssertionError:
> ([InternalField(dtype=int64, struct_field=StructField('bool', LongType(),
> False))], [StructField('bool', LongType(), False)])`. The two fields have
> the same name, type, and nullability; only the metadata differs (something
> like `{'__autoGeneratedAlias': 'true'}`), so the function itself works
> correctly.
> Therefore, we have temporarily added a branch for Spark Connect in the code
> so that InternalFrame can be created properly and more pandas APIs can be
> provided in Spark Connect. If a clear cause is found, we may need to revert
> the code to its original state.
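
The assertion message quoted in the issue shows two fields that print identically yet compare unequal. A minimal sketch of how that can happen, using a hypothetical stand-in class `Field` (not PySpark's real `StructField`) and assuming that equality takes metadata into account while the printed form leaves it out. The helper `same_ignoring_metadata` is likewise illustrative and is not part of any Spark API:

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(repr=False)
class Field:
    """Hypothetical stand-in for a struct field; not PySpark's StructField."""

    name: str
    data_type: str
    nullable: bool
    metadata: Dict[str, Any] = field(default_factory=dict)

    def __repr__(self) -> str:
        # Metadata is deliberately left out of the printed form (assumption).
        return f"Field({self.name!r}, {self.data_type}, {self.nullable})"


def same_ignoring_metadata(a: Field, b: Field) -> bool:
    """Compare only name, type, and nullability."""
    return (a.name, a.data_type, a.nullable) == (b.name, b.data_type, b.nullable)


expected = Field("bool", "LongType()", False)
actual = Field("bool", "LongType()", False, {"__autoGeneratedAlias": "true"})

print(repr(expected) == repr(actual))            # printed forms match
print(expected == actual)                        # dataclass equality includes metadata
print(same_ignoring_metadata(expected, actual))  # name, type, nullable all match
```

Comparing only name, type, and nullability is one way to express "the function works" in a test, but it also hides the metadata difference, so it is closer to a workaround than to a fix for the underlying cause.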