zhengruifeng opened a new pull request, #47355: URL: https://github.com/apache/spark/pull/47355
### What changes were proposed in this pull request?

Make `from_xml` support `StructType` schemas.

### Why are the changes needed?

`StructType` schemas were supported in Spark Classic, but not in Spark Connect.

This addresses https://github.com/apache/spark/pull/43680#discussion_r1385332357

### Does this PR introduce _any_ user-facing change?

before:

```
from pyspark.sql.types import StructType, LongType
import pyspark.sql.functions as sf

data = [(1, '''<p><a>1</a></p>''')]
df = spark.createDataFrame(data, ("key", "value"))
schema = StructType().add("a", LongType())
df.select(sf.from_xml(df.value, schema)).show()

---------------------------------------------------------------------------
AnalysisException                         Traceback (most recent call last)
Cell In[1], line 7
...
AnalysisException: [PARSE_SYNTAX_ERROR] Syntax error at or near '{'. SQLSTATE: 42601

JVM stacktrace:
org.apache.spark.sql.AnalysisException
    at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(parsers.scala:278)
    at org.apache.spark.sql.catalyst.parser.AbstractParser.parse(parsers.scala:98)
    at org.apache.spark.sql.catalyst.parser.AbstractParser.parseDataType(parsers.scala:40)
    at org.apache.spark.sql.types.DataType$.$anonfun$fromDDL$1(DataType.scala:126)
    at org.apache.spark.sql.types.DataType$.parseTypeWithFallback(DataType.scala:145)
    at org.apache.spark.sql.types.DataType$.fromDDL(DataType.scala:127)
```

after:

```
+---------------+
|from_xml(value)|
+---------------+
|            {1}|
+---------------+
```

### How was this patch tested?

Added a doctest.

### Was this patch authored or co-authored using generative AI tooling?

No.
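For readers unfamiliar with the expected result, the following is a minimal plain-Python sketch of what the `{1}` row in the "after" output represents: the XML record `<p><a>1</a></p>` parsed into a one-field struct whose field `a` holds the long value `1`. It is not Spark's implementation. The `SCHEMA` mapping and the `parse_row` helper are hypothetical stand-ins for `StructType().add("a", LongType())` and `from_xml`, and treating missing elements as `None` is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for StructType().add("a", LongType()):
# maps each field name to a Python converter for its value.
SCHEMA = {"a": int}


def parse_row(xml_text, schema):
    """Parse one XML record into a tuple ordered like the schema's fields."""
    root = ET.fromstring(xml_text)
    values = []
    for name, convert in schema.items():
        child = root.find(name)
        # Assumption for this sketch: missing or empty elements become None.
        if child is None or child.text is None:
            values.append(None)
        else:
            values.append(convert(child.text.strip()))
    return tuple(values)


row = parse_row("<p><a>1</a></p>", SCHEMA)
print(row)
```

The tuple returned for the sample record has a single integer element, mirroring the single-field struct shown in the "after" table.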
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org