Yuchen Liu created SPARK-48964:
----------------------------------

Summary: Fix the discrepancy between implementation, comment and documentation of option recursive.fields.max.depth in ProtoBuf connector
Key: SPARK-48964
URL: https://issues.apache.org/jira/browse/SPARK-48964
Project: Spark
Issue Type: Documentation
Components: Connect
Affects Versions: 3.5.1, 3.5.0, 4.0.0, 3.5.2, 3.5.3
Reporter: Yuchen Liu
Fix For: 4.0.0, 3.5.2, 3.5.3, 3.5.1, 3.5.0

After the three PRs (https://github.com/apache/spark/pull/38922, https://github.com/apache/spark/pull/40011, https://github.com/apache/spark/pull/40141) working on the same option, there are some legacy comments and documentation that have not been updated to match the latest implementation. This task should consolidate them. Below is the correct description of the behavior.

The `recursive.fields.max.depth` parameter can be specified in the from_protobuf options to control the maximum allowed recursion depth for a field. Setting `recursive.fields.max.depth` to 1 drops all recursive fields, setting it to 2 allows a field to be recursed once, and setting it to 3 allows it to be recursed twice. Setting `recursive.fields.max.depth` to a value greater than 10 is not allowed. If `recursive.fields.max.depth` is set to a value smaller than 1, recursive fields are not permitted. The default value of the option is -1. If a protobuf record has more depth for recursive fields than the allowed value, it will be truncated and some fields may be discarded. This check is based on the fully qualified field type.

The SQL schema for the protobuf message
{code:java}
message Person { string name = 1; Person bff = 2; }{code}
will vary based on the value of `recursive.fields.max.depth`.
{code:java}
1: struct<name: string>
2: struct<name: string, bff: struct<name: string>>
3: struct<name: string, bff: struct<name: string, bff: struct<name: string>>>
...
{code}
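
The depth rule for `recursive.fields.max.depth` can be illustrated with a small pure-Python model. This is only a sketch of the behavior described in this issue, not Spark's implementation: the `MESSAGES` table, the `sql_schema` helper, and the use of `ValueError` for rejected settings are all hypothetical names chosen for the example. It counts how often each message type already appears on the current path and drops a field once that count reaches the configured depth.

```python
# Illustrative model only: this is NOT Spark's code.
# MESSAGES, sql_schema and the ValueError choices are hypothetical.

# message Person { string name = 1; Person bff = 2; }
MESSAGES = {"Person": [("name", "string"), ("bff", "Person")]}

MAX_ALLOWED_DEPTH = 10  # values above this are rejected, per the description


def sql_schema(message, max_depth=-1, _seen=None):
    """Render a struct<...> string for `message` under the depth rule."""
    if max_depth > MAX_ALLOWED_DEPTH:
        raise ValueError("recursive.fields.max.depth must not exceed 10")

    # Count occurrences of each message type on the path to this point.
    seen = dict(_seen or {})
    seen[message] = seen.get(message, 0) + 1

    fields = []
    for name, ftype in MESSAGES[message]:
        if ftype not in MESSAGES:
            # Scalar field: always kept.
            fields.append(f"{name}: {ftype}")
            continue
        count = seen.get(ftype, 0)
        if count > 0 and max_depth < 1:
            # A value smaller than 1 means recursion is not permitted.
            raise ValueError(f"recursive field '{name}' is not permitted")
        if count == 0 or count < max_depth:
            fields.append(f"{name}: {sql_schema(ftype, max_depth, seen)}")
        # Otherwise the recursive field is dropped (schema is truncated).
    return "struct<" + ", ".join(fields) + ">"


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(f"{depth}: {sql_schema('Person', depth)}")
```

Keying the count on the message type name mirrors the statement that the check is based on the fully qualified field type. In real use the option would be passed in the options map of from_protobuf rather than to a helper like this one.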