Yuchen Liu created SPARK-48964:
----------------------------------

             Summary: Fix the discrepancy between implementation, comment and 
documentation of option recursive.fields.max.depth in ProtoBuf connector
                 Key: SPARK-48964
                 URL: https://issues.apache.org/jira/browse/SPARK-48964
             Project: Spark
          Issue Type: Documentation
          Components: Connect
    Affects Versions: 3.5.1, 3.5.0, 4.0.0, 3.5.2, 3.5.3
            Reporter: Yuchen Liu
             Fix For: 4.0.0, 3.5.2, 3.5.3, 3.5.1, 3.5.0


After the three PRs (https://github.com/apache/spark/pull/38922, 
https://github.com/apache/spark/pull/40011, 
https://github.com/apache/spark/pull/40141) that worked on the same option, 
some legacy comments and documentation have not been updated to match the 
latest implementation. This task should consolidate them. Below is the correct 
description of the behavior.

The `recursive.fields.max.depth` option can be specified in the 
from_protobuf options to control the maximum allowed recursion depth for a 
field. Setting `recursive.fields.max.depth` to 1 drops all recursive fields, 
setting it to 2 allows a field to be recursed once, and setting it to 3 allows 
it to be recursed twice. Setting `recursive.fields.max.depth` to a value 
greater than 10 is not allowed. If `recursive.fields.max.depth` is set to a 
value smaller than 1, recursive fields are not permitted. The default value of 
the option is -1. If a protobuf record has more depth for recursive fields 
than the allowed value, it will be truncated and some fields may be discarded. 
This check is based on the fully qualified field type. The SQL schema for the 
protobuf message
{code:java}
message Person { string name = 1; Person bff = 2; }{code}
will vary based on the value of `recursive.fields.max.depth`.

 
{code:java}
1: struct<name: string>
2: struct<name: string, bff: struct<name: string>>
3: struct<name: string, bff: struct<name: string, bff: struct<name: string>>> ...
{code}
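The depth-to-schema mapping above can be reproduced with a small sketch. This is only an illustrative model of the described behavior for the `Person` message, not the connector's actual code; the function name `person_schema` and the constant `MAX_ALLOWED_DEPTH` are hypothetical, and raising `ValueError` is an assumption standing in for whatever error the connector reports.

```python
MAX_ALLOWED_DEPTH = 10  # per the description, larger values are not allowed


def person_schema(max_depth: int) -> str:
    """Model the SQL schema of `Person` for a given recursive.fields.max.depth.

    Hypothetical helper for illustration; it only mirrors the rules
    stated in the description above.
    """
    if max_depth > MAX_ALLOWED_DEPTH:
        raise ValueError("recursive.fields.max.depth must not exceed 10")
    if max_depth < 1:
        # Includes the default of -1: recursive fields are not permitted.
        raise ValueError("recursive fields are not permitted")

    # Depth 1 drops the recursive field `bff` entirely.
    schema = "struct<name: string>"
    # Each additional level wraps the previous schema as the type of `bff`.
    for _ in range(max_depth - 1):
        schema = f"struct<name: string, bff: {schema}>"
    return schema


for depth in (1, 2, 3):
    print(f"{depth}: {person_schema(depth)}")
```

In PySpark the option itself is passed as a string entry in the options map of `from_protobuf`, for example `options={"recursive.fields.max.depth": "2"}`.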
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
