baganokodo2022 commented on PR #38922:
URL: https://github.com/apache/spark/pull/38922#issuecomment-1341534552

   Hi @SandishKumarHN,
   
   For the `recursionDepth` option, could we consider renaming it to 
`CircularReferenceTolerance` or `CircularReferenceDepth` for clarity?
   For instance, -1 (the default) would raise an error on any circular 
reference, 0 would drop any circular reference field, 1 would allow the same 
field to be entered twice, and so on.
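
As a rough illustration of the tolerance semantics proposed above, the decision for a single field could look like the sketch below. The function name `decide`, the `prior_visits` counter, and the returned action strings are hypothetical and are not taken from the PR; the sketch assumes the counter tracks how often the field has already appeared on the current path.

```python
def decide(prior_visits: int, tolerance: int) -> str:
    """Pick an action for a field already seen `prior_visits` times on the path.

    Hypothetical helper: the names and return values are illustrative only.
    """
    if prior_visits == 0:
        return "expand"  # first visit, so not a circular reference yet
    if tolerance == -1:
        return "error"   # -1: any circular reference is an error
    if prior_visits > tolerance:
        return "drop"    # tolerance exhausted, drop the field
    return "expand"      # still within the allowed number of repeats
```

Under these assumptions, a tolerance of 0 drops a field the first time it recurs, while a tolerance of 1 lets it be entered a second time before dropping it.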
    
   Besides, could we also support a `CircularReferenceType` option with an enum 
value of `[FIELD_NAME, FIELD_TYPE]`? The reason is that navigation can go 
very deep before the same **fully-qualified** `FIELD_NAME` is encountered 
again, whereas `FIELD_TYPE` stops recursive navigation much sooner. We could 
make `FIELD_NAME` the default option. In my test cases, with `FIELD_TYPE`, a 
circular reference could repeat 3 times before the executor hit OOM, while 
with `FIELD_NAME` the executor hit OOM when `CircularReferenceTolerance` was 
set to 1.
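
To show why tracking by type could prune sooner than tracking by fully-qualified field name, here is a minimal sketch on a toy schema. Everything in it is an assumption made for illustration: the `SCHEMA` dictionary, the `Node` message, the `count_fields` function, and the choice to count the root message as the first occurrence of its type in `FIELD_TYPE` mode. It is not the PR's implementation.

```python
# Hypothetical toy schema: message type -> list of (field name, field type).
# Any type that is not a key of SCHEMA is treated as a scalar.
SCHEMA = {
    "Node": [("left", "Node"), ("right", "Node"), ("value", "int32")],
}


def count_fields(root: str, key_mode: str, tolerance: int) -> int:
    """Count the fields kept when expanding `root` with bounded recursion."""

    def walk(msg_type: str, seen: dict) -> int:
        total = 0
        for name, ftype in SCHEMA[msg_type]:
            if ftype not in SCHEMA:
                total += 1  # scalar field, always kept
                continue
            # FIELD_NAME keys on the fully-qualified field name,
            # FIELD_TYPE keys on the message type of the field.
            key = f"{msg_type}.{name}" if key_mode == "FIELD_NAME" else ftype
            repeats = seen.get(key, 0)
            if repeats > tolerance:
                continue  # tolerance exhausted on this path: drop the field
            child_seen = dict(seen)
            child_seen[key] = repeats + 1
            total += 1 + walk(ftype, child_seen)
        return total

    # Assumption: in FIELD_TYPE mode the root counts as one occurrence of its type.
    initial = {root: 1} if key_mode == "FIELD_TYPE" else {}
    return walk(root, initial)


for tol in (0, 1):
    print(tol, count_fields("Node", "FIELD_NAME", tol),
          count_fields("Node", "FIELD_TYPE", tol))
```

On this toy schema, `FIELD_NAME` keeps 9 fields at tolerance 0 and 37 at tolerance 1, while `FIELD_TYPE` keeps 1 and 5, because `Node.left` and `Node.right` are tracked separately by name but share a single type.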
   
   Please let me know your thoughts.
   
   cc @rangadi 
   
   Thank you
   
   Xinyu Liu
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

