I'm trying to analyze XML documents using the spark-xml package. Since all XML elements are optional, some columns may or may not exist in the inferred schema. After I register the DataFrame as a table, how do I check whether a nested column exists? My column is named "emp", which is already exploded, and I am trying to check whether the nested column "emp.mgr.col" exists. If it exists, I need to use it. If it does not exist, I should set it to null. Is there a way to achieve this?
Please note that I cannot use .columns because it lists only the top-level columns, not the nested ones. Also note that I cannot specify the schema manually, because my requirements do not allow it. I'm trying this in PySpark. Thank you.
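
To make the kind of check being asked about concrete, here is a minimal sketch, not a definitive solution. It assumes the inferred schema is available as a plain dict in the shape returned by StructType.jsonValue(). The helper name has_nested_field and the sample schema are hypothetical and only for illustration, and the sketch assumes field names do not themselves contain dots.

```python
def has_nested_field(schema_json, path):
    """Return True if the dotted `path` exists in a schema dict shaped like
    the output of StructType.jsonValue(). Hypothetical helper."""
    node = schema_json
    for name in path.split("."):
        # Unwrap arrays so array<struct<...>> is searched by its element type.
        while isinstance(node, dict) and node.get("type") == "array":
            node = node["elementType"]
        # Only struct nodes have named child fields.
        if not (isinstance(node, dict) and node.get("type") == "struct"):
            return False
        match = next((f for f in node["fields"] if f["name"] == name), None)
        if match is None:
            return False
        node = match["type"]
    return True


# Hand-written sample schema (an assumption, not real spark-xml output).
sample_schema = {
    "type": "struct",
    "fields": [
        {
            "name": "emp",
            "nullable": True,
            "metadata": {},
            "type": {
                "type": "struct",
                "fields": [
                    {"name": "name", "type": "string", "nullable": True, "metadata": {}},
                    {
                        "name": "mgr",
                        "nullable": True,
                        "metadata": {},
                        "type": {
                            "type": "struct",
                            "fields": [
                                {"name": "col", "type": "string", "nullable": True, "metadata": {}}
                            ],
                        },
                    },
                ],
            },
        }
    ],
}

print(has_nested_field(sample_schema, "emp.mgr.col"))
print(has_nested_field(sample_schema, "emp.mgr.other"))
```

With a real DataFrame (assumed here to be named df), the dict would come from df.schema.jsonValue(). Depending on the result, one could then select col("emp.mgr.col") or fall back to lit(None).cast("string") from pyspark.sql.functions, where the "string" type is itself an assumption about the column.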