benj created DRILL-8481:
---------------------------

             Summary: Ability to query root attributes
                 Key: DRILL-8481
                 URL: https://issues.apache.org/jira/browse/DRILL-8481
             Project: Apache Drill
          Issue Type: Improvement
          Components: Storage - XML
    Affects Versions: 1.21.1
            Reporter: benj

Hi,

It is possible to retrieve the attributes of fields, except those of the root node.

It would be useful to be able to retrieve the attributes found in the root node of XML files.
In my common use cases, I have many XML files, each containing a single XML frame, often with one or more attributes in the root tag.
To recover these values, I am currently forced to preprocess the files to "copy" the attributes into the fields of the XML record.

Even with multiple XML records under the root, it would be useful for the root attributes to be accessible for each record.

Example (file aaa.xml):
{noformat}
<PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">
  <P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>
  <P2 SubVersion="b1"><Color>blue</Color></P2>
</PPP>
{noformat}

With the query:
{code:sql}
SELECT * FROM (SELECT filename, * FROM TABLE(dfs.test.`/aaa.xml`(type=>'xml', dataLevel=>1)) AS xml) AS x;
{code}

I can access:
* P1_SubVersion
* P1_MID
* P1_PN
* P1_SL
* P2_SubVersion
* P2.Color

But I cannot access:
* PPP_Version
* PPP_TimeStamp

Changing the dataLevel does not solve the problem.

Regards,



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
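
For reference, the preprocessing workaround mentioned in the description could look roughly like the sketch below. It is only an illustration and not part of Drill: the function name `copy_root_attributes` and the `<root tag>_<attribute>` naming of the copied attributes are assumptions made for this example, and how Drill names the resulting columns is not shown here.

```python
import xml.etree.ElementTree as ET

# Sample document taken from the issue description (aaa.xml).
SAMPLE = """<PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">
  <P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>
  <P2 SubVersion="b1"><Color>blue</Color></P2>
</PPP>"""


def copy_root_attributes(xml_text: str) -> str:
    """Copy each root attribute onto every direct child of the root.

    Hypothetical helper: copied attributes are named
    "<root tag>_<attribute>" (an assumed convention) so that they do
    not collide with attributes the child already has.
    """
    root = ET.fromstring(xml_text)
    for child in root:
        for name, value in root.attrib.items():
            key = f"{root.tag}_{name}"
            # Never overwrite an attribute that is already present.
            if key not in child.attrib:
                child.set(key, value)
    return ET.tostring(root, encoding="unicode")


result = copy_root_attributes(SAMPLE)
print(result)
```

After this step every record under the root carries the root attributes as its own attributes, so they become reachable the same way as the existing record attributes. The cost is an extra pass over every file, which is what this improvement request would make unnecessary.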