[ https://issues.apache.org/jira/browse/DRILL-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821285#comment-17821285 ]
Charles Givre commented on DRILL-8481:
--------------------------------------

[~benj641] Thanks for submitting. Are you actively working on this, or is this just a bug report?

> Ability to query XML root attributes
> ------------------------------------
>
>                 Key: DRILL-8481
>                 URL: https://issues.apache.org/jira/browse/DRILL-8481
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - XML
>    Affects Versions: 1.21.1
>            Reporter: benj
>            Priority: Major
>
> Hi,
> It is possible to retrieve field attributes, except those of the root node.
> It would be useful to be able to retrieve the attributes found in the
> root node of XML files.
> In my common use cases, I have many XML files, each containing a single XML
> frame, often with one or more attributes in the root tag.
> To recover these values, I am currently forced to preprocess the files to
> "copy" the root attributes into the fields of the XML record.
> Even with multiple XML records under the root, it would be useful for the
> root attributes to be accessible from each record.
> Example (file aaa.xml):
> {noformat}
> <PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">
>   <P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>
>   <P2 SubVersion="b1"><Color>blue</Color></P2>
> </PPP>
> {noformat}
> With the query:
> {code:sql}
> SELECT * FROM(SELECT filename, * FROM TABLE(dfs.test.`/aaa.xml`(type=>'xml',
> dataLevel=>1)) as xml) AS x;
> {code}
> I can access:
> * P1_SubVersion
> * P1_MID
> * P1_PN
> * P1_SL
> * P2_SubVersion
> * P2.Color
> But I cannot access:
> * PPP_Version
> * PPP_TimeStamp
> and changing dataLevel does not solve the problem.
> Regards,

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
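
For reference, the preprocessing workaround mentioned in the report (copying the root attributes into each record before querying) might look roughly like the sketch below. It is not part of Drill or of the original report: the function name `copy_root_attributes` and the `<root tag>_<attribute>` naming are assumptions made for illustration, and how Drill names the resulting columns is not verified here.

```python
import xml.etree.ElementTree as ET


def copy_root_attributes(xml_text: str) -> str:
    """Copy every attribute of the root element onto each direct child.

    Hypothetical helper: copied attributes are named "<root tag>_<attribute>",
    e.g. PPP_Version, so they do not collide with a child's own attributes.
    """
    root = ET.fromstring(xml_text)
    for child in root:
        for name, value in root.attrib.items():
            # setdefault keeps any attribute the child already defines
            child.attrib.setdefault(f"{root.tag}_{name}", value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">'
        '<P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>'
        '<P2 SubVersion="b1"><Color>blue</Color></P2>'
        "</PPP>"
    )
    print(copy_root_attributes(sample))
```

Each record under the root then carries its own copy of the root attributes, which is the duplication the requested improvement would make unnecessary.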