[ 
https://issues.apache.org/jira/browse/DRILL-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre reassigned DRILL-8481:
------------------------------------

    Assignee: Charles Givre

> Ability to query XML root attributes
> ------------------------------------
>
>                 Key: DRILL-8481
>                 URL: https://issues.apache.org/jira/browse/DRILL-8481
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - XML
>    Affects Versions: 1.21.1
>            Reporter: benj
>            Assignee: Charles Givre
>            Priority: Major
>
> Hi,
> It is possible to retrieve field attributes, except those of the root.
> It would be useful to be able to retrieve the attributes found in the
> root node of XML files.
> In my common use cases, I have many XML files, each containing a single XML
> frame, often with one or more attributes in the root tag.
> To recover these values, I am currently forced to preprocess the files to
> "copy" the attributes into the fields of the XML record.
> Even with multiple XML records under the root, it would be useful for the
> root attributes to be accessible for each record.
> Example (file aaa.xml):
> {noformat}
> <PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">
> <P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>
> <P2 SubVersion="b1"><Color>blue</Color></P2>
> </PPP>
> {noformat}
> With the query:
> {code:sql}
> SELECT * FROM (
>   SELECT filename, *
>   FROM TABLE(dfs.test.`/aaa.xml`(type => 'xml', dataLevel => 1)) AS xml
> ) AS x;
> {code}
> I can access:
> * P1_SubVersion
> * P1_MID
> * P1_PN
> * P1_SL
> * P2_SubVersion
> * P2.Color
> But I cannot access:
> * PPP_Version
> * PPP_TimeStamp
> Changing dataLevel does not solve the problem.
> Regards,
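
For illustration only, the preprocessing workaround mentioned in the description (copying root attributes into each record before querying) could look roughly like the sketch below. It is not part of the original report. The function name `copy_root_attributes` and the `PPP_`-style prefix for the copied attributes are assumptions chosen for this example, and how Drill names the resulting columns has not been verified here.

```python
import xml.etree.ElementTree as ET


def copy_root_attributes(xml_text: str) -> str:
    """Copy every root attribute onto each direct child of the root.

    Copied attributes are prefixed with the root tag name (for example
    PPP_Version) so they cannot collide with a child's own attributes.
    This naming is an illustrative choice, not something Drill requires.
    """
    root = ET.fromstring(xml_text)
    for child in root:
        for name, value in root.attrib.items():
            child.set(f"{root.tag}_{name}", value)
    return ET.tostring(root, encoding="unicode")


# Sample document taken from the issue description.
sample = (
    '<PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">'
    '<P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>'
    '<P2 SubVersion="b1"><Color>blue</Color></P2>'
    '</PPP>'
)
rewritten = copy_root_attributes(sample)
```

After this step each record under the root carries its own copy of the root attributes, which is the extra pass the requested improvement would make unnecessary.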



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
