Hi Brandon, Thank you for your help so far - much appreciated.
I think I found the difference: your example works when used as is, but the moment the enum type is part of a complex type, the parser complains or falls over. Here is my schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:ex="http://example.com"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:dfdlx="http://www.ogf.org/dfdl/dfdl-1.0/extensions">

  <xs:include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd" />
  <!--xmlns:tns="urn:a" targetNamespace="urn:a"-->

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format ref="GeneralFormat" lengthUnits="bits" byteOrder="bigEndian"
                   bitOrder="mostSignificantBitFirst" representation="binary" />
    </xs:appinfo>
  </xs:annotation>

  <xs:simpleType name="uint8" dfdl:lengthKind="explicit" dfdl:length="8">
    <xs:restriction base="xs:unsignedByte"/>
  </xs:simpleType>

  <xs:simpleType name="SomeEnumType" dfdlx:repType="uint8">
    <xs:restriction base="xs:string">
      <xs:enumeration value="ENUM_1" dfdlx:repValues="0" />
      <xs:enumeration value="ENUM_2" dfdlx:repValues="1" />
      <xs:enumeration value="ENUM_3" dfdlx:repValues="2" />
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="Myroot">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="a" type="SomeEnumType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>

If element "a" is the root element, it works. If element "a" is part of a sequence, the parser complains:

[error] Schema Definition Error: String with dfdl:lengthKind='implicit' must have an XSD maxLength facet value.

It seems I have to specify an explicit length on the element that uses the enumerated type, like this, for it to work:

<xs:element name="Myroot">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="a" type="SomeEnumType" dfdl:lengthKind="explicit" dfdl:length="8"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Thanks

Pirow Engelbrecht | Senior Design Engineer
Tel +27 12 678 9740 (ext.
9879) | Cell +27 63 148 3376
76 Regency Drive | Irene | Centurion | 0157<https://goo.gl/maps/v9ZbwjqpPyL2>
[create-transition]<https://etion.co.za/>
Facebook<https://www.facebook.com/Etion-Limited-2194612947433812?_rdc=1&_rdr> | YouTube<https://www.youtube.com/channel/UCUY-5oeACtLk2uTsEjZCU6A> | LinkedIn<https://www.linkedin.com/company/etionltd> | Twitter<https://twitter.com/Etionlimited> | Instagram<https://www.instagram.com/Etionlimited/>

From: Sloane, Brandon <[email protected]>
Sent: Thursday, 03 October 2019 02:21
To: [email protected]
Subject: Re: repValue enumaration translation issue

> It uses the inputTypeCalcString inputValueCalc function, but that just throws
> the error that inputTypeCalcString is an unsupported function. It seems these
> are deprecated in version 2.4.0 even though the fix version for these are
> version 2.4.0.

I missed this part in my first reply. dfdl:inputTypeCalcString() was never part of an official release of Daffodil. The corresponding function is now just dfdlx:inputTypeCalc().

________________________________
From: Pirow Engelbrecht <[email protected]>
Sent: Wednesday, October 2, 2019 5:34 AM
To: [email protected] <[email protected]>
Subject: repValue enumaration translation issue

Hello,

This issue was originally posted on Stack Overflow here: https://stackoverflow.com/questions/58168427/dfdl-decoding-of-enumerated-binary-data

Since then, I've realised that this mailing list is probably the better audience :-) Here is my original post (with some edits to keep things a bit shorter):

I'm currently working on a DFDL schema for a legacy (custom) binary file format used in a system to translate to either XML or JSON. I've got some binary data that is enumerated values, i.e.
the C-struct data type looks like this (and is stored as a byte):

typedef enum _SomeEnum {
    ENUM_1 = 0x00,
    ENUM_2 = 0x01,
    ENUM_3 = 0x02
} SomeEnum;

I can decode the enumeration to a numerical value just fine with this DFDL schema code (including checks for speculative parsing):

<xs:element name="SomeEnum" type="xs:unsignedByte">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:assert><![CDATA[{ . lt 3 }]]></dfdl:assert>
    </xs:appinfo>
  </xs:annotation>
</xs:element>

which translates to this XML, with the enum field equal to 1 in this instance:

<SomeEnum>1</SomeEnum>

What I would like is the ability to translate the decoded enumeration value to a string, so that the XML result looks like this:

<SomeEnum>ENUM_2</SomeEnum>

Brandon Sloane (Daffodil dev) then responded to the post (also edited, just to highlight the preferred solution):

The newest release of Daffodil (2.4.0) includes a DFDL extension designed specifically for this problem. Some documentation is available on the Daffodil wiki<https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Feature+to+support+enumerations+and+typeValueCalc>. The theory here is that you can define a simple type that is a restriction on xs:string as an xsd enumeration; then supply the corresponding binary values as a DFDL annotation:

<xs:simpleType name="uint8" dfdl:length="1">
  <xs:restriction base="xs:unsignedInt"/>
</xs:simpleType>

<xs:simpleType name="SomeEnumType" dfdlx:repType="tns:uint8">
  <xs:restriction base="xs:string">
    <xs:enumeration value="ENUM_1" dfdlx:repValues="0" />
    <xs:enumeration value="ENUM_2" dfdlx:repValues="1" />
    <xs:enumeration value="ENUM_3" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>

<xs:element name="SomeEnum" type="tns:SomeEnumType" />

The benefit here is that the schema is much more maintainable, and Daffodil will perform the lookup using a direct hash-table lookup, instead of needing to walk through an if-else tree.
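As an aside on the lookup point, here is a minimal Python sketch contrasting a table lookup with an if-else chain for the three values in this example. It is only an illustration of the idea and says nothing about how Daffodil implements it internally; the names REP_TO_ENUM, decode_with_table and decode_with_chain are hypothetical.

```python
from typing import Optional

# Hypothetical table mirroring the dfdlx:repValues in the example:
# representation byte -> logical enumeration string.
REP_TO_ENUM = {0: "ENUM_1", 1: "ENUM_2", 2: "ENUM_3"}


def decode_with_table(rep: int) -> Optional[str]:
    """Resolve a representation value with a single dictionary lookup."""
    return REP_TO_ENUM.get(rep)  # None when the value has no mapping


def decode_with_chain(rep: int) -> Optional[str]:
    """Same mapping written as an if-else chain, one comparison per branch."""
    if rep == 0:
        return "ENUM_1"
    elif rep == 1:
        return "ENUM_2"
    elif rep == 2:
        return "ENUM_3"
    return None


if __name__ == "__main__":
    # Iterating over bytes yields ints, one per stored enum byte.
    for byte in b"\x00\x01\x02\x03":
        print(byte, decode_with_table(byte), decode_with_chain(byte))
```

Both functions give the same answer for every byte value; the table version stays a single lookup however many enumeration values are added, and the mapping lives in one place.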
I then ran into some issues with the above recommendation:

Daffodil produces the following error for the above schema:

[error] Schema Definition Error: When lengthKind='implicit', both minLength and maxLength facets must be specified.

After adding xs:minLength and xs:maxLength, the parser complains that they need to be the same value. With both set to the same value, the parser then crashes. I'm not sure what these need to be.

I found this JIRA issue (https://issues.apache.org/jira/browse/DAFFODIL-2146). It uses the inputTypeCalcString inputValueCalc function, but that just throws the error that inputTypeCalcString is an unsupported function. It seems these are deprecated in version 2.4.0 even though the fix version for these are version 2.4.0.

What I have realised is that it can translate from one type to another only if that type is the exact same length. So this works:

<xs:simpleType name="SomeEnumType" dfdlx:repType="xs:unsignedByte">
  <xs:restriction base="xs:unsignedByte">
    <xs:enumeration value="55" dfdlx:repValues="0" />
    <xs:enumeration value="56" dfdlx:repValues="1" />
    <xs:enumeration value="57" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>

The value 0 is translated to 55, 1 to 56 and 2 to 57. But the moment I change the translated base to something else, Daffodil doesn't like it, e.g.

<xs:simpleType name="SomeEnumType" dfdlx:repType="xs:unsignedByte">
  <xs:restriction base="xs:unsignedShort">
    <xs:enumeration value="55" dfdlx:repValues="0" />
    <xs:enumeration value="56" dfdlx:repValues="1" />
    <xs:enumeration value="57" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>

It complains that shorts are 16 bits in length (the repType base is 8 bits).

Any ideas/help?

Thanks

Pirow Engelbrecht | Senior Design Engineer
