Hi Steve,
> It looks like not allowing grouping separators with
> scientific notation is a limitation of ICU, and so isn't
> supported by Daffodil.
Is there a workaround?
This
123,456e3
seems like a very reasonable format that one would like Daffodil to process.
Being unable to process this format seems like a significant limitation.
/Roger
-----Original Message-----
From: Steve Lawrence <[email protected]>
Sent: Friday, August 16, 2019 12:57 PM
To: [email protected]
Subject: [EXT] Re: Best Practice in describing integers?
It looks like not allowing grouping separators with scientific notation is a
limitation of ICU, and so isn't supported by Daffodil. I don't think I've ever
seen such numbers, so I'm not sure this is too important, but I don't think
Daffodil is capable of parsing that to a number.
The point still stands that there are a lot of complex number representations
that Daffodil can canonicalize so that users do not have to worry about how a
number was represented in the original format.
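
To make the canonicalization point concrete, here is a minimal Python sketch of the manual work a consumer would otherwise have to do on a raw string such as (123,456e3). It is not Daffodil or ICU code. The function name canonicalize is hypothetical, and it assumes that parentheses mark a negative value, "," is the grouping separator, and "." is the decimal separator.

```python
from decimal import Decimal


def canonicalize(text: str) -> int:
    """Turn a string like "(123,456e3)" into a plain integer.

    Hypothetical helper for illustration only. Assumes parentheses mean
    negative, "," is the grouping separator, "." is the decimal separator.
    """
    s = text.strip()

    # Accounting-style parentheses mark a negative value.
    negative = s.startswith("(") and s.endswith(")")
    if negative:
        s = s[1:-1]

    # Remove grouping separators.
    s = s.replace(",", "")

    # Decimal accepts "e"/"E" exponents and expands them exactly.
    value = Decimal(s)
    if value != value.to_integral_value():
        raise ValueError(f"not an integer: {text!r}")

    result = int(value)
    return -result if negative else result


print(canonicalize("(123,456e3)"))
```

Every consumer of the raw string would have to repeat logic like this, which is the work that a canonical integer in the infoset avoids.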
- Steve
On 8/16/19 12:47 PM, Costello, Roger L. wrote:
> (123,456e3)
>
> How do I describe that number in DFDL? I tried this:
>
> <xs:element name="NumberOfStudents" type="xs:integer"
> dfdl:textStandardDecimalSeparator="."
> dfdl:textNumberPattern="###,###E#;-#"
> dfdl:textStandardExponentRep="E"
> dfdl:ignoreCase="yes"
> dfdl:textNumberCheckPolicy="lax"/>
>
> But that resulted in this error message:
>
> *[error] Schema Definition Error: Invalid textNumberPattern: Malformed
> pattern for ICU DecimalFormat: "###,###E#;-#": Cannot have grouping
> separator in scientific notation at position 7*
>
> What is the correct way to describe the input data?
>
> /Roger
>
> -----Original Message-----
> From: Steve Lawrence <[email protected]>
> Sent: Wednesday, August 14, 2019 3:27 PM
> To: [email protected]
> Subject: [EXT] Re: Best Practice in describing integers?
>
> I think you hit on the biggest drawback of using a pattern facet: it
> becomes very difficult to use the value. By using the DFDL
> properties, Daffodil will canonicalize the number so users don't have
> to worry about what the actual form was in the data. In your simple
> example, it's not too bad, but what if the data contained:
>
> (123,456e3)
>
> This is a negative number with grouping separators and an exponent. If
> this was put in the infoset, a user would need to strip parentheses
> and commas, realize it's negative, and expand the exponent to figure
> out what the number is. But if you use DFDL properties, Daffodil does all that
> for you, and the infoset contains:
>
> -123456000
>
> Which is much easier to use and reason about.
>
> Related, since the number is an actual integer in the infoset, you can
> now use things like min/maxInclusive to validate the int value. You
> can't do this if you treat it as a string. The checkConstraints/pattern
> facet doesn't allow this kind of validation.
>
> On 8/14/19 12:23 PM, Costello, Roger L. wrote:
>
> > Hello DFDL community,
> >
> > Below are two ways a DFDL schema may describe integers. One way uses
> > lots of DFDL properties. The other way uses XML Schema facets. Which
> > way do you think is better? I've listed pros and cons of each way.
> > Are there other pros and cons?
> >
> > /Roger
> >
>