I don't think there is a workaround, but I'm not sure it's that big of a
limitation in practice. Most scientific notation numbers I've come
across either a) have no grouping separator (e.g. 123456e3) or b) have
a significand less than ten, and so have no need for grouping
separators. For example, most commonly this

  123,456e3

is really going to be written like this:

  1.23456e8

Which Daffodil can support.
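
To make the equivalence of the two spellings concrete, here is a minimal
sketch in Python. It runs outside Daffodil and is not a Daffodil feature;
the function name canonicalize is hypothetical, and reading surrounding
parentheses as a negative sign is an assumption taken from the
(123,456e3) example earlier in this thread.

```python
from decimal import Decimal


def canonicalize(text: str) -> int:
    """Hypothetical helper: convert text such as '(123,456e3)' to an int."""
    s = text.strip()
    # Assumption: surrounding parentheses mark a negative number.
    negative = s.startswith("(") and s.endswith(")")
    if negative:
        s = s[1:-1]
    # Drop grouping separators, then let Decimal expand the exponent exactly.
    value = Decimal(s.replace(",", ""))
    if value != value.to_integral_value():
        raise ValueError(f"not an integer: {text!r}")
    result = int(value)
    return -result if negative else result


# Both spellings denote the same integer.
print(canonicalize("123,456e3"))    # 123456000
print(canonicalize("1.23456e8"))    # 123456000
print(canonicalize("(123,456e3)"))  # -123456000
```

Decimal is used instead of float so the exponent is expanded without
rounding.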

- Steve

On 8/16/19 1:25 PM, Costello, Roger L. wrote:
> Hi Steve,
> 
>> It looks like not allowing grouping separators with 
>> scientific notation is a limitation of ICU, and so isn't 
>> supported by Daffodil.
> 
> Is there a workaround?
> 
> This 
> 
>      123,456e3
> 
> seems like a very reasonable format that one would like Daffodil to process. 
> Being unable to process this format seems like a significant limitation.
> 
> /Roger
> 
> -----Original Message-----
> From: Steve Lawrence <[email protected]> 
> Sent: Friday, August 16, 2019 12:57 PM
> To: [email protected]
> Subject: [EXT] Re: Best Practice in describing integers?
> 
> It looks like not allowing grouping separators with scientific notation is a 
> limitation of ICU, and so isn't supported by Daffodil. I don't think I've 
> ever seen such numbers, so I'm not sure this is too important, but I don't 
> think Daffodil is capable of parsing that to a number.
> 
> The point still stands that there are a lot of complex number representations 
> that Daffodil can canonicalize so that users do not have to worry about how a 
> number was represented in the original format.
> 
> - Steve
> 
> On 8/16/19 12:47 PM, Costello, Roger L. wrote:
>> (123,456e3)
>>
>> How do I describe that number in DFDL? I tried this:
>>
>> <xs:element name="NumberOfStudents" type="xs:integer"
>>              dfdl:textStandardDecimalSeparator="."
>>              dfdl:textNumberPattern="###,###E#;-#"
>>              dfdl:textStandardExponentRep="E"
>>              dfdl:ignoreCase="yes"
>>              dfdl:textNumberCheckPolicy="lax"/>
>>
>> But that resulted in this error message:
>>
>> *[error] Schema Definition Error: Invalid textNumberPattern: Malformed 
>> pattern for ICU DecimalFormat: "###,###E#;-#": Cannot have grouping 
>> separator in scientific notation at position 7*
>>
>> What is the correct way to describe the input data?
>>
>> /Roger
>>
>> -----Original Message-----
>> From: Steve Lawrence <[email protected]>
>> Sent: Wednesday, August 14, 2019 3:27 PM
>> To: [email protected]
>> Subject: [EXT] Re: Best Practice in describing integers?
>>
>> I think you hit on the biggest drawback of using a pattern facet: it 
>> becomes very difficult to use the value. By using the DFDL 
>> properties, Daffodil will canonicalize the number so users don't have 
>> to worry about what the actual form was in the data. In your simple 
>> example, it's not too bad, but what if the data contained:
>>
>>    (123,456e3)
>>
>> This is a negative number with grouping separators and an exponent. If 
>> this was put in the infoset, a user would need to strip parentheses 
>> and commas, realize it's negative, and expand the exponent to figure 
>> out what the number is. But if you use DFDL properties, Daffodil does all 
>> that for you, and the infoset contains:
>>
>>    -123456000
>>
>> Which is much easier to use and reason about.
>>
>> Related, since the number is an actual integer in the infoset, you can 
>> now use things like min/maxInclusive to validate the int value. You 
>> can't do this if you treat it as a string. The checkConstraints/pattern 
>> facet doesn't allow this kind of validation.
>>
>> On 8/14/19 12:23 PM, Costello, Roger L. wrote:
>>  > Hello DFDL community,
>>  >
>>  > Below are two ways a DFDL schema may describe integers. One way uses
>>  > lots of DFDL properties. The other way uses XML Schema facets. Which
>>  > way do you think is better? I've listed pros and cons of each way.
>>  > Are there other pros and cons?
>>  > /Roger
>>  >
> 
