There is a deeper problem with this example. It is a dateTime without a date, 
so it loses the distinction between when Los Angeles is at GMT-7 (daylight 
time) and when it is at GMT-8 (standard time).
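
To make that concrete, here is a rough sketch in Python. It is not Daffodil or 
ICU code, and the helper names (nth_sunday, la_offset_hours) are made up for 
illustration. It assumes the US daylight-saving rules in effect since 2007 
(second Sunday in March to first Sunday in November) and works at day 
granularity, which is enough for a wall-clock time like 8:43 because both 
transitions happen at 2:00 local time.

```python
from datetime import date, timedelta


def nth_sunday(year, month, n):
    """Return the date of the n-th Sunday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday == 0 ... Sunday == 6
    days_to_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def la_offset_hours(day):
    """GMT offset (in hours) for Los Angeles at 8:43 local time on `day`.

    Assumption: US rules since 2007. Hypothetical helper, not a library API.
    """
    dst_start = nth_sunday(day.year, 3, 2)   # second Sunday in March
    dst_end = nth_sunday(day.year, 11, 1)    # first Sunday in November
    return -7 if dst_start <= day < dst_end else -8


# The same "8:43.Los Angeles Time" maps to different offsets by date.
print(la_offset_hours(date(2021, 1, 8)))
print(la_offset_hours(date(2021, 7, 8)))
```

Without a date in the data there is nothing to pass in as `day`, so any fixed 
offset in the infoset is a guess for part of the year.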

> On Jan 8, 2021, at 6:48 AM, Beckerle, Mike <mbecke...@owlcyberdefense.com> 
> wrote:
> 
> I believe SL is correct. This is as expected. This is data 
> canonicalization, which very typically happens when a parser tolerates many 
> diverse formats but the data format doesn't capture which one was 
> specifically used. The DFDL schema considers them 100% equivalent. The 
> output when unparsing is then the canonical representation of that 
> information.
> 
> In general to deal with this we use roundTrip="twoPass" tests in TDML. You 
> probably need to change some one-pass tests to two-pass.
> 
> That way it parses, unparses (to the canonical representation) then parses 
> again and compares infosets. At that second parse, it will get the same 
> infoset from 8:43.Los Angeles Time as from 8:43.GMT-08:00 so the test will 
> pass.
> 
> 
> ________________________________
> From: Larry Barber <larry.bar...@nteligen.com>
> Sent: Friday, January 8, 2021 9:36 AM
> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
> Subject: RE: Timezones in DFDL
> 
> This reminds me of the case where there are multiple possible delimiters - 
> the one provided in the original file may not be the one that appears in the 
> unparse output.
> 
> -----Original Message-----
> From: Steve Lawrence [mailto:slawre...@apache.org]
> Sent: Friday, January 8, 2021 9:18 AM
> To: dev@daffodil.apache.org
> Subject: Timezones in DFDL
> 
> I was confirming that DAFFODIL-1580 [1] is still an issue and was going to 
> open a bug with ICU, but the more I look at this, the more I think it is 
> just a limitation of timezones in DFDL. I wanted confirmation first.
> 
> For example, we have a test schema that looks like this:
> 
> <xs:element name="time" type="xs:dateTime"
>   dfdl:calendarPattern="hh:mm.VVVV" ... />
> 
> And matching data that looks like this:
> 
>  8:43.Los Angeles Time
> 
> This parses to an infoset that looks like this:
> 
>  <time>08:43:00-08:00</time>
> 
> And that infoset unparses to this:
> 
>  08:43.GMT-08:00
> 
> Note that the unparsed timezone does not match the original data.
> DAFFODIL-1580 describes this behavior as a bug (either in Daffodil or
> ICU) but I think this is actually expected behavior. A DFDL infoset does not 
> contain any location-specific timezone information--it only contains a GMT 
> offset (a restriction of XML Schema). So this data will always unparse to a 
> non-location specific timezone, depending on the calendar pattern. For some 
> patterns this will be an offset or a generic timezone like PST (which should 
> both roundtrip fine), but others might result in "Unknown" or "unk". I think 
> this only affects the "V" and "v" calendar patterns, but additional tests 
> should be added to confirm this behavior.
> 
> This is the expected behavior, correct?
> 
> 
> [1] https://issues.apache.org/jira/browse/DAFFODIL-1580
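
For what it's worth, the one-pass versus two-pass comparison discussed above 
can be sketched with a toy model in Python. This is not how Daffodil or TDML 
is implemented. The parse/unparse functions and the ZONE_OFFSETS table are 
hypothetical stand-ins that handle only the two strings from this thread.

```python
import re

# Hypothetical lookup from a location-style zone name to a GMT offset.
# The -08:00 value mirrors the infoset quoted in this thread.
ZONE_OFFSETS = {"Los Angeles Time": "-08:00"}


def parse(text):
    """Toy parser: 'h:mm.<zone>' -> infoset value 'hh:mm:00<offset>'."""
    m = re.fullmatch(r"\s*(\d{1,2}):(\d{2})\.(.+)", text)
    if m is None:
        raise ValueError(f"unrecognized time: {text!r}")
    hour, minute, zone = m.groups()
    # Both spellings collapse to the same offset; the zone name is discarded.
    offset = zone[3:] if zone.startswith("GMT") else ZONE_OFFSETS[zone]
    return f"{int(hour):02d}:{minute}:00{offset}"


def unparse(infoset):
    """Toy unparser: always emits the canonical 'hh:mm.GMT<offset>' form."""
    return f"{infoset[0:2]}:{infoset[3:5]}.GMT{infoset[8:]}"


def one_pass_ok(data):
    """One-pass check: unparsed output must equal the original data."""
    return unparse(parse(data)) == data


def two_pass_ok(data):
    """Two-pass check: parse, unparse, parse again, compare infosets."""
    first = parse(data)
    second = parse(unparse(first))
    return first == second


original = " 8:43.Los Angeles Time"
print(unparse(parse(original)))
print(one_pass_ok(original), two_pass_ok(original))
```

The one-pass check fails because the unparser only knows the offset, while the 
two-pass check passes because both spellings parse to the same infoset.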