I believe SL is correct; this is expected behavior. It is data canonicalization, which typically happens when a parser tolerates many different input formats but the infoset doesn't capture which of them was actually used. The DFDL schema considers them 100% equivalent, so the output when unparsing is the canonical representation of that information.
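A toy sketch of that canonicalization, in Python rather than DFDL. The lookup table, the function names, and the tuple used as an "infoset" are all made up for illustration; this is not how Daffodil or ICU actually implement it. It only shows that several input spellings collapse to one offset, so unparsing cannot recover the original spelling, while re-parsing the canonical output gives the same infoset.

```python
import re

# Hypothetical table of zone spellings a lenient parser might accept.
# Both spellings collapse to the same UTC offset, in minutes.
ZONE_OFFSETS = {
    "Los Angeles Time": -480,
    "GMT-08:00": -480,
}


def parse(text):
    """Parse 'h:mm.<zone>' into a toy infoset: (hour, minute, offset_minutes)."""
    match = re.fullmatch(r"(\d{1,2}):(\d{2})\.(.+)", text)
    if match is None:
        raise ValueError(f"unrecognized time: {text!r}")
    hour, minute, zone = int(match.group(1)), int(match.group(2)), match.group(3)
    # Only the offset survives; the original zone spelling is discarded here.
    return (hour, minute, ZONE_OFFSETS[zone])


def unparse(infoset):
    """Write the toy infoset back out in a single canonical form."""
    hour, minute, offset = infoset
    sign = "-" if offset < 0 else "+"
    off_hours, off_minutes = divmod(abs(offset), 60)
    return f"{hour:02d}:{minute:02d}.GMT{sign}{off_hours:02d}:{off_minutes:02d}"


original = "8:43.Los Angeles Time"
first = parse(original)
canonical = unparse(first)
second = parse(canonical)

print(canonical)              # 08:43.GMT-08:00
print(canonical == original)  # False: the bytes do not round-trip
print(first == second)        # True: the infosets do
```

Comparing infosets after a second parse succeeds even though a byte-for-byte comparison of the unparsed data against the original fails.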
In general, to deal with this we use roundTrip="twoPass" tests in TDML. You probably need to change some one-pass tests to two-pass. That way the test parses, unparses (to the canonical representation), then parses again and compares infosets. On that second parse, it gets the same infoset from 08:43.GMT-08:00 as it did from 8:43.Los Angeles Time, so the test will pass.

________________________________
From: Larry Barber <larry.bar...@nteligen.com>
Sent: Friday, January 8, 2021 9:36 AM
To: dev@daffodil.apache.org <dev@daffodil.apache.org>
Subject: RE: Timezones in DFDL

This reminds me of the case where there are multiple possible delimiters - the one provided in the original file may not be the one that appears in the unparse output.

-----Original Message-----
From: Steve Lawrence [mailto:slawre...@apache.org]
Sent: Friday, January 8, 2021 9:18 AM
To: dev@daffodil.apache.org
Subject: Timezones in DFDL

I was confirming that DAFFODIL-1580 [1] is still an issue and was going to open a bug with ICU, but as I look more at this, I think this is just a limitation with timezones and DFDL, but wanted confirmation first.

For example, we have a test schema that looks like this:

  <xs:element name="time" type="xs:dateTime" dfdl:calendarPattern="hh:mm.VVVV" ... />

And matching data that looks like this:

  8:43.Los Angeles Time

This parses to an infoset that looks like this:

  <time>08:43:00-08:00</time>

And that infoset unparses to this:

  08:43.GMT-08:00

Note that the unparsed timezone does not match the original data. DAFFODIL-1580 describes this behavior as a bug (either in Daffodil or ICU) but I think this is actually expected behavior. A DFDL infoset does not contain any location-specific timezone information--it only contains a GMT offset (a restriction of XML Schema). So this data will always unparse to a non-location specific timezone, depending on the calendar pattern.
For some patterns this will be an offset or a generic timezone like PST (which should both roundtrip fine), but others might result in "Unknown" or "unk". I think this only affects the "V" and "v" calendar patterns, but additional tests should be added to confirm this behavior.

This is the expected behavior, correct?

[1] https://issues.apache.org/jira/browse/DAFFODIL-1580