Could be a bug, but we do regression testing of UTF-8 that uses lots of multi-byte characters and such. So I'd be surprised.
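
In the meantime, here is a rough way to see which bytes the initiator ought to appear as in the data under each encoding. This is a minimal Python sketch using Python's standard codecs, not Daffodil's own charset handling; the helper name encode_initiator is made up for illustration.

```python
# Sketch only: Python's standard codecs, not Daffodil's charset code.
INITIATOR = "Lec\u0153ur"  # "Lecœur"; U+0153 is the oe ligature


def encode_initiator(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


for enc in ("iso-8859-1", "iso-8859-15", "utf-8"):
    data = encode_initiator(INITIATOR, enc)
    print(enc, "unmappable" if data is None else data.hex())

# Substituting unmappable characters turns the ligature into '?'.
print(INITIATOR.encode("iso-8859-1", errors="replace"))
```

The ligature has no ISO-8859-1 mapping, is the single byte 0xBD in ISO-8859-15, and is the two bytes 0xC5 0x93 in UTF-8. Comparing those against a hex dump of the data file would show which encoding the file is really in. The last line prints b'Lec?ur', which resembles the text in the error message, though whether Daffodil arrives at its '?' the same way is only a guess.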
We need to see the entire example including the data bytes you are parsing so we can reproduce.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Wednesday, September 18, 2019 1:16 PM
To: [email protected] <[email protected]>
Subject: Re: Is an oe ligature okay in the value of dfdl:initiator?

Hi Mike,

I changed the encoding to utf-8:

<xs:element name="input" type="xs:string" dfdl:initiator="Lecœur" dfdl:encoding="utf-8"/>

I get the same error message:

[error] Parse Error: Initiator 'Lec?ur' not found

Bug in Daffodil?

/Roger

From: Beckerle, Mike <[email protected]>
Sent: Wednesday, September 18, 2019 1:02 PM
To: [email protected]
Subject: [EXT] Re: Is an oe ligature okay in the value of dfdl:initiator?

You have a mismatch between the character set encoding of your DFDL schema, and the character set encoding it says is in the data.

Is your DFDL schema in UTF-8?

The character œ doesn't exist in iso-8859-1. If your data contains œ then the encoding must be iso-8859-15 or utf-8 or something that has the œ character.

I think it is a daffodil bug that you did not get a schema definition error when it read the string for your dfdl:initiator, but is not able to translate it into the encoding because some characters are illegal/unmapped.

I would like you to have gotten "SDE: initiator contains characters undefined in encoding iso-8859-1: 'œ' ".

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Wednesday, September 18, 2019 12:43 PM
To: [email protected] <[email protected]>
Subject: Is an oe ligature okay in the value of dfdl:initiator?

Hello DFDL community,

Here’s my input file (notice the œ ligature):

Lecœur Hello, world

Lecœur is the initiator.

Here’s my DFDL schema:

<xs:element name="input" type="xs:string" dfdl:initiator="Lecœur" dfdl:encoding="ISO-8859-1"/>

Running it yields this error message:

[error] Parse Error: Initiator 'Lec?ur' not found

Why am I getting this error message?

/Roger
