Thanks Brandon.

  *   1) Use generateEscapeBlock="always". This is ugly, but should be 
technically correct.
  *   2) Add %LF; to dfdl:extraEscapedCharacters


Your first suggestion didn't work - unparsing didn't put quotes around the 
string. Your second suggestion did work - thanks!

/Roger

From: Sloane, Brandon <[email protected]>
Sent: Saturday, November 16, 2019 8:55 PM
To: [email protected]
Subject: [EXT] Re: Bug in Daffodil - unparsing loses quotes around a CSV field

This looks like a bug in the spec to me.

According to the spec, generateEscapeBlock="whenNeeded" generates an escape 
block only when the data contains any of the following:

  *   any in-scope terminating delimiter
  *   dfdl:escapeBlockStart at the start of the data
  *   any dfdl:extraEscapedCharacters

In your schema, %NL; is an infix separator, which does not qualify.

There are 2 simple workarounds I see:

1) Use generateEscapeBlock="always". This is ugly, but should be technically 
correct.
2) Add %LF; to dfdl:extraEscapedCharacters

The reason we use %LF; instead of %NL; is that dfdl:extraEscapedCharacters has 
rather strict restrictions on what is allowed to be used, and %NL; is 
explicitly not permitted. To get the exact same behaviour you would expect from 
%NL; you would need to use "%LF; %CR; %NEL; %LS;" instead, but if you only care 
about UNIX and DOS style line endings, %LF; will suffice.
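
As a rough illustration of workaround 2, an escape scheme along the following lines should cause a field containing a line feed to be wrapped in an escape block on unparse. This is only a sketch: the scheme name csvQuoting, the double-quote block delimiters, and the backslash escapeEscapeCharacter are assumptions for the example, not values taken from the attached TDML file.

```xml
<!-- Sketch only: the name and the quote/escape characters below are assumed. -->
<dfdl:defineEscapeScheme name="csvQuoting">
  <dfdl:escapeScheme
      escapeKind="escapeBlock"
      escapeBlockStart="&quot;"
      escapeBlockEnd="&quot;"
      escapeEscapeCharacter="\"
      extraEscapedCharacters="%LF;"
      generateEscapeBlock="whenNeeded"/>
</dfdl:defineEscapeScheme>
```

To cover every line ending that %NL; matches, extraEscapedCharacters would instead be the space-separated list "%LF; %CR; %NEL; %LS;".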

Below are all of the changes I had to make. Most of these are not directly 
relevant to your question.


  *   Added xmlns:fn="http://www.w3.org/2005/xpath-functions" (I thought this 
was included automatically in TDML files?)
  *   Switched outputNewLine to UNIX style instead of DOS style. This reflects 
the fact that the test cases seem to be in UNIX style (at least on my Linux 
box; something might have translated the line endings without us knowing)
  *   Added %LF; to extraEscapedCharacters
  *   Replaced ""Great car"" with \"Great car\"

Regards,
Brandon
________________________________
From: Costello, Roger L. <[email protected]>
Sent: Saturday, November 16, 2019 7:27 AM
To: [email protected]
Subject: Bug in Daffodil - unparsing loses quotes around a CSV field


Hi Folks,

In CSV a field may span multiple lines if the field is wrapped in double quotes.



I have a CSV record that has a field that spans two lines and the field is 
wrapped in double quotes. Daffodil parses it perfectly but unparsing loses the 
double quotes. See graphic below and see attached TDML file. I believe this is 
a bug. Do you agree? Is there a workaround?  /Roger



[cid:[email protected]]
