So when a separator can terminate an element or group it is considered to be 
"terminating markup".

This may be a definition missing in the spec.

It looks like a Daffodil bug to me that this is not putting in the escape block 
start/end when an NL is in the data and is an in-scope separator.


________________________________
From: Sloane, Brandon <[email protected]>
Sent: Saturday, November 16, 2019 8:54:36 PM
To: [email protected] <[email protected]>
Subject: Re: Bug in Daffodil - unparsing loses quotes around a CSV field

This looks to me like a bug in the spec.

According to the spec, generateEscapeBlock="whenNeeded" generates an escape 
block when the data contains any of the following:

  *   any in-scope terminating delimiter
  *   dfdl:escapeBlockStart at the start of the data
  *   any dfdl:extraEscapedCharacters

In your schema, %NL; is an infix separator, which does not qualify.

There are 2 simple workarounds I see:

1) Use generateEscapeBlock="always". This is ugly, but should be technically 
correct.
2) Add %LF; to dfdl:extraEscapedCharacters

The reason we use %LF; instead of %NL; is that dfdl:extraEscapedCharacters has 
rather strict restrictions on what is allowed to be used, and %NL; is 
explicitly not permitted. To get the exact same behaviour you would expect from 
%NL; you would need to use "%LF; %CR; %NEL; %LS;" instead, but if you only care 
about UNIX and DOS style line endings, %LF; will suffice.
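
As a rough illustration of the rule quoted above and of workaround 2, here is a 
toy model in Python. It is not Daffodil's implementation; the function name, 
its parameters, and the sample field value are all hypothetical, and it only 
encodes the three conditions as I read them.

```python
# Toy model of the "whenNeeded" rule as described in this message.
# NOT Daffodil's implementation; every name here is hypothetical.

def needs_escape_block(data, terminating_delimiters, escape_block_start,
                       extra_escaped_characters):
    """Return True if the quoted rule would generate an escape block."""
    # Condition 1: the data contains an in-scope terminating delimiter.
    if any(d in data for d in terminating_delimiters):
        return True
    # Condition 2: the data starts with escapeBlockStart.
    if data.startswith(escape_block_start):
        return True
    # Condition 3: the data contains one of the extraEscapedCharacters.
    return any(c in data for c in extra_escaped_characters)


field = "Great car\nlow mileage"  # hypothetical field containing a line feed

# Under the reading above, the infix %NL; separator is not counted as a
# terminating delimiter, so it is left out of the list.
before = needs_escape_block(field, [], '"', [])

# Workaround 2: list the line feed in extraEscapedCharacters.
after = needs_escape_block(field, [], '"', ["\n"])

print(before, after)  # prints: False True
```

If the newline separator were instead treated as terminating markup, it would 
go in the second argument and the first call would return True as well.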

Below are all of the changes I had to make. Most of these are not directly 
relevant to your question.


  *   Added xmlns:fn="http://www.w3.org/2005/xpath-functions" (I thought this 
was included automatically in TDML files?)
  *   Switched outputNewLine to UNIX style instead of DOS style. This reflects 
the fact that the test cases seem to be in UNIX style (at least on my Linux 
box; something might have translated them without us knowing)
  *   Added %LF; to extraEscapedCharacters
  *   Replaced ""Great car"" with \"Great car\"

Regards,
Brandon
________________________________
From: Costello, Roger L. <[email protected]>
Sent: Saturday, November 16, 2019 7:27 AM
To: [email protected] <[email protected]>
Subject: Bug in Daffodil - unparsing loses quotes around a CSV field


Hi Folks,



In CSV a field may span multiple lines if the field is wrapped in double quotes.



I have a CSV record that has a field that spans two lines and the field is 
wrapped in double quotes. Daffodil parses it perfectly but unparsing loses the 
double quotes. See graphic below and see attached TDML file. I believe this is 
a bug. Do you agree? Is there a workaround?  /Roger



[cid:[email protected]]
