I looked at TDML <https://daffodil.apache.org/tdml/> but could not find a convenient tool for generating a TDML file from inputs, so instead I've attached the relevant ~.dfdl.xsd and input/output ~.csv files.

After adjusting for the respective file paths, below are the results of parse, unparse, diff, and xmllint, respectively. Running 'diff' without the '-qs' options reports 'No newline at end of file'.

---------------------------------------------------
+ daffodil parse --validate=on -s csv-version4.dfdl.xsd -r csv-version4.dfdl.xsd -Dheader=present -o out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml /home/attila/CDES/trunk/perstempo/data/JITC/CT/USSOCOM_PERSTEMPO_CT1.csv
+ parse_exit_code=0
+ daffodil unparse --validate=on -s csv-version4.dfdl.xsd -r csv-version4.dfdl.xsd '-DSeparator=|' header=present -o out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.csv out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml
+ unparse_exit_code=0
+ diff -qs /home/attila/CDES/trunk/perstempo/data/JITC/CT/USSOCOM_PERSTEMPO_CT1.csv out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.csv
+ diff_exit_code=1
+ xmllint --schema csv-version4.dfdl.xsd out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml --noout
out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml validates
+ xmllint_exit_code=0
---------------------------------------------------

thx in advance - attila

On Mon, Jun 21, 2021 at 3:20 PM Beckerle, Mike <[email protected]> wrote:

Attila,

It took me a bit to spot this, and I'm really not sure I am correct here.

I think you need one more sequence. If you insert another element of type "TailType-perstempo" at the end, it doesn't want to be inside the sequence with NL infix separators. It wants to be after that sequence has ended, but inside a surrounding sequence that is the model group of the complexType of the csv-version4... element, but which has no separators.

Given the images you provided, I can't cut/paste to try this theory out.

I would like to make self-contained TDML files the way we all exchange examples/bug-reports. Could you make a TDML file? (See https://daffodil.apache.org/tdml/)

Their beauty is that they can be fully self-contained, i.e., contain schema, data, and expected results all together. Everything needed to reproduce can be in the same file.

-mikeb

--------------------------------------------------------------------------------
From: Attila Horvath <[email protected]>
Sent: Monday, June 21, 2021 1:27 PM
To: [email protected]
Subject: how to incorporate file terminator into generic CSV schema?

I have the following generic variable field/record length schema, which daffodil 2.4.0 parses/unparses verbatim except for a "No newline at end of file" error when I diff the original CSV against the reconstituted CSV. Otherwise the reconstituted CSV appears to match the original CSV:...

image.png

To get around this I've tried, and failed, to incorporate the code block in *RED* into the code block in *YELLOW* (see code block image). I've used the code block in *RED* successfully, but not with variable fields/records CSV via "...fn:count...".

Can someone pls suggest how to correctly integrate an unknown file terminator into the code block in *YELLOW* in this schema?

image.png

Thx in advance,
Attila
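
A side note on the 'No newline at end of file' message from diff discussed in this thread: one way to confirm that a missing final newline is the only difference between the original and the reconstituted CSV is to compare the two files after adding the newline to whichever copy lacks it. The following is a minimal sketch, assuming a POSIX shell with tail, cmp, and mktemp available. The function names ends_with_newline and same_modulo_final_newline are hypothetical helpers made up for this illustration; they are not part of daffodil or diff.

```shell
# ends_with_newline FILE
# Succeeds if FILE is non-empty and its last byte is a newline.
# Command substitution strips a trailing newline, so a final "\n"
# leaves an empty string behind.
ends_with_newline() {
  [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# same_modulo_final_newline A B
# Succeeds if A and B are byte-identical once a final newline has been
# appended to whichever of them does not already end with one.
# Works on temporary copies; the input files are not modified.
same_modulo_final_newline() {
  _a=$(mktemp)
  _b=$(mktemp)
  cp "$1" "$_a"
  cp "$2" "$_b"
  ends_with_newline "$_a" || printf '\n' >> "$_a"
  ends_with_newline "$_b" || printf '\n' >> "$_b"
  cmp -s "$_a" "$_b"
  _rc=$?
  rm -f "$_a" "$_b"
  return "$_rc"
}
```

With the functions defined, something like `same_modulo_final_newline original.csv roundtrip.csv && echo "same apart from the final newline"` (file names are placeholders) reports whether the round trip is otherwise verbatim. This only diagnoses the difference; it does not change how the schema handles the file terminator.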
defaults.dfdl.xsd
Description: Binary data
csv-version4.dfdl.xsd
Description: Binary data
USSOCOM_PERSTEMPO_CT1.csv
Description: MS-Excel spreadsheet
_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.csv
Description: MS-Excel spreadsheet
