Unfortunately, PonyMail doesn't render HTML-formatted emails well when they
contain inline images and formatting.
This morning's response is attached in HTML format: "how to incorporate file
terminator into generic CSV schema .html"

---------- Forwarded message ---------
From: attila horvath <[email protected]>
Date: Tue, Jun 22, 2021 at 5:53 AM
Subject: Re: how to incorporate file terminator into generic CSV schema?
To: <[email protected]>, Beckerle, Mike <
[email protected]>


I looked at and considered TDML <https://daffodil.apache.org/tdml/>
but could not find a convenient tool for generating a TDML file from
inputs, so instead...

I've attached the relevant ~.dfdl.xsd and input/output ~.csv files.

After adjusting for respective file paths, below are the results of: parse,
unparse, diff, and xmllint respectively where 'diff' [without the '-qs'
options] indicates 'No newline at end of file' error...
---------------------------------------------------

+ daffodil parse --validate=on -s csv-version4.dfdl.xsd -r csv-version4.dfdl.xsd -Dheader=present -o out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml /home/attila/CDES/trunk/perstempo/data/JITC/CT/USSOCOM_PERSTEMPO_CT1.csv
+ parse_exit_code=0
+ daffodil unparse --validate=on -s csv-version4.dfdl.xsd -r csv-version4.dfdl.xsd '-DSeparator=|' header=present -o out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.csv out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml
+ unparse_exit_code=0
+ diff -qs /home/attila/CDES/trunk/perstempo/data/JITC/CT/USSOCOM_PERSTEMPO_CT1.csv out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.csv
+ diff_exit_code=1
+ xmllint --schema csv-version4.dfdl.xsd out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml --noout
out/_-_home_-_attila_-_CDES_-_trunk_-_perstempo_-_data_-_JITC_-_CT_-_USSOCOM_PERSTEMPO_CT1.xml validates
+ xmllint_exit_code=0

---------------------------------------------------

thx in advance - attila


On Mon, Jun 21, 2021 at 3:20 PM Beckerle, Mike <
[email protected]> wrote:

> Attila,
>
> It took me a bit to spot this, and I'm really not sure I am correct here.
>
> I think you need one more sequence. If you insert another element of type
> "TailType-perstempo" at the end, it doesn't want to be inside the sequence
> with NL infix separators. It wants to be after that sequence has ended, but
> inside a surrounding sequence that is the model group of the complexType of
> the csv-version4... element, but which has no separators.
>
> Given the images you provided, I can't cut/paste to try this theory out.
>
> I would like to make self-contained TDML files be the way we all exchange
> examples/bug-reports.
>
> Could you make a TDML file?  (See https://daffodil.apache.org/tdml/)
>
> Their beauty is that they can be fully self-contained, i.e., contain
> schema, data, and expected results all together. Everything to reproduce
> can be in the same file.
>
> -mikeb
>
>
>
>
>
> ------------------------------
> *From:* Attila Horvath <[email protected]>
> *Sent:* Monday, June 21, 2021 1:27 PM
> *To:* [email protected] <[email protected]>
> *Subject:* how to incorporate file terminator into generic CSV schema?
>
> I have the following generic variable field/record length schema, which
> daffodil 2.4.0 parses/unparses verbatim except for a "No newline at end of
> file" error when I diff the original CSV against the reconstituted CSV.
> Otherwise the reconstituted CSV appears to match the original CSV:...
> [image: image.png]
>
> To get around this I've tried/failed to incorporate code block in *RED*
> into code block in *YELLOW* (see code block image). I've used code block
> in *RED* successfully but not w/ variable fields/records CSV via
> "...fn:count...".
> Can someone pls suggest how to correctly integrate unknown file terminator
> into code block in *YELLOW* in this schema?
> [image: image.png]
>
> Thx in advance,
>
> Attila
>
>
>
>
>
<?xml version="1.0" encoding="UTF-8"?> 

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
		   xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
		   elementFormDefault="qualified">
	
	<xs:annotation>
		<xs:appinfo source="http://www.ogf.org/dfdl/">
			<dfdl:defineFormat name="default-dfdl-properties">
				<dfdl:format 
					alignment="1" 
					alignmentUnits="bytes"  
					binaryFloatRep="ieee" 
					binaryNumberRep="binary"  
					bitOrder="mostSignificantBitFirst"
					byteOrder="bigEndian"  
					calendarPatternKind="implicit"
					choiceLengthKind="implicit"
					documentFinalTerminatorCanBeMissing="yes" 
					emptyValueDelimiterPolicy="none"
					encoding="ISO-8859-1"
					encodingErrorPolicy="replace" 
					escapeSchemeRef=""  
					fillByte="f" 
					floating="no" 
					ignoreCase="no" 
					initiator="" 
					initiatedContent="no" 
					leadingSkip="0" 
					lengthKind="delimited"
					lengthUnits="bytes"  
					nilKind="literalValue"  
					nilValueDelimiterPolicy="none"
					occursCountKind="implicit"
					outputNewLine="%CR;%LF;"
					representation="text" 
					separator=""
					separatorPosition="infix"
					separatorSuppressionPolicy="anyEmpty"  
					sequenceKind="ordered" 
					terminator=""   
					textBidi="no" 
					textNumberCheckPolicy="strict"
					textNumberPattern="#,##0.###;-#,##0.###" 
					textNumberRep="standard" 
					textNumberRounding="explicit"  
					textNumberRoundingIncrement="0"
					textNumberRoundingMode="roundUnnecessary" 
					textOutputMinLength="0" 
					textPadKind="none" 
					textStandardBase="10"
					textStandardDecimalSeparator="."
					textStandardExponentRep="E"
					textStandardInfinityRep="Inf"  
					textStandardNaNRep="NaN"
					textStandardZeroRep="0" 
					textStandardGroupingSeparator="," 
					textTrimKind="none" 
					trailingSkip="0" 
					truncateSpecifiedLengthString="no" 
					utf16Width="fixed" 
				/>
			</dfdl:defineFormat>
		</xs:appinfo>
	</xs:annotation>
	
</xs:schema>
<?xml version="1.0" encoding="UTF-8"?> 
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
	xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
	xmlns:fn="http://www.w3.org/2005/xpath-functions"
	xmlns:math="http://www.w3.org/2005/xpath-functions/math"
	elementFormDefault="qualified">
	
	<xs:include schemaLocation="defaults.dfdl.xsd" />
	
	<xs:annotation>
		<xs:appinfo source="http://www.ogf.org/dfdl/">
			<dfdl:defineEscapeScheme name="Quotes">
				<dfdl:escapeScheme 
					escapeKind="escapeBlock"
					escapeBlockStart='"'
					escapeBlockEnd='"' 
					escapeEscapeCharacter="\"
					extraEscapedCharacters="" 
					generateEscapeBlock="whenNeeded"/>
			</dfdl:defineEscapeScheme>
			<dfdl:format ref="default-dfdl-properties"/>
			<dfdl:defineVariable name="Separator" type="xs:string" external="true">,</dfdl:defineVariable>
			<dfdl:defineVariable name="header" type="xs:string" external="true">present</dfdl:defineVariable>
			<dfdl:defineFormat name="fieldSeparator">
				<dfdl:format separator="{ $Separator }" separatorPosition="infix"/>
			</dfdl:defineFormat>
		</xs:appinfo>
	</xs:annotation>
	 
	<xs:element name="csv-version4.dfdl.xsd">
		<xs:complexType>
			<xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix" dfdl:separatorSuppressionPolicy="trailingEmpty">
				<xs:element name="header">
					<xs:complexType>
						<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
							<xs:element name="title" maxOccurs="unbounded" type="xs:string" />
						</xs:sequence>
					</xs:complexType>
				</xs:element>
				<xs:element name="record" maxOccurs="unbounded">
					<xs:complexType>
						<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
							<xs:element name="field" maxOccurs="unbounded" type="xs:string"
								dfdl:occursCount="{ fn:count(../../header/title) }"
								dfdl:occursCountKind="expression" />
						</xs:sequence>
					</xs:complexType>
				</xs:element>
			</xs:sequence>
		</xs:complexType>
	</xs:element>
	
	<xs:complexType name="TailType-perstempo">
		<xs:sequence>
			<xs:element name="contents" type="xs:string" dfdl:lengthKind="pattern" dfdl:lengthPattern=".*?(?=\r\n|\n|\z)"/>
			<xs:choice>
				<xs:element name="CRLF" type="xs:string" dfdl:lengthKind="explicit" dfdl:length="0" dfdl:initiator="%CR;%LF;"/>
				<xs:element name="LF" type="xs:string" dfdl:lengthKind="explicit" dfdl:length="0" dfdl:initiator="%LF;"/>
				<xs:element name="NIL" type="xs:string" dfdl:lengthKind="explicit" dfdl:length="0"/>
			</xs:choice>
		</xs:sequence>
	</xs:complexType>
	
</xs:schema>
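
The "one more sequence" idea from the thread might look roughly like the fragment below. This is an untested sketch, not a confirmed fix: the reply proposing it says the theory could not be tried out. It assumes the root element of the schema above is restructured so that the existing NL-separated sequence is wrapped in an outer sequence with no separator (inherited from separator="" in the default format), and that a new element, here given the hypothetical name "tail", of type "TailType-perstempo" follows the inner sequence.

```xml
<xs:element name="csv-version4.dfdl.xsd">
	<xs:complexType>
		<!-- Outer sequence: declares no separator, so it inherits separator="" -->
		<xs:sequence>
			<!-- Inner sequence: unchanged from the schema above (NL infix separators) -->
			<xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="infix" dfdl:separatorSuppressionPolicy="trailingEmpty">
				<xs:element name="header">
					<xs:complexType>
						<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
							<xs:element name="title" maxOccurs="unbounded" type="xs:string" />
						</xs:sequence>
					</xs:complexType>
				</xs:element>
				<xs:element name="record" maxOccurs="unbounded">
					<xs:complexType>
						<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
							<xs:element name="field" maxOccurs="unbounded" type="xs:string"
								dfdl:occursCount="{ fn:count(../../header/title) }"
								dfdl:occursCountKind="expression" />
						</xs:sequence>
					</xs:complexType>
				</xs:element>
			</xs:sequence>
			<!-- "tail" is a hypothetical name; it sits after the NL-separated
			     sequence has ended, so it is not itself preceded by a separator -->
			<xs:element name="tail" type="TailType-perstempo" />
		</xs:sequence>
	</xs:complexType>
</xs:element>
```

If this parses as intended, the infoset would gain a "tail" element recording which terminator (CRLF, LF, or none) ended the file, which unparse could then reproduce. Whether Daffodil 2.4.0 accepts this arrangement together with the expression-based occursCount is an assumption that would need to be checked.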
"Ssn","Asvab Test Date","Gct Total"
"0000000057","2004-01-15 12:00 AM","124"
"0000000174","1996-10-08 12:00 AM","105"
"0000700161","2008-06-30 12:00 AM","136"
"0130000155","1996-08-25 12:00 AM","118"
"0100000175","1993-12-07 12:00 AM","117"
"0000000738","1995-11-07 12:00 AM","125"
"0000000070","","112"
"0000000895","1989-08-20 12:00 AM","108"
"0000000217","1998-09-03 12:00 AM","108"
"0200000961","1994-09-27 12:00 AM","117"
"0330000160","1997-07-01 12:00 AM","132"
"0000001861","2004-06-24 12:00 AM","114"
"0000000596","2003-12-09 12:00 AM","120"
"0000000009","2000-04-20 12:00 AM","99"
"0400000000","2009-08-14 12:00 AM","107"
"0470000000","1994-11-16 12:00 AM","106"
"0400000000","2002-02-19 12:00 AM","111"
"0000000000","","0"
"0400000000","2008-08-15 12:00 AM","123"
"0400000000","2004-09-13 12:00 AM","127"
"0400000006","2000-04-28 12:00 AM","135"
"0500000004","2007-11-06 12:00 AM","125"
"0500000001","2010-02-22 12:00 AM","102"
"0500000001","2005-06-07 12:00 AM","108"
"0500000007","2004-09-30 12:00 AM","116"
"0570000013","1998-11-04 12:00 AM","121"
"0400000084","1998-11-30 12:00 AM","118"
"0500000047","1999-10-25 12:00 AM","102"
"0000000085","","130"
"0500000007","2002-06-10 12:00 AM","111"
"0000000000","2008-02-01 12:00 AM","95"
"0000000000","2004-04-01 12:00 AM","115"
"0100000005","","0"
"0200000005","2009-05-14 12:00 AM","103"
"0009600002","2004-09-13 12:00 AM","138"
"Ssn","Asvab Test Date","Gct Total"
"0000000057","2004-01-15 12:00 AM","124"
"0000000174","1996-10-08 12:00 AM","105"
"0000700161","2008-06-30 12:00 AM","136"
"0130000155","1996-08-25 12:00 AM","118"
"0100000175","1993-12-07 12:00 AM","117"
"0000000738","1995-11-07 12:00 AM","125"
"0000000070","","112"
"0000000895","1989-08-20 12:00 AM","108"
"0000000217","1998-09-03 12:00 AM","108"
"0200000961","1994-09-27 12:00 AM","117"
"0330000160","1997-07-01 12:00 AM","132"
"0000001861","2004-06-24 12:00 AM","114"
"0000000596","2003-12-09 12:00 AM","120"
"0000000009","2000-04-20 12:00 AM","99"
"0400000000","2009-08-14 12:00 AM","107"
"0470000000","1994-11-16 12:00 AM","106"
"0400000000","2002-02-19 12:00 AM","111"
"0000000000","","0"
"0400000000","2008-08-15 12:00 AM","123"
"0400000000","2004-09-13 12:00 AM","127"
"0400000006","2000-04-28 12:00 AM","135"
"0500000004","2007-11-06 12:00 AM","125"
"0500000001","2010-02-22 12:00 AM","102"
"0500000001","2005-06-07 12:00 AM","108"
"0500000007","2004-09-30 12:00 AM","116"
"0570000013","1998-11-04 12:00 AM","121"
"0400000084","1998-11-30 12:00 AM","118"
"0500000047","1999-10-25 12:00 AM","102"
"0000000085","","130"
"0500000007","2002-06-10 12:00 AM","111"
"0000000000","2008-02-01 12:00 AM","95"
"0000000000","2004-04-01 12:00 AM","115"
"0100000005","","0"
"0200000005","2009-05-14 12:00 AM","103"
"0009600002","2004-09-13 12:00 AM","138"
