Re: [fpc-pascal] XMLWrite looses data
On Mon, 24 Mar 2014 20:12:50 + Graeme Geldenhuys wrote:

> [...]
> > The parser converts &#*; to Unicode characters when
> > reading. AFAIR some xsl parsers like xsltproc do the same.
> > If you want xslt to output '&#xa0;' you can use '&amp;#xa0;'
>
> Thanks for that info, it helped find the problem (though no solution
> yet). That character isn't actually a Unicode character, it is simply a
> "no-break space" character at position $A0 in the ASCII chart.

Well, I see that the term "character" is confusing here. It is a Unicode
code point. The &#xa0; is just an XML alias for it. For XML it does not
matter if you write it as a numeric code or encoded in UTF-8/UTF-16.

> Using hex
> value notation, instead of the more popular decimal notation when escaped.
>
> ===[ charmap details ]
> U+00A0 NO-BREAK SPACE
> UTF-8: 0xC2 0xA0
> UTF-16: 0x00A0
>
> C octal escaped UTF-8: \302\240
> XML decimal entity: =
>
> But I now see what happened. When I enabled "show hidden characters"
> like spaces and tabs in my editor, I noticed that the no-break space
> character is still there, but in the resaved output file it is simply
> not escaped any more.

Yes. That's what I meant.

> How is the fcl-xml package supposed to handle escaped characters which
> will form part of the data the XSL will generate? Is fcl-xml supposed to
> write them back as escaped characters, or as a normal un-escaped character?

XML writers can choose. Both forms are valid XML for the given text.

> I tried using the decimal notation too: &#160;
> And that produced the same result as the original.
>
> Note:
> When we process an XML file with our XSL file, we want the resulting
> output to have a no-break character - we don't want to display the text
> '&#xa0;' - which I think is what your suggestion with the &amp; will produce.
>
> To put this in context, in case my original XSL snippet wasn't clear.
> That snippet generates a date string in the format 'dd MMM ' and the
> spaces between those elements are not normal spaces, but no-break
> spaces, so that whole text stays together (and wouldn't wordwrap in the
> middle).
>
> The current resaved XSL file still works, but not being able to
> physically see the no-break space characters could cause us problems
> months down the line when we re-edit those files. Hence the reason they
> were escaped (to make them clearly visible to the developer).

You can use comments.
The current XML writer only escapes '<', '>', '&', #0..#31.
Maybe you want to extend it with an option or hook to escape more
characters. For example all "control characters".

Mattias
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Re: [fpc-pascal] XMLWrite looses data
On 2014-03-24 13:58, Mattias Gaertner wrote:
>
> Yes, XSL is XML.

Thought so - thanks for confirming.

> The parser converts &#*; to Unicode characters when
> reading. AFAIR some xsl parsers like xsltproc do the same.
> If you want xslt to output '&#xa0;' you can use '&amp;#xa0;'

Thanks for that info, it helped find the problem (though no solution
yet). That character isn't actually a Unicode character, it is simply a
"no-break space" character at position $A0 in the ASCII chart. Using hex
value notation, instead of the more popular decimal notation when escaped.

===[ charmap details ]
U+00A0 NO-BREAK SPACE
UTF-8: 0xC2 0xA0
UTF-16: 0x00A0

C octal escaped UTF-8: \302\240
XML decimal entity: =

But I now see what happened. When I enabled "show hidden characters"
like spaces and tabs in my editor, I noticed that the no-break space
character is still there, but in the resaved output file it is simply
not escaped any more.

How is the fcl-xml package supposed to handle escaped characters which
will form part of the data the XSL will generate? Is fcl-xml supposed to
write them back as escaped characters, or as a normal un-escaped character?

I tried using the decimal notation too: &#160;
And that produced the same result as the original.

Note:
When we process an XML file with our XSL file, we want the resulting
output to have a no-break character - we don't want to display the text
'&#xa0;' - which I think is what your suggestion with the &amp; will produce.

To put this in context, in case my original XSL snippet wasn't clear.
That snippet generates a date string in the format 'dd MMM ' and the
spaces between those elements are not normal spaces, but no-break
spaces, so that whole text stays together (and wouldn't wordwrap in the
middle).

The current resaved XSL file still works, but not being able to
physically see the no-break space characters could cause us problems
months down the line when we re-edit those files. Hence the reason they
were escaped (to make them clearly visible to the developer).

Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/
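
The thread also raises the idea of extending the writer with an option or hook that escapes more characters, so that no-break spaces stay visible in the saved file. A minimal sketch of what such a hook could do follows. It is written in Python purely for illustration and is not fcl-xml code: the function name escape_text and the "everything above printable ASCII" rule are assumptions, not anything XMLWrite provides.

```python
import xml.etree.ElementTree as ET


def escape_text(text: str) -> str:
    """Escape XML markup characters and everything above printable ASCII.

    Hypothetical helper: the cut-off at 0x7E is an assumption made for
    this sketch, chosen so that U+00A0 becomes a visible reference.
    """
    pieces = []
    for ch in text:
        if ch == "&":
            pieces.append("&amp;")
        elif ch == "<":
            pieces.append("&lt;")
        elif ch == ">":
            pieces.append("&gt;")
        elif ord(ch) > 0x7E:
            # Hex numeric character reference, e.g. U+00A0 -> &#xA0;
            pieces.append(f"&#x{ord(ch):X};")
        else:
            pieces.append(ch)
    return "".join(pieces)


escaped = escape_text("01\u00a0Jan\u00a02014")
print(escaped)  # 01&#xA0;Jan&#xA0;2014

# Parsing the escaped form gives back the original text unchanged,
# so both spellings describe the same document content.
round_trip = ET.fromstring(f"<date>{escaped}</date>").text
```

Because the parsed text is identical either way, a hook like this changes only how the file looks in an editor, not what an XSLT processor sees.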
Re: [fpc-pascal] XMLWrite looses data
On Sun, 23 Mar 2014 17:58:16 + Graeme Geldenhuys wrote:

> Hi,
>
> I'm loading up an XSL file into a TXMLDocument using XMLRead. Up to this
> point everything seems to be ok, and I can query the DOMNodes without
> problem. If I then save that file out again, using XMLWrite, I noticed
> that some data is lost. :-/
>
> I don't know if this is because the file is an XSL file? Though I thought
> XSL is exactly the same structure as XML - so didn't expect any problems.

Yes, XSL is XML.

> [...]
> Note the two '&#xa0;' escaped characters are lost near the end in the
> newly written file.

The parser converts &#*; to Unicode characters when reading. AFAIR some
xsl parsers like xsltproc do the same.
If you want xslt to output '&#xa0;' you can use '&amp;#xa0;'

Mattias
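
The behaviour described here (numeric character references are resolved on reading, and the writer is free to choose how to spell them on output) can be observed with any conforming XML parser. The sketch below uses Python's standard xml.etree.ElementTree purely as an illustration. It is not fcl-xml, and the element name and date text are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hex and decimal references to U+00A0, as they might appear in an XSL file.
source = "<date>01&#xA0;Jan&#160;2014</date>"
elem = ET.fromstring(source)

# After parsing, both references are the single code point U+00A0.
parsed_text = elem.text

# Written back as Unicode text, the no-break spaces are emitted raw.
as_unicode = ET.tostring(elem, encoding="unicode")

# Written back as US-ASCII, the writer falls back to numeric references.
as_ascii = ET.tostring(elem, encoding="us-ascii").decode("ascii")

# A double-escaped reference parses to literal text, not a no-break space.
literal_text = ET.fromstring("<date>01&amp;#160;Jan</date>").text

print(repr(parsed_text))
print(as_ascii)
print(literal_text)
```

The last case shows why escaping the ampersand yields the visible characters of the reference in the parsed text rather than a no-break space.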
Re: [fpc-pascal] XMLWrite looses data
On Sun, Mar 23, 2014 at 2:58 PM, Graeme Geldenhuys wrote:

> I'm using FPC 2.6.2 under 64-bit FreeBSD, but will be compiling this
> application for Windows 32-bit and 64-bit tomorrow at work.

If you can, also try the "Laz2_" XML units: Laz2_DOM, Laz2_XMLWrite and
Laz2_XMLRead. They seem to work better with Unicode, or UTF-8 at least.

> Any idea what is causing this? A bug, because I'm using XSL or anything
> else maybe?

No idea, but maybe changing some parser option can help. The validating
example shows how to change options:
http://wiki.freepascal.org/XML_Tutorial#Validating_a_document
[fpc-pascal] XMLWrite looses data
Hi,

I'm loading up an XSL file into a TXMLDocument using XMLRead. Up to this
point everything seems to be ok, and I can query the DOMNodes without
problem. If I then save that file out again, using XMLWrite, I noticed
that some data is lost. :-/

I don't know if this is because the file is an XSL file? Though I thought
XSL is exactly the same structure as XML - so didn't expect any problems.

Anyway, here is a sample area in the XSL file that loses data.

Before:

8<-8<-8<-8<-8<
01 02 03 04 05 06 07 08 09 10 11 12
8<-8<-8<-8<-8<

After the save:

8<-8<-8<-8<-8<
01 02 03 04 05 06 07 08 09 10 11 12
8<-8<-8<-8<-8<

Note the two '&#xa0;' escaped characters are lost near the end in the
newly written file.

I'm using FPC 2.6.2 under 64-bit FreeBSD, but will be compiling this
application for Windows 32-bit and 64-bit tomorrow at work.

Any idea what is causing this? A bug, because I'm using XSL or anything
else maybe? The XSL file passed all validation tools I could throw at
it. We currently use the full XSL file to populate and generate PDF
documents, so I don't believe there are any validation/syntax issues.

In case this is useful, the XSL file starts like this.

8<-8<-8<-8<-8<
http://www.w3.org/1999/XSL/Transform" version="1.0">
...snip...
8<-8<-8<-8<-8<

Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/