Re: [fpc-pascal] XMLWrite looses data

2014-03-24 Thread Mattias Gaertner
On Mon, 24 Mar 2014 20:12:50 +
Graeme Geldenhuys  wrote:

>[...]
> > The parser converts &#*; to Unicode characters when
> > reading. AFAIR some xsl parsers like xsltproc do the same.
> > If you want xslt to output '&#xA0;' you can use '&amp;#xA0;'
> 
> Thanks for that info, it helped find the problem (though no solution
> yet). The character isn't actually a Unicode character, it is simply a
> "no-break space" character at position $A0 in the ASCII chart.

Well, I see that the term "character" is confusing here.
It is a Unicode code point. The &#xA0; is just an XML alias for it. For XML
it does not matter whether you write it as a character reference or
encoded directly in UTF-8/UTF-16.
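
The point above can be checked with any conforming XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` rather than fcl-xml, so it only illustrates the XML rule and says nothing about fcl-xml internals. The element name `d` and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

# A character reference and the raw character parse to the same text.
escaped = ET.fromstring("<d>24&#xA0;Mar</d>")
raw = ET.fromstring("<d>24\u00a0Mar</d>")
same_text = escaped.text == raw.text

# Escaping the ampersand keeps the reference as literal text instead.
literal = ET.fromstring("<d>24&amp;#xA0;Mar</d>")
literal_text = literal.text

print(same_text)     # True
print(literal_text)  # 24&#xA0;Mar
```

After parsing, the DOM no longer knows which spelling the source file used, which is why a writer cannot restore the original escape on its own.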


> Using hex
> value notation, instead of the more popular decimal notation when escaped.
> 
> ===[ charmap details ]
> U+00A0 NO-BREAK SPACE
> UTF-8: 0xC2 0xA0
> UTF-16: 0x00A0
> 
> C octal escaped UTF-8: \302\240
> XML decimal entity: &#160;
> =
> 
> But I now see what happened. When I enabled "show hidden characters"
> like spaces and tabs in my editor, I noticed that the no-break space
> character is still there, but in the resaved output file it is simply
> not escaped any more.

Yes. That's what I meant.

 
> How is the fcl-xml package supposed to handle escaped characters which
> will form part of the data the XSL will generate? Is fcl-xml supposed to
> write them back as escaped characters, or as a normal unescaped character?

XML writers can choose. Both forms are valid XML for the given text.
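
As an illustration of that freedom, here is a small sketch with Python's standard `xml.etree.ElementTree` (not fcl-xml; the element name `d` is an assumption for the example). The same in-memory text is written once as a raw character and once as a numeric character reference, and both results parse back to identical text.

```python
import xml.etree.ElementTree as ET

elem = ET.Element("d")
elem.text = "24\u00a0Mar"  # contains U+00A0 NO-BREAK SPACE

# Serialising to a Unicode string leaves the character as-is ...
as_unicode = ET.tostring(elem, encoding="unicode")
# ... while an ASCII target forces a numeric character reference.
as_ascii = ET.tostring(elem, encoding="us-ascii")

# Both serialisations parse back to the identical text.
round_trip_equal = (
    ET.fromstring(as_unicode).text == ET.fromstring(as_ascii).text == elem.text
)
print(round_trip_equal)  # True
```

Which form a given writer emits depends on the writer and the chosen output encoding, not on the document content.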

 
> I tried using the decimal notation too: &#160;
> And that produced the same result as the original.
> 
> Note:
> When we process an XML file with our XSL file, we want the resulting
> output to have a no-break character - we don't want to display the text
> '&#xA0;' - which I think is what your suggestion with the &amp; will produce.
> 
> To put this in context, in case my original XSL snippet wasn't clear.
> That snippet generates a date string in the format 'dd MMM ' and the
> spaces between those elements are not normal spaces, but no-break
> spaces, so that whole text stays together (and wouldn't wordwrap in the
> middle).
> 
> 
> The current resaved XSL file still works, but not being able to
> physically see the no-break space characters could cause us problems
> months down the line when we re-edit those files. Hence the reason they
> were escaped (to make them clearly visible to the developer).

You can use comments.

The current XML writer only escapes '<', '>', '&', #0..#31.
Maybe you want to extend it with an option or hook to escape more
characters. For example all "control characters".
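
A rough sketch of what such a hook could look like, written in Python purely for illustration. The function name `escape_xml_text` and its `extra` parameter are hypothetical and are not part of fcl-xml or any existing XMLWrite API. The baseline set mirrors the characters listed above, and the hook decides which further characters become numeric references.

```python
def escape_xml_text(text, extra=lambda ch: ord(ch) > 0x7E):
    """Escape markup characters, plus any character the `extra` hook selects.

    Hypothetical helper: by default everything above '~' is escaped,
    so a no-break space stays visible in the saved file.
    """
    named = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
    out = []
    for ch in text:
        if ch in named:
            out.append(named[ch])
        elif ord(ch) < 0x20 or extra(ch):
            # Baseline #0..#31 as described in the thread, plus hook matches.
            out.append("&#x%X;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


print(escape_xml_text("24\u00a0Mar <2014>"))  # 24&#xA0;Mar &lt;2014&gt;
```

Passing a different predicate, for example one that only matches U+00A0, would keep other non-ASCII text readable while still making the no-break spaces visible.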

Mattias
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] XMLWrite looses data

2014-03-24 Thread Graeme Geldenhuys
On 2014-03-24 13:58, Mattias Gaertner wrote:
> 
> Yes, XSL is XML.

Thought so - thanks for confirming.


> The parser converts &#*; to Unicode characters when
> reading. AFAIR some xsl parsers like xsltproc do the same.
> If you want xslt to output '&#xA0;' you can use '&amp;#xA0;'

Thanks for that info, it helped find the problem (though no solution
yet). The character isn't actually a Unicode character, it is simply a
"no-break space" character at position $A0 in the ASCII chart. Using hex
value notation, instead of the more popular decimal notation when escaped.

===[ charmap details ]
U+00A0 NO-BREAK SPACE
UTF-8: 0xC2 0xA0
UTF-16: 0x00A0

C octal escaped UTF-8: \302\240
XML decimal entity: &#160;
=

But I now see what happened. When I enabled "show hidden characters"
like spaces and tabs in my editor, I noticed that the no-break space
character is still there, but in the resaved output file it is simply
not escaped any more.

How is the fcl-xml package supposed to handle escaped characters which
will form part of the data the XSL will generate? Is fcl-xml supposed to
write them back as escaped characters, or as a normal unescaped character?

I tried using the decimal notation too: &#160;
And that produced the same result as the original.

Note:
When we process an XML file with our XSL file, we want the resulting
output to have a no-break character - we don't want to display the text
'&#xA0;' - which I think is what your suggestion with the &amp; will produce.

To put this in context, in case my original XSL snippet wasn't clear.
That snippet generates a date string in the format 'dd MMM ' and the
spaces between those elements are not normal spaces, but no-break
spaces, so that whole text stays together (and wouldn't wordwrap in the
middle).


The current resaved XSL file still works, but not being able to
physically see the no-break space characters could cause us problems
months down the line when we re-edit those files. Hence the reason they
were escaped (to make them clearly visible to the developer).


Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/


Re: [fpc-pascal] XMLWrite looses data

2014-03-24 Thread Mattias Gaertner
On Sun, 23 Mar 2014 17:58:16 +
Graeme Geldenhuys  wrote:

> Hi,
> 
> I'm loading up an XSL file into a TXMLDocument using XMLRead. Up to this
> point everything seems to be ok, and I can query the DOMNodes without
> problem. If I then save that file out again, using XMLWrite, I noticed
> that some data is lost. :-/
> 
> I don't know if this is because the file is an XSL file? Though I thought
> XSL is exactly the same structure as XML - so didn't expect any problems.

Yes, XSL is XML.

> [...]
> Note the two '&#xA0;' escaped characters are lost near the end in the
> newly written file.

The parser converts &#*; to Unicode characters when
reading. AFAIR some xsl parsers like xsltproc do the same.
If you want xslt to output '&#xA0;' you can use '&amp;#xA0;'


Mattias


Re: [fpc-pascal] XMLWrite looses data

2014-03-24 Thread Daniel Gaspary
On Sun, Mar 23, 2014 at 2:58 PM, Graeme Geldenhuys
 wrote:
> I'm using FPC 2.6.2 under 64-bit FreeBSD, but will be compiling this
> application for Windows 32-bit and 64-bit tomorrow at work.

If you can, try also the "Laz2_" XML units: Laz2_Dom, Laz2_xmlwrite and read.

They seem to work better with unicode or utf8 at least.

> Any idea what is causing this? A bug, because I'm using XSL or anything
> else maybe?

No idea, but maybe changing some parser option can help. The
Validating example shows how to change options:

http://wiki.freepascal.org/XML_Tutorial#Validating_a_document


[fpc-pascal] XMLWrite looses data

2014-03-24 Thread Graeme Geldenhuys
Hi,

I'm loading up an XSL file into a TXMLDocument using XMLRead. Up to this
point everything seems to be ok, and I can query the DOMNodes without
problem. If I then save that file out again, using XMLWrite, I noticed
that some data is lost. :-/

I don't know if this is because the file is an XSL file? Though I thought
XSL is exactly the same structure as XML - so didn't expect any problems.

Anyway, here is a sample area in the XSL file that loses data.

Before:
8<-8<-8<-8<-8<



   
  
 
 

   01
   02
   03
   04
   05
   06
   07
   08
   09
   10
   11
   12

 
   
  
   

8<-8<-8<-8<-8<


After the save:
8<-8<-8<-8<-8<
  
  


  
  

  
01
02
03
04
05
06
07
08
09
10
11
12
  

 
 

  

  
8<-8<-8<-8<-8<


Note the two '&#xA0;' escaped characters are lost near the end in the
newly written file.

I'm using FPC 2.6.2 under 64-bit FreeBSD, but will be compiling this
application for Windows 32-bit and 64-bit tomorrow at work.

Any idea what is causing this? A bug, because I'm using XSL or anything
else maybe?

The XSL file passed all validation tools I could throw at it. We
currently use the full XSL file to populate and generate PDF documents,
so I don't believe there are any validation/syntax issues.


In case this is useful, the XSL file starts like this.

8<-8<-8<-8<-8<

http://www.w3.org/1999/XSL/Transform"
version="1.0">



...snip...
8<-8<-8<-8<-8<

Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/