RE: System upgrade and now Cocoon is escaping tabs/entities.
Chris,

So it turned out that updating Xalan fixed the problem completely. We went with Xalan 2.7.1 (which includes Xerces 2.9.0). We replaced 'xercesImpl.jar' and 'xml-apis.jar' in Tomcat's endorsed folder, and in our lib folder we replaced 'xalan-2.6.1-dev-20041008T0304.jar' with 'xalan.jar' from 2.7.1 and added 'serializer.jar'.

We restarted Tomcat and the problem went away, and nothing else on the site was affected. In fact, it seems a little faster now. :)

So now we're running fine on CentOS 5, JDK 1.6.21 and Tomcat 5.0.28.

- J

> Date: Wed, 29 Sep 2010 09:41:55 -0400
> From: ch...@christopherschultz.net
> To: users@cocoon.apache.org
> Subject: Re: System upgrade and now Cocoon is escaping tabs/entities.
>
> [snip]

To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org
For additional commands, e-mail: users-h...@cocoon.apache.org
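After a jar swap like the one described above, it can help to confirm which XSLT and parser implementations the JVM actually picks up. The following is a minimal sketch using only standard JAXP calls; the class name `XmlImplInfo` and the helper `describe` are hypothetical, and the reported version may be null if the jar's manifest does not declare one.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;

public class XmlImplInfo {

    /** Describes a class: its name, manifest version (may be null), and where it was loaded from. */
    static String describe(Class<?> cls) {
        Package pkg = cls.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes bundled with the JDK often have no code source location.
        String location = (src == null || src.getLocation() == null)
                ? "(bundled with the JDK or unknown)"
                : src.getLocation().toString();
        return cls.getName() + " | version=" + version + " | from=" + location;
    }

    public static void main(String[] args) {
        // The concrete classes returned here depend on the classpath and endorsed jars.
        System.out.println(describe(TransformerFactory.newInstance().getClass()));
        System.out.println(describe(DocumentBuilderFactory.newInstance().getClass()));
    }
}
```

Run inside the webapp (for example from a JSP or servlet), the "from" value should point at the jar in the lib or endorsed folder if that copy is the one in use; run standalone, it will usually report the JDK's built-in implementation.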
Re: System upgrade and now Cocoon is escaping tabs/entities.
J,

On 9/29/2010 1:10 AM, . . wrote:
>> #a9 should be a copyright symbol if you're using ASCII. I suspect that #a9 is being used instead of a newline (0xa) followed by a tab (0x9).
>
> Actually it was a typo on my part. It's using &#9; :( *oops*

Yeah, that makes a ton of difference. I'm glad it wasn't 0xa9, 'cause that would have been a real mess. :)

>> [file.encoding] is likely to solve both of your problems.
>
> I wrote a little JSP page to spit out the System.getProperty("file.encoding") value and got some surprising results. I tried two of the existing machines and got ISO-8859-1 for one and ANSI_X3.4-1968 for the other.

ANSI_X3.4-1968, as you probably found out, is essentially basic ASCII, and ISO-8859-1 is ASCII plus a few other things, so they are compatible. It's not surprising that these two character sets are both working: if one works, the other has a good chance of working.

> The application runs fine on both of them. On the new server that too is giving out ISO-8859-1.

Interesting.

> That said, we did an experiment last night and copied the entire previous Tomcat folder over to the new CentOS server and ran it with Sun JDK 1.4.29 - the problem disappeared. When we ran it with JDK 1.5 or 1.6 the problem manifested itself. So the problem appears to be related to the JDK in some way. Googling I came up with this:
>
> http://stackoverflow.com/questions/1059854/how-do-you-prevent-a-javax-transformer-from-escaping-whitespace
>
> Which makes me wonder if the old Xalan from our previous Tomcat is having issues with JDK 1.5 and up. I guess an Xalan upgrade is in order.

Cocoon packages its own Xalan library, so that shouldn't be the problem, although I can't remember when Sun started packaging Xalan with Java. At some point, I think they even removed it.

What version of Xalan are you running? It should be in your webapp's WEB-INF/lib directory. I don't think there's been a Xalan update in quite a few years.

Let us know how things turn out.

>> NB: Tomcat 5.0 has been retired and really should be replaced. Upgrading to Tomcat 6.0 shouldn't be too much trouble.
>
> Only issue there is we have to support this legacy application for another 12 months and it's a hand-me-down so we have little or no source code or documentation. Porting it now would take up more time/effort than is financially viable right now :(

Technically speaking, servlet containers are supposed to be backward compatible. I wouldn't be surprised if, given a review of your Context element for Tomcat (it should now go into META-INF/context.xml in your webapp, instead of conf/server.xml for the server), everything else works exactly as it did before.

-chris
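The compatibility point about ANSI_X3.4-1968 and ISO-8859-1 can be checked directly. This is a minimal sketch, assuming the two charset names reported in the thread; the class name `CharsetCompat` and the helper `sameBytes` are hypothetical. It compares the bytes each charset produces for plain ASCII text (including a tab) and for the copyright sign, which lies outside ASCII.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class CharsetCompat {

    /** Returns true if both charsets encode the text to identical bytes. */
    static boolean sameBytes(String text, Charset a, Charset b) {
        return Arrays.equals(text.getBytes(a), text.getBytes(b));
    }

    public static void main(String[] args) {
        // Java resolves ANSI_X3.4-1968 as an alias of its US-ASCII charset.
        Charset ansi = Charset.forName("ANSI_X3.4-1968");
        Charset latin1 = Charset.forName("ISO-8859-1");

        System.out.println("canonical name: " + ansi.name());
        // Tabs, newlines and markup characters are all in the shared ASCII range.
        System.out.println("ASCII text, same bytes: " + sameBytes("\t\n<a/>", ansi, latin1));
        // U+00A9 (copyright sign) exists in ISO-8859-1 but not in ASCII.
        System.out.println("copyright sign, same bytes: " + sameBytes("\u00A9", ansi, latin1));
    }
}
```

Since a tab is encoded identically under both charsets, a difference in file.encoding between these two values would not by itself explain tabs being written as character references.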
Re: System upgrade and now Cocoon is escaping tabs/entities.
J,

On 9/28/2010 10:09 AM, . . wrote:
> Our original application components were:
>
> NetBSD 3.0.3 with Suse 9.x Linux compatibility layer
> Sun JDK 1.4.26
> Tomcat 5.0.23
> Cocoon 2.1.6
>
> As part of the upgrade we switched to:
>
> CentOS 5.3
> Sun JDK 1.6.21
> Tomcat 5.0.30
> Cocoon 2.1.6

[snip]

> Firstly, if any of our source XML/XSL files use tabs to indent the nodes, the outputted source escapes them as &#A9; which it didn't do before. This isn't a problem for output to be displayed in a browser, but we have a number of legacy Flash components which, annoyingly, don't recognise this as whitespace and refuse to load, causing the Flash component to fail.

#a9 should be a copyright symbol if you're using ASCII. I suspect that #a9 is being used instead of a newline (0xa) followed by a tab (0x9).

My guess is that your JVM's file.encoding system property used to be something like ISO-8859-1 or UTF-8 and now it's been changed to something more exotic, perhaps even mandating 16-bit characters (though your pages would be horribly jumbled if everything were interpreted as 16-bit characters).

Check the file.encoding of your JVM in the old, working system relative to the new, broken one. Also, check to make sure that your XML files have the encoding set in the <?xml?> processing instruction, and that the encoding actually matches what you used when you wrote the file to disk. Finally, check to see if you have BOM characters at the start of your XML files.

This is likely to solve both of your problems.

NB: Tomcat 5.0 has been retired and really should be replaced. Upgrading to Tomcat 6.0 shouldn't be too much trouble.

-chris
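The BOM check suggested above can be done by looking at the first few bytes of each file. The following is a minimal sketch; the class name `BomCheck` and the helper `detectBom` are hypothetical, and it only distinguishes the UTF-8 and UTF-16 byte order marks (a UTF-32 little-endian BOM would be reported as UTF-16LE here).

```java
import java.nio.charset.StandardCharsets;

public class BomCheck {

    /** Returns the encoding implied by a leading byte order mark, or null if none is present. */
    static String detectBom(byte[] data) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFE && (data[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    public static void main(String[] args) {
        // In-memory stand-ins for the first bytes of two XML files.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', '?', 'x', 'm', 'l'};
        byte[] withoutBom = "<?xml version=\"1.0\"?>".getBytes(StandardCharsets.US_ASCII);

        System.out.println("first sample:  " + detectBom(withBom));
        System.out.println("second sample: " + detectBom(withoutBom));
    }
}
```

To check real files, the byte arrays could be replaced with the leading bytes read from each XML or XSL source on disk.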
RE: System upgrade and now Cocoon is escaping tabs/entities.
> #a9 should be a copyright symbol if you're using ASCII. I suspect that #a9 is being used instead of a newline (0xa) followed by a tab (0x9).

Actually it was a typo on my part. It's using &#9; :( *oops*

> My guess is that your JVM's file.encoding system property used to be something like ISO-8859-1 or UTF-8 and now it's been changed to something more exotic, perhaps even mandating 16-bit characters (though your pages would be horribly jumbled if everything were interpreted as 16-bit characters).
>
> Check the file.encoding of your JVM in the old, working system relative to the new, broken one. Also, check to make sure that your XML files have the encoding set in the <?xml?> processing instruction, and that the encoding actually matches what you used when you wrote the file to disk. Finally, check to see if you have BOM characters at the start of your XML files.
>
> This is likely to solve both of your problems.

I wrote a little JSP page to spit out the System.getProperty("file.encoding") value and got some surprising results. I tried two of the existing machines and got ISO-8859-1 for one and ANSI_X3.4-1968 for the other. The application runs fine on both of them. On the new server that too is giving out ISO-8859-1.

That said, we did an experiment last night and copied the entire previous Tomcat folder over to the new CentOS server and ran it with Sun JDK 1.4.29 - the problem disappeared. When we ran it with JDK 1.5 or 1.6 the problem manifested itself. So the problem appears to be related to the JDK in some way. Googling I came up with this:

http://stackoverflow.com/questions/1059854/how-do-you-prevent-a-javax-transformer-from-escaping-whitespace

Which makes me wonder if the old Xalan from our previous Tomcat is having issues with JDK 1.5 and up. I guess an Xalan upgrade is in order.

> NB: Tomcat 5.0 has been retired and really should be replaced. Upgrading to Tomcat 6.0 shouldn't be too much trouble.

Only issue there is we have to support this legacy application for another 12 months and it's a hand-me-down so we have little or no source code or documentation. Porting it now would take up more time/effort than is financially viable right now :(

- J
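One way to test the suspicion that the XSLT implementation, rather than the encoding, is responsible is to run a tab through an identity transform and look at the serialized result. This is a minimal sketch using only standard JAXP calls; the class name `TabEscapeCheck` and the helper `identity` are hypothetical, and whether the tab comes out as a literal character or as `&#9;` depends on which transformer implementation is on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TabEscapeCheck {

    /** Runs an identity transform over the given XML and returns the serialized output. */
    static String identity(String xml) {
        try {
            // newTransformer() with no stylesheet performs an identity transformation.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("identity transform failed", e);
        }
    }

    public static void main(String[] args) {
        String out = identity("<root>\n\t<child>text</child>\n</root>");
        System.out.println(out);
        // True would reproduce the symptom described in this thread.
        System.out.println("tab written as character reference: " + out.contains("&#9;"));
    }
}
```

Running the same class under each JDK and Xalan jar combination would show whether the escaping follows the transformer implementation.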