OK! I've finally gotten this thing working in Cocoon 2.0.3!!! Thanks for your assistance (and patience), Vadim.

So now I can have a document with XML Decln:

  <?xml version="1.0" encoding="MacRoman"?>

and Cocoon will now read it, and interpret the characters properly.

I modified the Avalon Excalibur JaxpParser.java from April 2002 (v1.1
in CVS) and it overrides the JaxpParser.class in
avalon-excalibur-vm12-20020705.jar which ships with Cocoon 2.0.3.

Here is a diff... note there are some debug lines that need removing,
and perhaps it should leave allow-java-encodings as "false" by
default, not sure... anyway, here it is.

Perhaps this change could be incorporated into Excalibur for the
benefit of anyone else who wants to parse "weird" character
encodings???

Cheers

Jesse


[localhost:avalon/excalibur/xml] root# diff -b -B -c JaxpParser-apr02.java JaxpParser.java
*** JaxpParser-apr02.java       Sat Aug 17 23:14:51 2002
--- JaxpParser.java     Sat Aug 17 23:23:20 2002
***************
*** 1,3 ****
--- 1,10 ----
+ 
+ /*
+  * Downloaded from http://cvs.apache.org/viewcvs.cgi/-checkout-/jakarta-avalon-excalibur/xmlbundle/src/java/org/apache/avalon/excalibur/xml/Attic/JaxpParser.java?rev=1.1&search=None&hideattic=1
+  * at 23:15 17/8/2002 Adelaide Time by Jesse
+  */
+ 
+ 
  /*
   * Copyright (C) The Apache Software Foundation. All rights reserved.
   *
***************
*** 5,10 ****
--- 12,18 ----
   * version 1.1, a copy of which has been included  with this distribution in
   * the LICENSE.txt file.
   */
+ 
  package org.apache.avalon.excalibur.xml;
  
  import java.io.IOException;
***************
*** 108,113 ****
--- 116,124 ----
      /** do we stop on recoverable errors ? */
      protected boolean stopOnRecoverableError;
  
+     /** do we want to allow all possible text encodings recognised by current JVM? */
+     protected boolean allowJavaEncodings;
+ 
      /**
       * Get the Entity Resolver from the component manager
       */
***************
*** 131,136 ****
--- 142,148 ----
      public void parameterize( Parameters params )
          throws ParameterException
      {
+         System.out.println("Jesse Debug: in avalon JaxpParser.java, class JaxpParser.parameterize");
          // Validation and namespace prefixes parameters
          boolean validate = params.getParameterAsBoolean( "validate", false );
          this.nsPrefixes = params.getParameterAsBoolean( "namespace-prefixes", false );
***************
*** 182,187 ****
--- 194,206 ----
          this.docFactory.setNamespaceAware( true );
          this.docFactory.setValidating( validate );
  
+         // Pick up "allow-java-encodings" to allow the use of additional
+         // character encodings supported by current JVM (eg "MacRoman")
+         // Jesse Reynolds 2002.08.10
+         this.allowJavaEncodings = params.getParameterAsBoolean("allow-java-encodings", true);
+         System.out.println( "JESSE DEBUG: allow-java-encodings has been set to: " + this.allowJavaEncodings );
+ 
+ 
          if( this.getLogger().isDebugEnabled() )
          {
              this.getLogger().debug( "JaxpParser: validating: " + validate +
***************
*** 190,196 ****
                  ", stop on warning: " + this.stopOnWarning +
                  ", stop on recoverable-error: " + this.stopOnRecoverableError +
                  ", saxParserFactory: " + saxParserFactoryName +
!                 ", documentBuilderFactory: " + documentBuilderFactoryName );
          }
      }
  
--- 209,216 ----
                  ", stop on warning: " + this.stopOnWarning +
                  ", stop on recoverable-error: " + this.stopOnRecoverableError +
                  ", saxParserFactory: " + saxParserFactoryName +
!                 ", documentBuilderFactory: " + documentBuilderFactoryName +
!                 ", allow-java-encodings: " + this.allowJavaEncodings );
          }
      }
  
***************
*** 257,262 ****
--- 277,291 ----
          {
              this.getLogger().warn( "SAX2 driver does not support property: " +
                                     "'http://xml.org/sax/properties/lexical-handler'" );
+         }
+ 
+         if (this.allowJavaEncodings) {
+             try {
+                 tmpReader.setFeature("http://apache.org/xml/features/allow-java-encodings", true);
+             } catch (SAXException e) {
+                 this.getLogger().warn("SAX2 driver does not support feature: 'allow-java-encodings' "+
+                     "('http://apache.org/xml/features/allow-java-encodings')");
+             }
          }
  
          tmpReader.setErrorHandler( this );
[localhost:avalon/excalibur/xml] root#


At 0:21 -0400 17/8/2002, Vadim Gritsenko wrote:
>Jesse Reynolds wrote:
>
>...
>
>>OK! Now we're getting somewhere. All my debug output is now
>>happening, YAY, but now cocoon is not initialising. Hmmmm. I must
>>have stuffed up the code I guess, I will review...
>>
>>What version of Excalibur is shipped with Cocoon 2.0.2???
>
>
>See jar name. Usually it is some dated version (use cvs co -D...)
>Current 2.0.4-dev is: avalon-excalibur-vm12-20020705.jar.
>
>>From core.log:
>>
>>java.lang.NullPointerException
>>      at
>>org.apache.avalon.excalibur.xml.JaxpParser.parse(JaxpParser.java:287)
>>      at
>>org.apache.avalon.excalibur.xml.JaxpParser.parse(JaxpParser.java:246)
>
>
>Seems like NPE in your version of parser. Take a version matching
>with your Cocoon.
>
>Vadim
>
>
>
>---------------------------------------------------------------------
>Please check that your question has not already been answered in the
>FAQ before posting.
><http://xml.apache.org/cocoon/faq/index.html>
>
>To unsubscribe, e-mail:     <[EMAIL PROTECTED]>
>For additional commands, e-mail:   <[EMAIL PROTECTED]>

-- 
Jesse Reynolds - Virtual Artists Pty Ltd - http://www.va.com.au
  Email: jesse (at) va.com.au     > Website Development
  Phone: +61 (0)8 8223 2288       > Web & Email Hosting
  Web:   http://jesse.va.com.au   > Streaming Media Hosting
                                  > Telehousing / Colocation
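
P.S. For anyone who wants to try the feature outside Cocoon, here is a
rough standalone sketch of the same setFeature call as in the diff
above. It is not the Excalibur code. It assumes the JAXP parser on the
classpath is Xerces-based (other parsers may not recognise the feature,
in which case the helper just returns false) and that the JVM provides
a "MacRoman" charset. The class and method names
(AllowJavaEncodingsDemo, enableJavaEncodings, parseText) are made up
for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Excalibur or Cocoon.
class AllowJavaEncodingsDemo {

    // Xerces-specific feature URI, the same one used in the diff above.
    static final String FEATURE =
        "http://apache.org/xml/features/allow-java-encodings";

    /** Tries to switch the feature on; returns false if the parser refuses it. */
    static boolean enableJavaEncodings(XMLReader reader) {
        try {
            reader.setFeature(FEATURE, true);
            return true;
        } catch (SAXException e) {
            // SAXNotRecognizedException and SAXNotSupportedException land here.
            return false;
        }
    }

    /** Parses the bytes and returns all character data found in the document. */
    static String parseText(byte[] xml, boolean allowJavaEncodings) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        if (allowJavaEncodings) {
            enableJavaEncodings(reader);
        }
        final StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new ByteArrayInputStream(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sanity check with an encoding every XML parser must support.
        String utf8Doc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>hi</a>";
        System.out.println(parseText(utf8Doc.getBytes(StandardCharsets.UTF_8), true));

        // The MacRoman case; whether it parses depends on the JVM and parser.
        if (!Charset.isSupported("MacRoman")) {
            System.out.println("This JVM has no MacRoman charset; skipping.");
            return;
        }
        String macDoc = "<?xml version=\"1.0\" encoding=\"MacRoman\"?><a>caf\u00e9</a>";
        try {
            System.out.println(parseText(macDoc.getBytes(Charset.forName("MacRoman")), true));
        } catch (Exception e) {
            System.out.println("Parser rejected the MacRoman document: " + e.getMessage());
        }
    }
}
```

The UTF-8 document should parse on any JAXP parser. What the MacRoman
part prints depends on the parser and JVM in use, so treat it as
something to experiment with rather than a guaranteed result.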