Re: [PATCH] '8859_1' is not a valid charset alias
On Fri, May 18, 2001 at 12:40:04PM -0700, Forrest R. Girouard wrote:

> It is my understanding that '8859_1' is an alias for a Java encoding
> which maps to the 'ISO-8859-1' character set. The Java encoding and
> the character set name are not always the same.
>
> Furthermore, while it's not readily apparent, using 'ISO8859_1' for
> the Java encoding is far preferable to using '8859_1' (or anything
> else) under Java 2. Look at the private getBTCConverter() method in
> the String.java source and note the use of the following:
>
>     !encoding.equals(btc.getCharacterEncoding())
>
> The ByteToCharConverter instance for ISO-8859-1 always returns
> 'ISO8859_1' for the getCharacterEncoding() method and this means that
> while other names may work the ThreadLocal caching will be subverted.
> Since the ByteToCharConverter.getConverter() method involves
> synchronization it is not a good thing to subvert the ThreadLocal cache.

Thanks for pointing this out. AFAICS, the use of 'iso-8859-1' instead of '8859_1' (my patch) does not make this situation any better or worse in the tomcat code. <g>

The tomcat 3.x code doesn't look like it takes this into account at all. I wonder if looking up the Java Encoding name associated with the encoding name supplied by user-agents etc. is an optimisation worth making. I'll look into that.

Vince.
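The lookup Vince mentions could be sketched roughly as follows. This is only an illustration, not Tomcat code: it relies on java.nio.charset.Charset, which arrived in a later JDK than the ones discussed in this thread, and the class name CharsetNameCache and its method are made up for the example. Note also that Charset.name() returns the canonical charset name (for example ISO-8859-1), which is not necessarily the historical converter name ISO8859_1 that Forrest refers to.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: maps a charset label sent by a user-agent to one
// canonical name, so that downstream caches always see the same key.
final class CharsetNameCache {
    // Only labels that resolved successfully are stored here.
    private static final Map<String, String> CANONICAL = new ConcurrentHashMap<>();

    private CharsetNameCache() {}

    // Returns the canonical name for a label, or the fallback if the label
    // is missing, malformed, or not supported by this JVM.
    static String canonicalName(String label, String fallback) {
        if (label == null) {
            return fallback;
        }
        String key = label.trim().toLowerCase(Locale.ROOT);
        String cached = CANONICAL.get(key);
        if (cached != null) {
            return cached;
        }
        String resolved;
        try {
            resolved = Charset.forName(key).name();
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return fallback;
        }
        CANONICAL.put(key, resolved);
        return resolved;
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("iso-8859-1", "ISO-8859-1"));
        System.out.println(canonicalName("latin1", "ISO-8859-1"));
        System.out.println(canonicalName("no-such-charset", "ISO-8859-1"));
    }
}
```

Only successfully resolved labels are cached, so unknown labels sent by clients cannot grow the map.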
servlet upload data corruption (more)
I finally got out from under some work and was able to make some test code. I'm attaching the client and servlet code. The code transfers a couple of parameters, then a binary file (I was using a .jar).

If you call the client with

    BinTestClient localhost something.jar b

it uses a byte-by-byte read on the server to spool the file to a temp file. If you call the client without the 'b', it uses the byte-array read that I was complaining about. Transfer a file, then try

    jar tvf test.jar

to see if it works. I used a jar that contains .jpg images, and when using the byte-array read method, it creates a corrupt jar file. If I apply my fix to the Ajp13ConnectorRequest class, it works fine. (I tried a jar that contained class files and it worked anyway...)

I'd like for someone else to try this out to make sure I didn't screw something up. The code seems pretty simple. I discovered this when using JarIn/OutputStream to transfer data from client to servlet.

David

import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

// args[0] = hostname
// args[1] = jarfile
// args[2] = 'b' for single byte read.
public class BinTestClient {
    public static void main(String [] args) {
        try {
            URL url = new URL("http://"+args[0]+"/examples/BinTest");
            URLConnection connection = (URLConnection)url.openConnection();
            connection.setDoOutput(true);
            connection.setUseCaches(false);
            DataOutputStream output = new DataOutputStream(connection.getOutputStream());
            // Send the file length, then the read mode, then the file body.
            File jarFile = new File(args[1]);
            if (jarFile.exists()) {
                output.writeUTF(""+jarFile.length());
            }
            if (args.length > 2 && args[2] != null && args[2].trim().equals("b"))
                output.writeChar('b');
            else
                output.writeChar(' ');
            InputStream istr = new FileInputStream(jarFile);
            byte [] buf = new byte[8192];
            int count = istr.read(buf);
            while (count != -1) {
                if (count > 0)
                    output.write(buf, 0, count);
                count = istr.read(buf);
            }
            istr.close();
            output.flush();
            output.close();
            istr = connection.getInputStream();
            istr.read();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

import java.io.DataInputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BinTestServlet extends HttpServlet {
    public void doPost (HttpServletRequest request, HttpServletResponse response) {
        try {
            DataInputStream istr = new DataInputStream(request.getInputStream());
            long fileLen = Long.parseLong(istr.readUTF());
            char mode = istr.readChar();
            File tmp = File.createTempFile("test", ".jar");
            OutputStream fstr = new FileOutputStream(tmp);
            if (mode == 'b') {
                System.out.println("Using byte-by-byte read");
                for (int i=0; i<fileLen; i++)
                    fstr.write(istr.read());
            } else {
                System.out.println("Using byte-array read");
                byte [] buf = new byte[8192];
                int count = istr.read(buf);
                while (count != -1) {
                    if (count > 0)
                        fstr.write(buf, 0, count);
                    count = istr.read(buf);
                }
            }
            fstr.flush();
            fstr.close();
            OutputStream ostr = response.getOutputStream();
            ostr.write(1); // positive response
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
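Besides running jar tvf, one way to check whether a transfer arrived intact is to compare a checksum of the original file with a checksum of the spooled temp file. The sketch below is an assumption about how such a check could look and is not part of David's test case; the class name TransferCheck is hypothetical, and CRC-32 is used only because it ships with the JDK.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

// Hypothetical helper for comparing the uploaded file with the original.
final class TransferCheck {
    private TransferCheck() {}

    // Computes the CRC-32 of everything readable from the stream, using
    // the same kind of byte-array read loop as the test servlet.
    static long crc32(InputStream in) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        int count;
        while ((count = in.read(buf)) != -1) {
            crc.update(buf, 0, count);
        }
        return crc.getValue();
    }

    // args[0] = original file, args[1] = file written by the servlet
    public static void main(String[] args) throws IOException {
        long sent;
        long received;
        try (InputStream a = new FileInputStream(args[0])) {
            sent = crc32(a);
        }
        try (InputStream b = new FileInputStream(args[1])) {
            received = crc32(b);
        }
        System.out.println(sent == received ? "files match" : "files differ");
    }
}
```

A matching checksum says the bytes survived the trip; it does not say which read path on the server was exercised, so the mode flag still has to be set as described above.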
Does the beta Tomcat 4 support multiple TLD files in a jar?
I assume it does. If so, what is the correct way to use this functionality? I have been having little luck trying and can't find the answer documented.

Here is a little insight on what I was attempting. The JAR has all of the class files in their correct directories along with a TLD in the META-INF directory named exampleTags.tld. exampleTags.tld has the uri element set with the value /exampleTags.tld.

...
<taglib>
  ...
  <uri>/exampleTags.tld</uri>
</taglib>

I am trying the following with a JSP.

---
<%@ taglib prefix="e" uri="/exampleTags.tld" %>
Here is the example tag output: <e:example />
---

But a servlet exception error message keeps popping up.

org.apache.jasper.JasperException: File /exampleTags.tld not found
...

Anyone had this before? Advice would be appreciated.

Jayson Falkner
V.P./CTO, Amberjack Software LLC
[EMAIL PROTECTED]
www.jspinsider.com
Re: Does the beta Tomcat 4 support multiple TLD files in a jar?
On Sat, 19 May 2001, Jayson Falkner wrote:

> I assume it does. If so what is the correct way to use this
> functionality? I have been having little luck trying and can't find
> the answer documented.
>
> Here is a little insight on what I was attempting. The JAR has all of
> the class files in their correct directories along with a TLD in the
> META-INF directory named exampleTags.tld. exampleTags.tld has the uri
> element set with the value /exampleTags.tld.
>
> ...
> <taglib>
>   ...
>   <uri>/exampleTags.tld</uri>
> </taglib>

Well, first it shouldn't be just plain uri:

<taglib>
  <taglib-uri>/myPRlibrary</taglib-uri>
  <taglib-location>/WEB-INF/tlds/PRlibrary_1_4.tld</taglib-location>
</taglib>

And second, what are you using for the taglib-location? That's how it would locate the .tld file. If you're not using one, perhaps you need to prefix the URI with META-INF/ since all the TLDs are required to be in the META-INF directory of the packaged JAR.

Aaron

> I am trying the following with a JSP.
>
> ---
> <%@ taglib prefix="e" uri="/exampleTags.tld" %>
> Here is the example tag output: <e:example />
> ---
>
> But a servlet exception error message keeps popping up.
>
> org.apache.jasper.JasperException: File /exampleTags.tld not found
> ...
>
> Anyone had this before? Advice would be appreciated.
>
> Jayson Falkner
> V.P./CTO, Amberjack Software LLC
> [EMAIL PROTECTED]
> www.jspinsider.com
The wonderful worlds of encodings...
Hi,

I've got a terrible headache... It happens every time I try to touch the bugs related to encodings - any of them...

I'm sure you already know ( but I just found out ) what surrogate characters are. I know that UTF is _not_ 16 bits, but I had no idea it is 21 bits ( as opposed to UCS - 31 bits ).

I'll try to get something working this weekend. Craig - you may want to take a look, the code in DefaultServlet is creating a writer for each encoding ( that's terribly expensive ), and doesn't seem to deal with surrogates ( well, the second part is not a problem - I doubt someone would use hieroglyphs or musical signs in a URL ).

Now, the biggest problem is, as usual, M$. For some strange reason, MSIE's javascript encode() method is generating % sequences instead of %XX%XX ( as most would expect ). That means the whole decoding might have to be rewritten in 3.3 ( Apache doesn't deal with that either ).

Question: what should happen with the context path ? It is supposed to be returned in the original form ( not decoded ) - but that can't work, as a certain path can be encoded in many ways.

I'm also not sure what should happen in web.xml and in server.xml ( where the path is defined ) - should we use %xx encoded URLs ? But what would that mean for characters that have multiple encodings ?

The solution I have in mind right now is to keep doing all the mappings and process web.xml - and do all internal operations with decoded characters, while keeping the original form for the facade, so servlets get what they expect.

Any ideas ? I'm not sure I can handle this.

Costin
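Two of the points above, surrogates and the fact that one character can have several %xx forms, can be illustrated with a small sketch. It is not taken from the Tomcat sources: the class name EncodingDemo is hypothetical, some of the calls used (codePointCount, the two-argument URLEncoder.encode) come from JDKs newer than the ones current in this thread, and URLEncoder produces form-style escapes, which is close to but not the same as escaping a URL path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustration only: surrogate pairs and charset-dependent %xx escapes.
final class EncodingDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF, written as a UTF-16 surrogate pair.
    static final String G_CLEF = "\uD834\uDD1E";

    // U+00E9 LATIN SMALL LETTER E WITH ACUTE.
    static final String E_ACUTE = "\u00E9";

    private EncodingDemo() {}

    static String escape(String s, String charset) throws UnsupportedEncodingException {
        return URLEncoder.encode(s, charset);
    }

    static String unescape(String s, String charset) throws UnsupportedEncodingException {
        return URLDecoder.decode(s, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // One character for the reader, two 16-bit chars for Java.
        System.out.println(G_CLEF.length());
        System.out.println(G_CLEF.codePointCount(0, G_CLEF.length()));

        // The same character escapes differently depending on the charset.
        System.out.println(escape(E_ACUTE, "UTF-8"));
        System.out.println(escape(E_ACUTE, "ISO-8859-1"));

        // Decoding both forms first makes them comparable again.
        String fromUtf8 = unescape("%C3%A9", "UTF-8");
        String fromLatin1 = unescape("%E9", "ISO-8859-1");
        System.out.println(fromUtf8.equals(fromLatin1));
    }
}
```

This is the reason for doing internal matching on decoded characters: the two escaped strings differ, while the decoded strings are equal.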
Re: servlet upload data corruption (more)
David,

A detailed bug report w/ test case is *great*, but it would also be very, very helpful if you could specify:

1) What version of Tomcat you are running (precisely)
2) What web server you are running, and its version
3) Your OS

-Dan
Re: [VOTE] Final release of Tomcat 3.2.2
+0 (I don't think I'll have time to do any support), but way to go Marc!!!

-Dan

Marc Saegesser wrote:

> The latest beta cycle for Tomcat 3.2.2 has completed with no new bugs
> identified. As the release manager I propose that we release the
> tomcat_32 branch as Tomcat 3.2.2. Please indicate your vote for the
> release using the ballot below.
>
> I will tabulate and post the results of this vote on Friday, May 25.
> At that time, if the vote has passed, I will tag, build and distribute
> the release. The vote must pass by majority approval which means the
> proposal must receive at least three +1 votes and more +1 votes than
> -1 votes.
>
> Marc Saegesser
>
> -
> Vote to release the tomcat_32 branch as Tomcat 3.2.2.
>
> [ ] +1. I agree with the proposal and I will help support the release.
> [ ] +0. I agree with the proposal but I will not be able to help
>         support the release.
> [ ] -0. I don't agree with the proposal but I won't stop the release.
> [ ] -1. I disagree with the proposal and will explain my reasons.

--
Dan Milstein // [EMAIL PROTECTED]
Re: Does the beta Tomcat 4 support multiple TLD files in a jar?
> Well, first it shouldn't be just plain uri:
>
> <taglib>
>   <taglib-uri>/myPRlibrary</taglib-uri>
>   <taglib-location>/WEB-INF/tlds/PRlibrary_1_4.tld</taglib-location>
> </taglib>

Are you referring to an entry in the web.xml file? I was asking about having multiple Tag Library Descriptors in a JAR. According to the JSP 1.2 PFD, anything in the META-INF directory with the .tld extension should get mapped accordingly by the uri attribute. I was snagging the uri element from the JSP 1.2 TLD DTD in the specs.

The idea was to deploy the entire set of tags through a JAR, not by editing web.xml at all. As I understood it, this is possible. With JSP 1.1 you can do the same but may only have one TLD file in the JAR.

Were you addressing this?

Jayson Falkner
V.P./CTO, Amberjack Software LLC
[EMAIL PROTECTED]
www.jspinsider.com
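When debugging a setup like this, it can help to confirm that the JAR really contains the descriptors where they are expected. The sketch below only lists entries under META-INF/ that end in .tld, following Jayson's reading of the JSP 1.2 PFD; it does not show how Jasper in Tomcat 4 resolves taglib URIs, and the class name TldScanner is made up for the example.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: lists the tag library descriptors packaged in a JAR.
final class TldScanner {
    private TldScanner() {}

    // Returns the names of all entries under META-INF/ that end in ".tld".
    static List<String> findTlds(File jar) throws IOException {
        List<String> found = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (!entry.isDirectory()
                        && name.startsWith("META-INF/")
                        && name.endsWith(".tld")) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    // args[0] = path to the tag library JAR
    public static void main(String[] args) throws IOException {
        for (String name : findTlds(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```

For the JAR described in this thread, the expected output would include META-INF/exampleTags.tld; if it does not, the packaging rather than the container is the first thing to check.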
Re: [PATCH] '8859_1' is not a valid charset alias
Vincent, Forrest,

Thanks for the patch review. Could you summarize and/or expand a bit :-) ?

Also, has anyone played with the various browsers ? Is any browser sending the charset encoding ? In what format ?

I know that some browsers are encoding the URL with the same charset that is used in the page, while some are using UTF ( there was a discussion about that somewhere ). Is it true that browsers that are using UTF ( like IE on NT ? ) do send the body as UTF ? Do they set the Charset-Encoding header ?

I would really appreciate some info ( I don't use Windows, and I heard there are differences between IE/Win9x and IE/NT ).

Costin
upload data corruption report
I've been asked to provide more information, so here is a combination of the two messages I posted, with some more commentary and attachments. It pertains to Tomcat-3.2.1 and looks to be the same in 3.2.2.b4. I'm running Apache 1.3.17 on Win 2K Professional. I'm also using mod_jk.

I have some client code that sends a jar file to the servlet. The jar file was getting corrupted. After much digging, I found a CVS commit to Ajp13ConnectorRequest.java that mentioned a problem like this with the doRead() method. It turns out that the same applies to the doRead(byte[], int, int) method. The same problem exists in the Ajp12ConnectionHandler for that byte-array read. Single-byte reads for both protocols work just fine. I'm including the diffs for these classes to show what I'm talking about.

I finally got out from under some work and was able to make some test code. I'm attaching the client and servlet code. The code transfers a couple of parameters, then a binary file (I was using a .jar).

If you call the client with

    BinTestClient localhost something.jar b

it uses a byte-by-byte read on the server to spool the file to a temp file. If you call the client without the 'b', it uses the byte-array read that I was complaining about. Transfer a file, then try

    jar tvf test.jar

to see if it works. I used a jar that contains .jpg images, and when using the byte-array read method, it creates a corrupt jar file. If I apply my fix to the Ajp13ConnectorRequest class, it works fine. (I tried a jar that contained class files and it worked anyway...)

I'd like for someone else to try this out to make sure I didn't screw something up. The code seems pretty simple. I discovered this when using JarIn/OutputStream to transfer data from client to servlet. I've seen this type of thing in Java before when writing code that talks to hardware (such as touchscreen drivers and scanner drivers).
David

Index: Ajp13ConnectorRequest.java
===
RCS file: /home/cvspublic/jakarta-tomcat/src/share/org/apache/tomcat/service/connector/Attic/Ajp13ConnectorRequest.java,v
retrieving revision 1.5.2.7
diff -r1.5.2.7 Ajp13ConnectorRequest.java
274c274,277
<             System.arraycopy(bodyBuff, pos, b, off, c);
---
>             //System.arraycopy(bodyBuff, pos, b, off, c);
>             for (int i=pos, j=off, d=c; d > 0; i++, j++, d--) {
>                 b[j] = (byte)(((char)bodyBuff[i])&0xff);
>             }

What I've done here is to replace the array copy with a loop that does the appropriate data conversion.

Index: Ajp12ConnectionHandler.java
===
RCS file: /home/cvspublic/jakarta-tomcat/src/share/org/apache/tomcat/service/connector/Attic/Ajp12ConnectionHandler.java,v
retrieving revision 1.28.2.4
diff -r1.28.2.4 Ajp12ConnectionHandler.java
542a543,549
>     public int read(byte b[], int off, int len) throws IOException {
>         int ret = super.read(b, off, len);
>         for (int i=0, j=off; i<len; i++, j++) {
>             b[j] = (byte)(((char)b[j])&0xff);
>         }
>         return ret;
>     }

In this case, I overrode the read method to convert the data after calling super.read.
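For readers unfamiliar with why a 0xff mask shows up in stream code at all, here is a small sketch of the general java.io.InputStream contract. It is an illustration under my own assumptions, not the Tomcat connector code, and it neither reproduces nor explains the corruption reported above; the class name BufferStream is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

// Illustration only: a stream backed by a byte buffer, showing where
// masking with 0xff changes the value that a caller sees.
final class BufferStream extends InputStream {
    private final byte[] body;
    private int pos;

    BufferStream(byte[] body) {
        this.body = body.clone();
    }

    // Single-byte read: the contract requires a value in 0..255, or -1 at
    // end of stream. Without the mask, the byte 0xFF would be returned as
    // -1 and would look like end of stream to the caller.
    @Override
    public int read() {
        if (pos >= body.length) {
            return -1;
        }
        return body[pos++] & 0xff;
    }

    // Byte-array read: bytes are copied as they are; the return value is
    // the number of bytes copied, or -1 at end of stream.
    @Override
    public int read(byte[] b, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (pos >= body.length) {
            return -1;
        }
        int c = Math.min(len, body.length - pos);
        System.arraycopy(body, pos, b, off, c);
        pos += c;
        return c;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x50, 0x4b, (byte) 0xff, (byte) 0xd8};
        InputStream in = new BufferStream(data);
        int value;
        while ((value = in.read()) != -1) {
            System.out.println(Integer.toHexString(value));
        }
    }
}
```

Note that callers of the byte-array read must use the returned count rather than the requested length, since fewer bytes than requested may be delivered.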