costin 01/05/26 10:07:31
Added: src/share/org/apache/tomcat/util/buf UDecoder.java
UEncoder.java
Log:
Added ( refactored ) URL encoder and decoder.
The code used to be part of Byte/Char Chunk, but had many bugs and it was hard
to optimize.
Note that we don't implement M$ encoding scheme ( which is not standard and
may cause many problems ), but it could be implemented.
There is still work to be done for decoding char[]: the result of the
conversion is a byte, and it has to be converted ( somehow ) to a char, which
can't be done without a byte-to-char converter.
( This will happen for RequestDispatchers, for example. A workaround is to
not encode extended chars. )
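
A minimal sketch of the in-place decoding described above, and of why a
byte-to-char step is still needed afterwards. This is not the committed
UDecoder code: the class UrlDecodeSketch, its method names, the unchecked
IllegalArgumentException and the choice of UTF-8 for the final conversion
are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Tomcat.
class UrlDecodeSketch {

    /** Returns the value of an ASCII hex digit, or -1 if b is not one. */
    static int hexValue(byte b) {
        if (b >= '0' && b <= '9') return b - '0';
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;
        return -1;
    }

    /**
     * Decodes buf[start, end) in place and returns the new end index.
     * The output never overtakes the input, so no second buffer is needed.
     */
    static int decodeInPlace(byte[] buf, int start, int end) {
        int out = start;
        for (int in = start; in < end; in++, out++) {
            byte b = buf[in];
            if (b == '+') {
                buf[out] = (byte) ' ';
            } else if (b != '%') {
                buf[out] = b;
            } else {
                // a '%' must be followed by two hex digits
                if (in + 2 >= end) {
                    throw new IllegalArgumentException("truncated escape");
                }
                int hi = hexValue(buf[in + 1]);
                int lo = hexValue(buf[in + 2]);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("not a hex digit");
                }
                buf[out] = (byte) ((hi << 4) | lo);
                in += 2; // skip the two digits just consumed
            }
        }
        return out;
    }

    /**
     * The decoder only produces bytes. Turning them into chars needs a
     * charset; UTF-8 is an assumption here, not something the decoder knows.
     */
    static String decode(String encoded) {
        byte[] buf = encoded.getBytes(StandardCharsets.US_ASCII);
        int end = decodeInPlace(buf, 0, buf.length);
        return new String(buf, 0, end, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("a+b%20c")); // prints "a b c"
    }
}
```

The last step is where the char[] problem shows up: "%C3%A9" decodes to
two bytes, and they only become a single char once a byte-to-char
converter with the right charset is applied.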
Revision  Changes    Path
1.1                  jakarta-tomcat/src/share/org/apache/tomcat/util/buf/UDecoder.java
Index: UDecoder.java
===
/*
*
*
* The Apache Software License, Version 1.1
*
* Copyright (c) 1999 The Apache Software Foundation. All rights
* reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
*    notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
*    notice, this list of conditions and the following disclaimer in
*    the documentation and/or other materials provided with the
*    distribution.
*
* 3. The end-user documentation included with the redistribution, if
*    any, must include the following acknowlegement:
*       "This product includes software developed by the
*        Apache Software Foundation (http://www.apache.org/)."
*    Alternately, this acknowlegement may appear in the software itself,
*    if and wherever such third-party acknowlegements normally appear.
*
* 4. The names "The Jakarta Project", "Tomcat", and "Apache Software
*    Foundation" must not be used to endorse or promote products derived
*    from this software without prior written permission. For written
*    permission, please contact [EMAIL PROTECTED]
*
* 5. Products derived from this software may not be called "Apache"
*    nor may "Apache" appear in their names without prior written
*    permission of the Apache Group.
*
* THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
* WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
* ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
* USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
* ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
* OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
* OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
*
* This software consists of voluntary contributions made by many
* individuals on behalf of the Apache Software Foundation. For more
* information on the Apache Software Foundation, please see
* http://www.apache.org/.
*
* [Additional notices, if required by prior licensing conditions]
*
*/
package org.apache.tomcat.util.buf;
import org.apache.tomcat.util.buf.*;
import java.util.BitSet;
import java.io.*;
/**
* All URL decoding happens here. This way we can reuse, review, optimize
* without adding complexity to the buffers.
*
* The conversion will modify the original buffer.
*
* @author Costin Manolache
*/
public final class UDecoder {
public UDecoder()
{
}
/** URLDecode, will modify the source.
*/
public void convert(ByteChunk mb)
throws IOException
{
int start=mb.getOffset();
byte buff[]=mb.getBytes();
int end=mb.getEnd();
int idx= mb.indexOf( buff, start, end, '%' );
int idx2= mb.indexOf( buff, start, end, '+' );
if( idx<0 && idx2<0 ) {
return;
}
if( idx2 >= 0 && idx2 < idx ) idx=idx2;
if( idx < 0 ) idx=idx2; // no '%' in the buffer: start at the first '+'
for( int j=idx; j<end; j++, idx++ ) {
if( buff[ j ] == '+' ) {
buff[idx]= (byte)' ' ;
} else if( buff[ j ] != '%' ) {
buff[idx]= buff[j];
} else {
// read next 2 digits
if( j+2 >= end ) {