approved!  awesome cleanup as usual, thanks!

On Sun, Mar 27, 2011 at 9:57 AM, AndrĂ© Bargull <[email protected]>wrote:

> Change bargull-20110327-wHm by bargull@Bargull02 on 2011-03-27 13:48:00
> in /home/anba/src/svn/openlaszlo/trunk
> for http://svn.openlaszlo.org/openlaszlo/trunk
>
> Summary: Support UTF-16 encoding for files
>
> Bugs Fixed: LPP-1657 (UTF16 data broken)
>
> Technical Reviewer: ptw
> QA Reviewer: (pending)
>
> Overview:
> UTF-16 encoding support is a bit inconsistent, it isn't supported for some
> files (e.g. lzx or proxied xml from server), while for other files it is
> (e.g. css files). In addition to that, I've also encountered "duplicated"
> code for BOM detection or encoding detection from the xml-declaration (in
> "<?xml encoding=... ?>"). "duplicated" is put in quotes here, because even
> though the code was used for the same purpose, it was slightly different
> leading to other results depending on which method was called. Here's an
> example for BOM detection from FileUtils:
>
> FileUtils contained several methods to read the BOM, but all of them were
> kind of broken:
> - detectBOMEncoding() used mark() to mark the current position in the
> stream, but never called reset()
> - stripByteOrderMark(PushbackInputStream) mixed up UTF-16LE and UTF-16BE,
> and it may unread too much bytes (if the input stream contains less than
> three bytes)
> - stripByteOrderMark(byte[]) was more or less a plain copy of the other
> stripByteOrderMark() method, duplicate code, yuk...
>
>
> Details:
> FileUtils:
> #Encoding
> -- Simple enum for the encoding information (utf-8 and utf-16 only)
> #readString(Reader)
> -- New utility method to return the content of a Reader
> #getXMLStreamReader(InputStream, String)
> -- no longer read the complete InputStream into memory, instead use mark()
> to reset the position resp. SequenceInputStream
> -- use BOM to deduce proper charset to read xml-prolog, if BOM not present,
> guess possible encoding per auto-detection described in xml specification
> #readFully(InputStream, byte[])
> -- helper method to read from InputStream into byte array
> #skipFully(InputStream, long)
> -- helper method to skip() bytes more reliably
> #getEncoding(byte[])
> -- helper method to return encoding from BOM
> #guessXMLEncoding(byte[])
> -- helper method to guess possible encoding by reading first bytes of input
> #getBOMEncoding(BufferedInputStream)
> -- use getEncoding() method to retrieve encoding from BOM
> #makeXMLReaderForFile(String, String)
> -- use toByteArrayInputStream() to read complete input into memory now that
> getXMLStreamReader() no longer uses in-memory streams
> #toByteArrayInputStream(InputStream)
> -- reads InputStream content into memory and returns ByteArrayInputStream
> with the same content
> #stripByteOrderMark(PushbackInputStream)
> -- use getEncoding() to read BOM
> -- don't unread bytes after EOF
> #stripByteOrderMark(byte[])
> -- use getEncoding() to read BOM
>
> CSSHandler:
> - use new methods from FileUtils to avoid duplicating code
>
> HttpData:
> - use encoding detection from FileUtils
>
> XMLGrabber:
> - the PullParser we're using doesn't perform any encoding detection,
> therefore fallback to the utility methods in FileUtils
>
> ResponderXML:
> - don't read files unconditionally in a certain encoding (here: UTF-8)
> instead just return the plain bytes
>
> FileUtils_Test2:
> - test case for the new resp. changed method from FileUtils
>
> WholeFile_Test:
> - initialize ReprocessComments.setOptions() in order to run test case
>
> build.xml:
> - re-enable FileUtils_Test, according to LPP-2054 it is currently failing,
> but I couldn't reproduce the failure
> - add FileUtils_Test2
> - remove stale EDU/** entry from the oswego concurrent classes
> - profiler properties for visual vm, they're not enabled by default but
> useful for things like LPP-9841
>
>
> Tests:
> test case from LPP-1657, junit test case
>
> Files:
> M WEB-INF/lps/server/build.xml
> M WEB-INF/lps/server/src/org/openlaszlo/js2doc/WholeFile_Test.java
> M WEB-INF/lps/server/src/org/openlaszlo/css/CSSHandler.java
> M WEB-INF/lps/server/src/org/openlaszlo/utils/FileUtils.java
> A WEB-INF/lps/server/src/org/openlaszlo/utils/FileUtils_Test2.java
> M
> WEB-INF/lps/server/src/org/openlaszlo/servlets/responders/ResponderXML.java
> M WEB-INF/lps/server/src/org/openlaszlo/data/HttpData.java
> M WEB-INF/lps/server/src/org/openlaszlo/data/XMLGrabber.java
>
> Changeset:
> http://svn.openlaszlo.org/openlaszlo/patches/bargull-20110327-wHm.tar
>
>


-- 
Henry Minsky
Software Architect
[email protected]

Reply via email to