I will review On Sun, Mar 27, 2011 at 1:37 PM, P T Withington <[email protected]> wrote:
> Henry, can you review this please? I think this is an area you have worked > in. > > I suspect the duplication is partly due to too many different people > working on these various substrates and none taking ownership. > > IIRC, Raju (cc-ed) did the CSS implementation, so he may want to have a > look too. > > On 2011-03-27, at 09:57, André Bargull wrote: > > > Change bargull-20110327-wHm by bargull@Bargull02 on 2011-03-27 13:48:00 > > in /home/anba/src/svn/openlaszlo/trunk > > for http://svn.openlaszlo.org/openlaszlo/trunk > > > > Summary: Support UTF-16 encoding for files > > > > Bugs Fixed: LPP-1657 (UTF16 data broken) > > > > Technical Reviewer: ptw > > QA Reviewer: (pending) > > > > Overview: > > UTF-16 encoding support is a bit inconsistent, it isn't supported for > some files (e.g. lzx or proxied xml from server), while for other files it > is (e.g. css files). In addition to that, I've also encountered "duplicated" > code for BOM detection or encoding detection from the xml-declaration (in > "<?xml encoding=... ?>"). "duplicated" is put in quotes here, because even > though the code was used for the same purpose, it was slightly different > leading to other results depending on which method was called. Here's an > example for BOM detection from FileUtils: > > > > FileUtils contained several methods to read the BOM, but all of them were > kind of broken: > > - detectBOMEncoding() used mark() to mark the current position in the > stream, but never called reset() > > - stripByteOrderMark(PushbackInputStream) mixed up UTF-16LE and UTF-16BE, > and it may unread too much bytes (if the input stream contains less than > three bytes) > > - stripByteOrderMark(byte[]) was more or less a plain copy of the other > stripByteOrderMark() method, duplicate code, yuk... > > > > > > Details: > > FileUtils: > > #Encoding > > -- Simple enum for the encoding information (utf-8 and utf-16 only) > > #readString(Reader) > > -- New utility method to return the content of a Reader > > #getXMLStreamReader(InputStream, String) > > -- no longer read the complete InputStream into memory, instead use > mark() to reset the position resp. SequenceInputStream > > -- use BOM to deduce proper charset to read xml-prolog, if BOM not > present, guess possible encoding per auto-detection described in xml > specification > > #readFully(InputStream, byte[]) > > -- helper method to read from InputStream into byte array > > #skipFully(InputStream, long) > > -- helper method to skip() bytes more reliably > > #getEncoding(byte[]) > > -- helper method to return encoding from BOM > > #guessXMLEncoding(byte[]) > > -- helper method to guess possible encoding by reading first bytes of > input > > #getBOMEncoding(BufferedInputStream) > > -- use getEncoding() method to retrieve encoding from BOM > > #makeXMLReaderForFile(String, String) > > -- use toByteArrayInputStream() to read complete input into memory now > that getXMLStreamReader() no longer uses in-memory streams > > #toByteArrayInputStream(InputStream) > > -- reads InputStream content into memory and returns ByteArrayInputStream > with the same content > > #stripByteOrderMark(PushbackInputStream) > > -- use getEncoding() to read BOM > > -- don't unread bytes after EOF > > #stripByteOrderMark(byte[]) > > -- use getEncoding() to read BOM > > > > CSSHandler: > > - use new methods from FileUtils to avoid duplicating code > > > > HttpData: > > - use encoding detection from FileUtils > > > > XMLGrabber: > > - the PullParser we're using doesn't perform any encoding detection, > therefore fallback to the utility methods in FileUtils > > > > ResponderXML: > > - don't read files unconditionally in a certain encoding (here: UTF-8) > instead just return the plain bytes > > > > FileUtils_Test2: > > - test case for the new resp. changed method from FileUtils > > > > WholeFile_Test: > > - initialize ReprocessComments.setOptions() in order to run test case > > > > build.xml: > > - re-enable FileUtils_Test, according to LPP-2054 it is currently > failing, but I couldn't reproduce the failure > > - add FileUtils_Test2 > > - remove stale EDU/** entry from the oswego concurrent classes > > - profiler properties for visual vm, they're not enabled by default but > useful for things like LPP-9841 > > > > > > Tests: > > test case from LPP-1657, junit test case > > > > Files: > > M WEB-INF/lps/server/build.xml > > M WEB-INF/lps/server/src/org/openlaszlo/js2doc/WholeFile_Test.java > > M WEB-INF/lps/server/src/org/openlaszlo/css/CSSHandler.java > > M WEB-INF/lps/server/src/org/openlaszlo/utils/FileUtils.java > > A WEB-INF/lps/server/src/org/openlaszlo/utils/FileUtils_Test2.java > > M > WEB-INF/lps/server/src/org/openlaszlo/servlets/responders/ResponderXML.java > > M WEB-INF/lps/server/src/org/openlaszlo/data/HttpData.java > > M WEB-INF/lps/server/src/org/openlaszlo/data/XMLGrabber.java > > > > Changeset: > http://svn.openlaszlo.org/openlaszlo/patches/bargull-20110327-wHm.tar > > > > -- Henry Minsky Software Architect [email protected]
