These are the changes I plan on submitting for M3. http://cr.openjdk.java.net/~jlaskey/jimagereview/webrev/index.html
I’ve filed the following bugs as follow up. https://bugs.openjdk.java.net/browse/JDK-8151806 JImage decompress code needs to be revised to be more effective https://bugs.openjdk.java.net/browse/JDK-8151807 ImageBufferCache should use sun.nio.ch.Util — Jim > On Mar 10, 2016, at 7:22 PM, Jim Laskey (Oracle) <james.las...@oracle.com> > wrote: > > Thank you Roger. Comments inline. In general, I will hold off changes until > after merge so as not to destabilize. Note that ImageBufferCache and > Decompression are not currently used. > >> On Mar 8, 2016, at 2:07 PM, Roger Riggs <roger.ri...@oracle.com> wrote: >> >> Hi, >> >> Review comments for the jimage code in java.base, mostly cleanup but perhaps >> a bug or two. >> General: inconsistent style and not following guidelines, copyright dates >> may need an update. >> Use of assert for error checking is not going to catch or help isolate >> errors in production builds. > > Noted. > >> >> dir: jdk/src/java.base/share/classes/jdk/internal/jimage/ >> >> >> BasicImageReader: >> - readHeader() exception message should say not a jimage or wrong version >> Other IOExceptions can occur if the file is corrupted or is too small, array >> index out of bounds, etc. > > Replaced with specific exceptions. > >> >> - findLocation() should use a more explicit computation for the redirection >> than rehashing; this would allow the hash seed to be encapsulated. > > As part of the review change set, I've documented the Minimal Perfect Hash > Method (see PerfectHashBuilder.) The algorithm requires a one time one > rehash when there is a collision. A new seed forces a different distribution > of the colliding strings. If you started with the existing hashCode, you > would not get a new distribution, merely a shift of all the colliding > elements. Random distribution is the essence of algorithm. > >> >> - slice() synchronizes on the ByteBuffer; does every other access? > > Byte buffer use here is immutable, however slicing of the ByteBuffer does > require modifying position and limit. Wrapping or cloning the buffers would > be of little benefit and would take a performance hit. The ByteBuffer API > might have been well served with an atomic non-modifying slice operation. > >> >> - getAllLocations() is unused and does not respect its sorted argument; >> Best to remove the method for now. > > Removed. > >> >> - getResourceBuffer(): unused - remove > > Removed. > >> >> - getResourceStream(): unused - remove > > Removed. > >> >> ImageStringsReader: >> >> - Is Modified UTF-8 really needed here; using normal UTF-8 would allow >> re-use and better performance. Alternatively, Modified UTF-8 should be >> added as a supported encoding. > > Modified UTF-8 is used to be more compact and to simplify use in native code. > The jimage string table has overlapping use with constant pool compaction so > is really an older revision of the Modified UTF-8 standard. Not sure why > Modifed UTF-8 is not offered as an encoding. It would have made constant > pool management easier. And note that relying on a particular encoding is > problematic. Encoding standards do change, affecting the stability of the > hash algorithm and may prevent the reading of older jimage files. > >> >> - The ImageStringsReader.hashCode() should be defined so it can take >> advantage of 32 or 64 bit accesses; since it is used frequently. And it >> needs to be stable across revisions that may be accessing jimages from >> different JDK versions. >> > > Reasonable and will consider. Not sure what you mean by stable across > revisions. This is not string hashCode. This is a hashCode that is specific > to MPHM and is deterministic. > >> - stringFromMUTF8(byte[]) - there may be some optimization to avoid >> expanding ascii to chars and then re-encoding in String. > > Will touch base with the Aleksey. > >> >> - charsFromByteBufferLength/charsFromByteBuffer should throw an exception if >> no terminating null, not just when assertions are enabled > > Will throw InternalError instead. > >> >> - inconsistent assert errors vs exceptions for malformed mutf8. >> Use of assert will not detect errors in production builds > > Will throw InternalError instead. > >> >> - mutf8FromChars allocates a new working buffer (byte[8]) potentially for >> every UNICODE char. > > Will hoist the buffer allocation and reuse. > >> >> - mutf8FromString: should take advantage of new String(bytes, offset, >> length, charsetName) to avoid extra allocations. >> (would need a Modified utf-8 charset). > > Modified UTF-8 not available. > >> >> - hashCode(byte[0, seed) should be private; the hashcode algorithm should >> not be variable. >> > > This is not String hashCode. see above. > >> ImageLocation: >> - Merge ImageLocation functions into ImageLocationBase and rename. >> ImageLocation does not have enough unique functionality to be a separate >> class. > > Merged. > >> >> - It is a poor design to create an instance that may not be valid. >> It would be better to have an ImageLocation factory method that >> encapsulated the creation and checking >> > > Noted. > >> >> ImageLocationBase: >> - Failing asserts will not cause a runtime error in production. >> But will degenerate into unpredictable other exceptions > > Will throw InternalError instead. > >> >> ImageStream: >> - compress(): Too heavyweight an object to be used and discarded just for >> compress / decompress of small values. > > compress()? I don’t see ImageStream used anywhere except for large buffers. > As a note, dynamic ByteBuffers really should be part of the JDK. > >> >> >> ImageBufferCache: >> - Comments on each method would make the intent clear > > Will add comments. > >> >> - getBuffer: mixing for loop and explicit indexes is fragile >> - releaseBuffer: check buffer.capacity before threadLocal.get() for early >> return >> - Logic of i,j is flawed, i is never changed; j is/becomes 0 >> The first buffer is removed if there are more than max >> Its not clear what algorithm was intended. >> A buffer that is in use can be removed. > > Rewrote using lambdas to clarify the algorithm. > >> >> - ImageBufferCache constructor: the isUsed flag should be set in getBuffer() >> when it is added to the list (not in the constructor) >> It keeps it consistent with when it is set for a re-used buffer. >> > > Modified. > >> ImageHeader: >> - BADMAGIC is not used > > Removed. > >> >> - readFrom could throw BufferUnderflowException - undocumented > > get(i) can IndexOutOfBoundsException [and get() BufferUnderflowException] > Will add explicit check. > > >> >> ImageModuleData: >> - ImageModuleData seems to be unused; allModuleNames() is unreferenced >> > > Removed. > >> >> ImageReader: >> - ImageReader(path): unused - remove > > Removed. > >> >> - Use of assert is non-reassuring, ineffective in production builds > > Added explicit check. > >> >> - In a long running application the per thread buffers will get to the point >> wehere they are unused and should be released. >> There is no support for soft or weak references to handle that case. >> Alternatively, if the buffers were not per thread then they would have a >> lesser impact. > > The core libs team was concerned about the performance of class loading. The > pattern used here is the one followed by NIO. We will probably come up with > a shared NIO cache at some point. > >> Also, why should there be more than one buffer per thread? > > The image cache is only used by 32 bit systems and decompression. It is > possible for a channel read buffer and a decompression buffer to be allocated > at the same time. > >> >> ImageReader.Resource: >> - unused create method > > Removed. > >> >> - unused parent argument (0) in Resource constructor (dir, loc, attr) > > Removed. > >> >> ImageReader.LinkNode: >> - constructor has unused parent arg >> >> > > Removed > >> decompress/CompressIndexes: >> - readInt: wasteful allocation of a single byte array and then another small >> byte array; >> and then another heap buffer allocation in the ByteBuffer. > > Removed all allocation. > >> >> - decompress is hugely wasteful of allocations of ByteBuffers of small sizes. >> > > Removed all allocation. > > >> decompress/CompressedResourceHeader: >> - MAGIC does not need to be public >> >> >> decompress/Decompressor: >> - IOException "Plugin name not found" should include the missing name >> >> - StringsProvider is an unnecessary interface; could use Supplier<String> >> >> - Redundant public modifiers in interface declarations >> >> decompress/SignatureParser: >> - Comments describing the input and output formats needed >> >> - style around operators does not consistently include spaces; and "if("... >> >> decompress/StringSharingDecompressor: >> - Wasteful use of ByteBuffers and small arrays without re-use (safeAdd...) >> >> - StringSharingDecompressor constructor with unused argument - remove >> >> >> decompress/ZipDecompressor: >> - Should be more specific about the zip algorithms supported >> since it needs to be stable across releases >> >> - decompress(): use try-with-resources for safer inflation >> > > I think decompression needs significant reworking.. > >> src/java.base/share/native/libjimage/imageDecompressor.cpp >> - "// If ptr is not aligned, sparc will fail." implies input argument must >> be aligned >> >> - General doubt about using assert; for example to check on malloc >> allocation failure >> >> - In production environment will exhibit as segv >> > > Noted.