Re: Initial webrev with changes for JDK 9 - jimage

Jim Laskey (Oracle) Mon, 14 Mar 2016 09:29:47 -0700

These are the changes I plan on submitting for M3.

http://cr.openjdk.java.net/~jlaskey/jimagereview/webrev/index.html


I’ve filed the following bugs as follow up.

https://bugs.openjdk.java.net/browse/JDK-8151806 JImage decompress code needs 
to be revised to be more effective
https://bugs.openjdk.java.net/browse/JDK-8151807 ImageBufferCache should use 
sun.nio.ch.Util

— Jim



> On Mar 10, 2016, at 7:22 PM, Jim Laskey (Oracle) <james.las...@oracle.com> 
> wrote:
> 
> Thank you Roger.  Comments inline.  In general, I will hold off changes until 
> after merge so as not to destabilize.  Note that ImageBufferCache and 
> Decompression are not currently used.
> 
>> On Mar 8, 2016, at 2:07 PM, Roger Riggs <roger.ri...@oracle.com> wrote:
>> 
>> Hi,
>> 
>> Review comments for the jimage code in java.base, mostly cleanup but perhaps 
>> a bug or two.
>> General:  inconsistent style and not following guidelines, copyright dates 
>> may need an update.
>> Use of assert for error checking is not going to catch or help isolate 
>> errors in production builds.
> 
> Noted.
> 
>> 
>> dir: jdk/src/java.base/share/classes/jdk/internal/jimage/
>> 
>> 
>> BasicImageReader:
>> - readHeader() exception message should say not a jimage or wrong version
>> Other IOExceptions can occur if the file is corrupted or is too small, array 
>> index out of bounds, etc.
> 
> Replaced with specific exceptions.
> 
>> 
>> - findLocation() should use a more explicit computation for the redirection
>>  than rehashing; this would allow the hash seed to be encapsulated.
> 
> As part of the review change set, I've documented the Minimal Perfect Hash 
> Method (see PerfectHashBuilder.)  The algorithm requires a one time one 
> rehash when there is a collision.  A new seed forces a different distribution 
> of the colliding strings.  If you started with the existing hashCode, you 
> would not get a new distribution, merely a shift of all the colliding 
> elements. Random distribution is the essence of algorithm.
> 
>> 
>> - slice() synchronizes on the ByteBuffer; does every other access?
> 
> Byte buffer use here is immutable, however slicing of the ByteBuffer does 
> require modifying position and limit.  Wrapping or cloning the buffers would 
> be of little benefit and would take a performance hit.  The ByteBuffer API 
> might have been well served with an atomic non-modifying slice operation.
> 
>> 
>> - getAllLocations() is unused and does not respect its sorted argument;
>>  Best to remove the method for now.
> 
> Removed.
> 
>> 
>> - getResourceBuffer(): unused - remove
> 
> Removed.
> 
>> 
>> - getResourceStream(): unused - remove
> 
> Removed.
> 
>> 
>> ImageStringsReader:
>> 
>> - Is Modified UTF-8 really needed here;  using normal UTF-8 would allow
>>  re-use and better performance.  Alternatively, Modified UTF-8 should be
>>  added as a supported encoding.
> 
> Modified UTF-8 is used to be more compact and to simplify use in native code. 
>  The jimage string table has overlapping use with constant pool compaction so 
> is really an older revision of the Modified UTF-8 standard.  Not sure why 
> Modifed UTF-8 is not offered as an encoding.  It would have made constant 
> pool management easier. And note that relying on a particular encoding is 
> problematic.  Encoding standards do change, affecting the stability of the 
> hash algorithm and may prevent the reading of older jimage files.
> 
>> 
>> - The ImageStringsReader.hashCode() should be defined so it can take 
>> advantage of 32 or 64 bit accesses; since it is used frequently.  And it 
>> needs to be stable across revisions that may be accessing jimages from 
>> different JDK versions.
>> 
> 
> Reasonable and will consider.  Not sure what you mean by stable across 
> revisions. This is not string hashCode.  This is a hashCode that is specific 
> to MPHM and is deterministic. 
> 
>> - stringFromMUTF8(byte[]) - there may be some optimization to avoid 
>> expanding ascii to chars and then re-encoding in String.
> 
> Will touch base with the Aleksey.
> 
>> 
>> - charsFromByteBufferLength/charsFromByteBuffer should throw an exception if 
>> no terminating null, not just when assertions are enabled
> 
> Will throw InternalError instead.
> 
>> 
>> - inconsistent assert errors vs exceptions for malformed mutf8.
>>  Use of assert will not detect errors in production builds
> 
> Will throw InternalError instead.
> 
>> 
>> - mutf8FromChars allocates a new working buffer (byte[8]) potentially for 
>> every UNICODE char.
> 
> Will hoist the buffer allocation and reuse.
> 
>> 
>> - mutf8FromString: should take advantage of new String(bytes, offset, 
>> length, charsetName) to avoid extra allocations.
>>  (would need a Modified utf-8 charset).
> 
> Modified UTF-8 not available.
> 
>> 
>> - hashCode(byte[0, seed) should be private; the hashcode algorithm should 
>> not be variable.
>> 
> 
> This is not String hashCode. see above.
> 
>> ImageLocation:
>> - Merge ImageLocation functions into ImageLocationBase and rename.
>>   ImageLocation does not have enough unique functionality to be a separate 
>> class.
> 
> Merged.
> 
>> 
>> - It is a poor design to create an instance that may not be valid.
>>  It would be better to have an ImageLocation factory method that 
>> encapsulated the creation and checking
>> 
> 
> Noted.
> 
>> 
>> ImageLocationBase:
>> - Failing asserts will not cause a runtime error in production.   
>>   But will degenerate into unpredictable other exceptions
> 
> Will throw InternalError instead.
> 
>> 
>> ImageStream:
>> - compress(): Too heavyweight an object to be used and discarded just for 
>> compress / decompress of small values.
> 
> compress()?   I don’t see ImageStream used anywhere except for large buffers. 
>  As a note, dynamic ByteBuffers really should be part of the JDK.
> 
>> 
>> 
>> ImageBufferCache:
>> - Comments on each method would make the intent clear
> 
> Will add comments.
> 
>> 
>> - getBuffer: mixing for loop and explicit indexes is fragile
>> - releaseBuffer: check buffer.capacity before threadLocal.get() for early 
>> return
>> - Logic of i,j is flawed, i is never changed; j is/becomes 0 
>>   The first buffer is removed if there are more than max
>>   Its not clear what algorithm was intended.
>>   A buffer that is in use can be removed.
> 
> Rewrote using lambdas to clarify the algorithm.
> 
>> 
>> - ImageBufferCache constructor: the isUsed flag should be set in getBuffer()
>>   when it is added to the list (not in the constructor)
>>   It keeps it consistent with when it is set for a re-used buffer.
>> 
> 
> Modified.
> 
>> ImageHeader:
>> - BADMAGIC is not used
> 
> Removed.
> 
>> 
>> - readFrom could throw BufferUnderflowException - undocumented
> 
> get(i) can IndexOutOfBoundsException [and get() BufferUnderflowException]  
> Will add explicit check.
> 
> 
>> 
>> ImageModuleData:
>> - ImageModuleData seems to be unused;  allModuleNames() is unreferenced
>> 
> 
> Removed.
> 
>> 
>> ImageReader:
>> - ImageReader(path): unused - remove
> 
> Removed.
> 
>> 
>> - Use of assert is non-reassuring, ineffective in production builds
> 
> Added explicit check.
> 
>> 
>> - In a long running application the per thread buffers will get to the point 
>> wehere they are unused and should be released.
>>  There is no support for soft or weak references to handle that case.
>>  Alternatively, if the buffers were not per thread then they would have a 
>> lesser impact.
> 
> The core libs team was concerned about the performance of class loading.  The 
> pattern used here is the one followed by NIO.  We will probably come up with 
> a shared NIO cache at some point.
> 
>>  Also, why should there be more than one buffer per thread?
> 
> The image cache is only used by 32 bit systems and decompression.  It is 
> possible for a channel read buffer and a decompression buffer to be allocated 
> at the same time.
> 
>> 
>> ImageReader.Resource:
>> - unused create method
> 
> Removed.
> 
>> 
>> - unused parent argument (0) in Resource constructor (dir, loc, attr)
> 
> Removed.
> 
>> 
>> ImageReader.LinkNode:
>> - constructor has unused parent arg
>> 
>> 
> 
> Removed
> 
>> decompress/CompressIndexes:
>> - readInt: wasteful allocation of a single byte array and then another small 
>> byte array;
>>  and then another heap buffer allocation in the ByteBuffer.
> 
> Removed all allocation.
> 
>> 
>> - decompress is hugely wasteful of allocations of ByteBuffers of small sizes.
>> 
> 
> Removed all allocation.
> 
> 
>> decompress/CompressedResourceHeader:
>> - MAGIC does not need to be public
>> 
>> 
>> decompress/Decompressor:
>> - IOException "Plugin name not found" should include the missing name
>> 
>> - StringsProvider is an unnecessary interface; could use Supplier<String>
>> 
>> - Redundant public modifiers in interface declarations
>> 
>> decompress/SignatureParser:
>> - Comments describing the input and output formats needed
>> 
>> - style around operators does not consistently include spaces; and "if("...
>> 
>> decompress/StringSharingDecompressor:
>> - Wasteful use of ByteBuffers and small arrays without re-use (safeAdd...)
>> 
>> - StringSharingDecompressor constructor with unused argument - remove
>> 
>> 
>> decompress/ZipDecompressor:
>> - Should be more specific about the zip algorithms supported
>>  since it needs to be stable across releases
>> 
>> - decompress(): use try-with-resources for safer inflation
>> 
> 
> I think decompression needs significant reworking..
> 
>> src/java.base/share/native/libjimage/imageDecompressor.cpp
>> - "// If ptr is not aligned, sparc will fail." implies input argument must 
>> be aligned
>> 
>> - General doubt about using assert; for example to check on malloc 
>> allocation failure
>> 
>> - In production environment will exhibit as segv
>> 
> 
> Noted.

Re: Initial webrev with changes for JDK 9 - jimage

Reply via email to