Re: Initial webrev with changes for JDK 9 - jimage

Roger Riggs Fri, 11 Mar 2016 06:28:43 -0800

Thanks Jim.

Beware of use of Lambdas, jimage may be used early enough in the bootcycle that lambdasare not yet fully operative. The failure is pretty obvious but can betrial and error to narrow it down.

It would be useful to create an jira entry for the changes that arebeing delayed.


Roger


On 3/10/2016 6:22 PM, Jim Laskey (Oracle) wrote:

Thank you Roger.  Comments inline.  In general, I will hold off changes until 
after merge so as not to destabilize.  Note that ImageBufferCache and 
Decompression are not currently used.

On Mar 8, 2016, at 2:07 PM, Roger Riggs <[email protected]> wrote:

Hi,

Review comments for the jimage code in java.base, mostly cleanup but perhaps a 
bug or two.
General:  inconsistent style and not following guidelines, copyright dates may 
need an update.
Use of assert for error checking is not going to catch or help isolate errors 
in production builds.

Noted.

dir: jdk/src/java.base/share/classes/jdk/internal/jimage/


BasicImageReader:
- readHeader() exception message should say not a jimage or wrong version
  Other IOExceptions can occur if the file is corrupted or is too small, array 
index out of bounds, etc.

Replaced with specific exceptions.

- findLocation() should use a more explicit computation for the redirection
   than rehashing; this would allow the hash seed to be encapsulated.

As part of the review change set, I've documented the Minimal Perfect Hash 
Method (see PerfectHashBuilder.)  The algorithm requires a one time one rehash 
when there is a collision.  A new seed forces a different distribution of the 
colliding strings.  If you started with the existing hashCode, you would not 
get a new distribution, merely a shift of all the colliding elements. Random 
distribution is the essence of algorithm.

- slice() synchronizes on the ByteBuffer; does every other access?

Byte buffer use here is immutable, however slicing of the ByteBuffer does 
require modifying position and limit.  Wrapping or cloning the buffers would be 
of little benefit and would take a performance hit.  The ByteBuffer API might 
have been well served with an atomic non-modifying slice operation.

- getAllLocations() is unused and does not respect its sorted argument;
   Best to remove the method for now.

Removed.

- getResourceBuffer(): unused - remove

Removed.

- getResourceStream(): unused - remove

Removed.

ImageStringsReader:

- Is Modified UTF-8 really needed here;  using normal UTF-8 would allow
   re-use and better performance.  Alternatively, Modified UTF-8 should be
   added as a supported encoding.

Modified UTF-8 is used to be more compact and to simplify use in native code.  
The jimage string table has overlapping use with constant pool compaction so is 
really an older revision of the Modified UTF-8 standard.  Not sure why Modifed 
UTF-8 is not offered as an encoding.  It would have made constant pool 
management easier. And note that relying on a particular encoding is 
problematic.  Encoding standards do change, affecting the stability of the hash 
algorithm and may prevent the reading of older jimage files.

- The ImageStringsReader.hashCode() should be defined so it can take advantage 
of 32 or 64 bit accesses; since it is used frequently.  And it needs to be 
stable across revisions that may be accessing jimages from different JDK 
versions.

Reasonable and will consider.  Not sure what you mean by stable across 
revisions. This is not string hashCode.  This is a hashCode that is specific to 
MPHM and is deterministic.

- stringFromMUTF8(byte[]) - there may be some optimization to avoid expanding 
ascii to chars and then re-encoding in String.

Will touch base with the Aleksey.

- charsFromByteBufferLength/charsFromByteBuffer should throw an exception if no 
terminating null, not just when assertions are enabled

Will throw InternalError instead.

- inconsistent assert errors vs exceptions for malformed mutf8.
   Use of assert will not detect errors in production builds

Will throw InternalError instead.

- mutf8FromChars allocates a new working buffer (byte[8]) potentially for every 
UNICODE char.

Will hoist the buffer allocation and reuse.

- mutf8FromString: should take advantage of new String(bytes, offset, length, 
charsetName) to avoid extra allocations.
   (would need a Modified utf-8 charset).

Modified UTF-8 not available.

- hashCode(byte[0, seed) should be private; the hashcode algorithm should not 
be variable.

This is not String hashCode. see above.

ImageLocation:
  - Merge ImageLocation functions into ImageLocationBase and rename.
    ImageLocation does not have enough unique functionality to be a separate 
class.

Merged.

- It is a poor design to create an instance that may not be valid.
   It would be better to have an ImageLocation factory method that encapsulated 
the creation and checking

Noted.

ImageLocationBase:

  - Failing asserts will not cause a runtime error in production.
    But will degenerate into unpredictable other exceptions

Will throw InternalError instead.

ImageStream:
- compress(): Too heavyweight an object to be used and discarded just for 
compress / decompress of small values.

compress()?   I don’t see ImageStream used anywhere except for large buffers.  
As a note, dynamic ByteBuffers really should be part of the JDK.


ImageBufferCache:
  - Comments on each method would make the intent clear

Will add comments.

  - getBuffer: mixing for loop and explicit indexes is fragile
  - releaseBuffer: check buffer.capacity before threadLocal.get() for early 
return
  - Logic of i,j is flawed, i is never changed; j is/becomes 0
    The first buffer is removed if there are more than max
    Its not clear what algorithm was intended.
    A buffer that is in use can be removed.

Rewrote using lambdas to clarify the algorithm.

  - ImageBufferCache constructor: the isUsed flag should be set in getBuffer()
    when it is added to the list (not in the constructor)
    It keeps it consistent with when it is set for a re-used buffer.

Modified.

ImageHeader:
- BADMAGIC is not used

Removed.

- readFrom could throw BufferUnderflowException - undocumented

get(i) can IndexOutOfBoundsException [and get() BufferUnderflowException]  Will 
add explicit check.

ImageModuleData:
- ImageModuleData seems to be unused;  allModuleNames() is unreferenced

Removed.

ImageReader:
- ImageReader(path): unused - remove

Removed.

- Use of assert is non-reassuring, ineffective in production builds

Added explicit check.

- In a long running application the per thread buffers will get to the point 
wehere they are unused and should be released.
   There is no support for soft or weak references to handle that case.
   Alternatively, if the buffers were not per thread then they would have a 
lesser impact.

The core libs team was concerned about the performance of class loading.  The 
pattern used here is the one followed by NIO.  We will probably come up with a 
shared NIO cache at some point.

   Also, why should there be more than one buffer per thread?

The image cache is only used by 32 bit systems and decompression.  It is 
possible for a channel read buffer and a decompression buffer to be allocated 
at the same time.

ImageReader.Resource:
- unused create method

Removed.

- unused parent argument (0) in Resource constructor (dir, loc, attr)

Removed.

ImageReader.LinkNode:
- constructor has unused parent arg

Removed

decompress/CompressIndexes:
- readInt: wasteful allocation of a single byte array and then another small 
byte array;
   and then another heap buffer allocation in the ByteBuffer.

Removed all allocation.

- decompress is hugely wasteful of allocations of ByteBuffers of small sizes.

Removed all allocation.

decompress/CompressedResourceHeader:
- MAGIC does not need to be public


decompress/Decompressor:
- IOException "Plugin name not found" should include the missing name

- StringsProvider is an unnecessary interface; could use Supplier<String>

- Redundant public modifiers in interface declarations

decompress/SignatureParser:
- Comments describing the input and output formats needed

- style around operators does not consistently include spaces; and "if("...

decompress/StringSharingDecompressor:
- Wasteful use of ByteBuffers and small arrays without re-use (safeAdd...)

- StringSharingDecompressor constructor with unused argument - remove


decompress/ZipDecompressor:
- Should be more specific about the zip algorithms supported
   since it needs to be stable across releases

- decompress(): use try-with-resources for safer inflation

I think decompression needs significant reworking..

src/java.base/share/native/libjimage/imageDecompressor.cpp
- "// If ptr is not aligned, sparc will fail." implies input argument must be 
aligned

- General doubt about using assert; for example to check on malloc allocation 
failure

- In production environment will exhibit as segv

Noted.

Re: Initial webrev with changes for JDK 9 - jimage

Reply via email to