On 6/3/20 6:20 AM, Jim Laskey wrote:
I absolutely agree with the reasoning. ArraysSupport.newLength() is also
probably easier for the compiler to optimize around (no exception).
I guess the only conflict here is to be meaningfully informative. The bug
assigned to me (https://bugs.openjdk.java.net/browse/JDK-8230744) was really
about not getting clear information when the OOM is generated. Getting an OOM
from the bowels of the VM doesn't always give a clear picture (even with a
backtrace) when the allocating code is distant from the expression that
calculated the new size (VM: I can't do this. You figure it out. Developer: ?).
I'm a little skeptical of having the libraries try to make OOME messages too
informative. If the system really is out of memory, then trying to construct an
informative message is also likely to fail, adding to the confusion.
Needed here is something like:
ArraysSupport.newLength(oldLength, minGrowth, prefGrowth, () -> "Queue exceeds
implementation limit")
I'm not sure I like the idea of a lambda here. Maybe if the message arg were a
string literal, it would be ok, but it seems likely that somebody would use a
lambda and capture something (which would itself allocate memory when the lambda
is encountered), and the evaluation of the lambda is likely to do things like
string formatting, which does even more allocation.
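For concreteness, the overload being debated might look something like this (a
hypothetical sketch, not proposed JDK code; the capture concern above applies to
the Supplier argument):

```java
import java.util.function.Supplier;

public class MessageSupplierSketch {
    // What the library assumes the VM will grant (Integer.MAX_VALUE - 8 in the JDK).
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Hypothetical overload: the message is materialized only on the failure
    // path, but the lambda object itself still exists at every call site.
    static int newLength(int oldLength, int minGrowth, int prefGrowth,
                         Supplier<String> message) {
        long prefLength = (long) oldLength + Math.max(minGrowth, prefGrowth);
        if (prefLength <= MAX_ARRAY_LENGTH)
            return (int) prefLength;
        long minLength = (long) oldLength + minGrowth;
        if (minLength > MAX_ARRAY_LENGTH)
            throw new OutOfMemoryError(message.get());
        return MAX_ARRAY_LENGTH;
    }
}
```

Even with deferred evaluation, a capturing lambda allocates at the call site,
which is exactly the concern above.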
There are other JDK attempts using addExact/multiplyExact to catch the error
closer to the action.
Yeah, these cases probably deserve some inspection. For common growth factors
(1.5x and 2x), ArraysSupport.newLength avoids the need for these by having the
*growth amount* passed as an argument and then doing the integer wraparound
check explicitly, e.g.,
newCapacity = ArraysSupport.newLength(oldCapacity, needed, oldCapacity >> 1);
Note that there are a few cases, e.g. StringJoiner, where the growth is a
three-way sum in which any two args could go negative (overflow) and the third
arg brings it positive again.
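A sketch of how such a three-way sum could be checked (hypothetical helper;
accumulating in long means no intermediate wraparound can hide the overflow,
unlike a sign check on the int sum):

```java
public class ThreeWaySum {
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Hypothetical helper: required length as a three-way sum (say,
    // prefix + content + suffix), checked against the array-length limit.
    static int checkedLength(int a, int b, int c) {
        long total = (long) a + b + c;      // a long sum of three ints cannot wrap
        if (total > MAX_ARRAY_LENGTH)
            throw new OutOfMemoryError("Required array length too large");
        return (int) total;
    }
}
```

With int arithmetic, 1_500_000_000 + 1_500_000_000 wraps negative and adding a
third 1_500_000_000 wraps back positive, so checking the sign of the final sum
misses the overflow entirely.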
We might need to consider StringJoiner separately, if its computation is indeed
more complex than is currently supported by AS.newLength().
Based on what I see in the JDK, the world would be well served with an
Arrays.grow(array, minGrowth, prefGrowth, () -> "Queue exceeds implementation
limit"). This differs from Arrays.copyOf in that it 1) does overflow checks, 2)
provides a meaningful message, and 3) eliminates a common and often poorly
implemented code pattern. The alternative is catching the OOM at the call site
and replacing it with a meaningful message (which of course is brimming with
issues).
I will replace sites modified by this bug with ArraysSupport.newLength.
Yes, I think it would be an improvement if we replaced several variations of
size computations and overflow checking scattered around the libraries, and
unified them to use ArraysSupport.newLength.
I guess we should figure out what cases we want to support, and whether there
are particular sites that would benefit from more detailed error messages. As I
said above, I'm skeptical, but examining particular cases is always enlightening.
Even without trying to support detailed error messages, we should at least make
sure that the error messages are reasonable for the different cases we know about.
Here are the ones I know about:
1) Actual shortage of heap space. Hotspot generates this:
OutOfMemoryError: Java heap space
2) Requested array size exceeds implementation limit. Hotspot generates this:
OutOfMemoryError: Requested array size exceeds VM limit
3) Integer wraparound/overflow during size computation. AS.newLength generates
this:
OutOfMemoryError: Required array length too large
For (1) I don't think the JVM is in any position to provide more details.
For (2) this is about array length, not heap, and it happens only for a very few
length values. This seems sufficiently rare that IMO it's not worth expending
much effort.
(3) is the only case generated by the library. In fact, AS.hugeLength() has
oldLength and minGrowth parameters, which seems like enough detail already.
These could reasonably be formatted into the error message, something like:
private static int hugeLength(int oldLength, int minGrowth) {
    int minLength = oldLength + minGrowth;
    if (minLength < 0) { // overflow
        throw new OutOfMemoryError(
            String.format("Required array length %d + %d too large",
                          oldLength, minGrowth));
    }
Would this help? If this were added, would it be sufficient to allow various use
sites to convert to use AS.newLength? (Except possibly StringJoiner.)
**
There's a related issue mentioned in the bug reports (JDK-8230744 and
JDK-8210577) which is that OOME is thrown not only when the system is out of
memory, but also when some implementation limit is exceeded, whether that limit
is in the JVM or in the library. Maybe we should update the documentation of
java.lang.OutOfMemoryError to mention this.
s'marks
Cheers,
-- Jim
On Jun 2, 2020, at 7:08 PM, Stuart Marks <stuart.ma...@oracle.com
<mailto:stuart.ma...@oracle.com>> wrote:
Hi Jim,
This was mentioned previously in this thread but not discussed very much. I
suggest you take a look at jdk.internal.util.ArraysSupport.newLength(). Ivan
Gerasimov and I worked this over fairly closely last year, and I'm pretty sure
it does what Martin is saying, which I also think is the right thing.
The intent is that it be used for things that have growable arrays, where the
array might have a larger capacity than the logical number of elements
currently stored. Sometimes the array needs to be grown to accommodate an
immediate need (minGrowth) but it's desired to grow larger (prefGrowth) in
anticipation of future needs. If minGrowth can't be accommodated, it throws
OOME, but if prefGrowth can't be accommodated, it might be acceptable to
provide a smaller amount of growth.
(Of course, all this assumes that there is sufficient memory available to
allocate the actual array. ArraysSupport.newLength doesn't attempt to
ascertain that.)
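In outline, the policy reads something like this (a simplified sketch of my
understanding, not the exact jdk.internal.util.ArraysSupport code):

```java
public class NewLengthSketch {
    // The largest length the library is fairly confident the VM will grant;
    // Integer.MAX_VALUE - 8 in the JDK sources.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Grow by the larger of minGrowth/prefGrowth when that fits under the
    // soft limit; otherwise fall back to hugeLength.
    static int newLength(int oldLength, int minGrowth, int prefGrowth) {
        int prefLength = oldLength + Math.max(minGrowth, prefGrowth);
        if (0 < prefLength && prefLength <= MAX_ARRAY_LENGTH)
            return prefLength;
        return hugeLength(oldLength, minGrowth);
    }

    // Satisfy minGrowth if at all possible, clamping the preferred growth.
    static int hugeLength(int oldLength, int minGrowth) {
        int minLength = oldLength + minGrowth;
        if (minLength < 0)   // overflow: required length cannot fit in an int
            throw new OutOfMemoryError("Required array length too large");
        return (minLength <= MAX_ARRAY_LENGTH) ? MAX_ARRAY_LENGTH : minLength;
    }
}
```

Only the length computation is sketched here; the caller still does the actual
Arrays.copyOf and takes its chances with the heap.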
One issue is integer wraparound (overflow). This is the primary value that
ArraysSupport.newLength provides. It would be good to centralize these
computations instead of having them be spread all over.
Another issue is the one that MAX_ARRAY_LENGTH (also called MAX_ARRAY_SIZE) is
trying to address. This is sort of a misnomer. It's not the actual maximum
array size (which in fact isn't known to the library). It's actually the
maximum array size that the library is fairly confident the VM can provide,
assuming that enough memory is actually available. What the heck does that mean?
The theoretical maximum array size is Integer.MAX_VALUE, since the JLS and
JVMS don't allow anything larger. However, actual VM implementations will
refuse to allocate an array above a certain amount slightly smaller than that,
even if there is enough memory available. In practice, I believe the values
for current Hotspot are Integer.MAX_VALUE-3 or Integer.MAX_VALUE-2, depending
on whether compressed OOPS are in use.
Why is this significant? Consider the following case, where the capacity of
something is currently Integer.MAX_VALUE-100, and it's filled with elements.
The application performs some operation that requires 50 elements (minGrowth)
be added. A new array could certainly be allocated with size
Integer.MAX_VALUE-50, but typical growth policies for these kinds of
containers want to increase the current capacity by 1.5x or 2x (prefGrowth).
Doing this multiplication would exceed Integer.MAX_VALUE, so that won't work.
Clearly, we need to clamp the capacity somewhere.
We don't want to clamp the capacity at Integer.MAX_VALUE, because this
allocation would fail on every JVM I'm aware of, even if enough memory is
available. So we don't do that. Instead, we clamp the preferred growth at some
fairly arbitrary value smaller than Integer.MAX_VALUE, which is here called
MAX_ARRAY_LENGTH, and increase the capacity to that instead. This allows the
container's requested allocation to proceed: it satisfies minGrowth, but it
doesn't satisfy prefGrowth. Instead, it returns a capacity value that's
reasonably likely to succeed, given an unknown JVM implementation limit.
Recall that the container now has Integer.MAX_VALUE-50 elements and the
capacity is MAX_ARRAY_SIZE, which is currently defined somewhat arbitrarily at
Integer.MAX_VALUE-8. Suppose the application now wants to add 43 elements.
What should happen?
We could say, this exceeds MAX_ARRAY_LENGTH, so refuse the request and throw
OOME. This is reasonable and consistent in some sense, but perhaps not in
another. Suppose there is sufficient memory, and the JVM does allow arrays of
Integer.MAX_VALUE-7 to be created. Shouldn't we even try?
That's what hugeLength() does -- it returns a length that lets the allocation
be attempted beyond the preferred-growth clamp, leaving it up to the JVM to
grant or refuse the request based on its own implementation limits.
Anyway, this is all quite subtle, and maybe the comments in ArraysSupport don't
describe it adequately. The code that implements this kind of policy has been
copied to various locations around the JDK; each copy uses somewhat different
terminology and might have slightly different bugs, but they're all essentially
trying to implement this policy.
**
Several questions could be asked:
1) Is this the right policy for implementing growable arrays?
2) In cases where a class needs a growable array, can and should it be
refactored to use ArraysSupport.newLength()?
3) Does ArraysSupport.newLength() need to be modified to accommodate needs of
additional call sites?
4) We might want to consider refactoring PriorityBlockingQueue and ArrayDeque
to use ArraysSupport.newLength, in order to provide a consistent policy for
collections. Other growable-array-based collections (Vector, ArrayList,
PriorityQueue) do already.
s'marks
On 6/1/20 4:47 AM, Jim Laskey wrote:
Thanks David, will run with that.
On May 31, 2020, at 8:34 PM, David Holmes <david.hol...@oracle.com
<mailto:david.hol...@oracle.com>> wrote:
On 31/05/2020 12:29 am, Jim Laskey wrote:
I'm working through https://bugs.openjdk.java.net/browse/JDK-8230744
<https://bugs.openjdk.java.net/browse/JDK-8230744>: Several classes throw
OutOfMemoryError without a message.
I'm wondering why hugeCapacity in
src/jdk.zipfs/share/classes/jdk/nio/zipfs/ByteArrayChannel.java is defined as
/**
 * The maximum size of array to allocate.
 * Some VMs reserve some header words in an array.
 * Attempts to allocate larger arrays may result in
 * OutOfMemoryError: Requested array size exceeds VM limit
 */
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = buf.length;
    int newCapacity = oldCapacity << 1;
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    buf = Arrays.copyOf(buf, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
Not sure how we could have minCapacity < 0 at this point. It should have
been checked before the call to grow, and grow will not make it negative.
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
That's a bug plain and simple. It should never report a size > MAX_ARRAY_SIZE.
}
It just seems that it's pushing the inevitable off to Arrays.copyOf.
Shouldn't it be:
private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0 || minCapacity > MAX_ARRAY_SIZE) {
        throw new OutOfMemoryError(
            "ByteArrayChannel exceeds maximum size: " + MAX_ARRAY_SIZE);
    }
    return MAX_ARRAY_SIZE;
}
That seems more appropriate to me - modulo the question mark over
minCapacity being negative.
Real question: is there some hidden purpose behind this kind of logic?
The basic strategy is to double the current capacity unless that will
trigger an unnecessary exception, in which case just use the requested
capacity, but again watch for the implementation limits.
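Putting that strategy and the fix together (a sketch that computes only the new
length, with hypothetical names; the allocation itself stays in Arrays.copyOf):

```java
public class GrowPolicy {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Corrected hugeCapacity: never reports a size above MAX_ARRAY_SIZE.
    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0 || minCapacity > MAX_ARRAY_SIZE)
            throw new OutOfMemoryError(
                "ByteArrayChannel exceeds maximum size: " + MAX_ARRAY_SIZE);
        return MAX_ARRAY_SIZE;
    }

    // Double the capacity; if doubling is not enough, use the requested
    // capacity; clamp at the implementation limit in either case.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity << 1;
        if (newCapacity - minCapacity < 0)       // overflow-conscious compare
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        return newCapacity;
    }
}
```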
Cheers,
David
-----
Cheers,
-- Jim