# JVM Collections Optimizations: Eliminating toArray() Performance Bottlenecks

## Summary

This PR addresses performance bottlenecks in ArrayList.addAll() and 
Collections.SingletonSet.toArray() methods by implementing direct optimizations 
that bypass inefficient intermediate allocations and abstract implementations. 
The optimizations target high-frequency operations identified through profiling 
analysis, delivering 37% performance improvements for ArrayList operations and 
17-43% performance improvements for SingletonSet operations under real-world 
conditions where multiple collection types are used.

## Problem Context

### ArrayList.addAll() Inefficiency
ArrayList.addAll() currently calls `c.toArray()` on the source collection to 
avoid iterator-based copying, but this creates unnecessary intermediate array 
allocation when the source is also an ArrayList. The method performs:

1. Call `c.toArray()` - creates intermediate array
2. Call `System.arraycopy()` to copy from intermediate array to destination
3. Discard intermediate array

When both source and destination are ArrayList instances, this can be optimized 
to direct array copying.

### Collections.SingletonSet toArray() Missing Implementation
Collections.SingletonSet inherits the default `AbstractCollection.toArray()` 
implementation, which:

1. Creates an ArrayList internally
2. Iterates through the collection (1 element)
3. Converts ArrayList to array
4. Returns the array

For a single-element collection, this overhead is disproportionate to the 
actual work needed. Additionally, this implementation is vulnerable to call 
site poisoning, showing 74-118% performance degradation under megamorphic 
conditions.

## Optimized Methods

### ArrayList
- **`addAll(Collection<? extends E> c)`**: Added fast path for 
ArrayList-to-ArrayList copying using direct `System.arraycopy()` from source's 
internal `elementData` array, eliminating intermediate `toArray()` allocation

### Collections.SingletonSet
- **`toArray()`**: Direct implementation returning `new Object[] {element}`
- **`toArray(T[] a)`**: Direct implementation with proper array sizing and null 
termination per Collection contract

## Performance Impact

| Class | Method | Size | Baseline | Optimized | Improvement |
|-------|--------|------|----------|-----------|-------------|
| ArrayList | addAll | 0 | 10.149 ns/op, 40 B/op | 3.929 ns/op, 24 B/op | **61% 
faster, 40% less allocation** |
| ArrayList | addAll | 5 | 23.804 ns/op, 104 B/op | 14.170 ns/op, 64 B/op | 
**40% faster, 38% less allocation** |
| ArrayList | addAll | 75 | 65.440 ns/op, 664 B/op | 42.187 ns/op, 344 B/op | 
**36% faster, 48% less allocation** |
| LinkedList | addAll | 0 | 6.951 ns/op, 40 B/op | 7.001 ns/op, 40 B/op | **No 
change (control)** |
| LinkedList | addAll | 5 | 26.019 ns/op, 104 B/op | 25.401 ns/op, 104 B/op | 
**No change (control)** |
| LinkedList | addAll | 75 | 199.813 ns/op, 664 B/op | 204.001 ns/op, 664 B/op 
| **No change (control)** |
| SingletonSet | addAll (unpoisoned) | 1 | 18.593 ns/op, 72 B/op | 21.418 
ns/op, 72 B/op | **15% slower** |
| SingletonSet | addAll (poisoned) | 1 | 48.798 ns/op, 96 B/op | 25.029 ns/op, 
72 B/op | **49% faster, 25% less allocation** |

## Implementation Details

### ArrayList Fast Path
Directly accesses `src.elementData` to eliminate the intermediate `toArray()` 
allocation.

### SingletonSet Optimizations
Both methods provide direct array creation without intermediate collections, 
eliminating call site poisoning vulnerability.

## FAQ

**Q: Why use exact class check instead of instanceof for ArrayList?**
A: This prevents the fast path from becoming megamorphic if ArrayList 
subclasses are common, ensuring the optimization remains effective.

**Q: Are these optimizations safe?**
A: Yes. ArrayList-to-ArrayList copying maintains identical behavior, and 
SingletonSet.toArray() implementations follow the Collection contract exactly.

**Q: What about other Collection types?**
A: Profiling data shows these are the biggest opportunities. Others exist but 
are lower-priority.

**Q: How do these optimizations interact with existing code?**
A: These are internal implementation optimizations that maintain full API 
compatibility. No external code changes are required.

## Benchmarks

JMH benchmarks are included to validate the optimizations:
- `ArrayListBulkOpsBenchmark.addAllArrayList()`: Tests ArrayList-to-ArrayList 
fast path
- `ArrayListBulkOpsBenchmark.addAllSingletonSet()`: Tests 
SingletonSet.toArray() optimization
- `ArrayListBulkOpsBenchmark.addAllLinkedList()`: Control case using existing 
implementation

-------------

Commit messages:
 - 8371164 Optimizing ArrayList.addAll()

Changes: https://git.openjdk.org/jdk/pull/28116/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28116&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8371164
  Stats: 208 lines in 3 files changed: 207 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/28116.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28116/head:pull/28116

PR: https://git.openjdk.org/jdk/pull/28116

Reply via email to