On Wed, 22 Jun 2022 14:24:05 GMT, Daniel Jeliński <djelin...@openjdk.org> wrote:
> This PR improves the performance of deduplication done by
> ResourceBundleGenerator.
>
> The original implementation compared every pair of values, requiring O(n^2)
> time. The new implementation uses a HashMap to find duplicates, trading off
> some extra memory consumption for O(n) computational complexity. In practice
> the time to generate the jdk.localedata files on my Linux VM dropped from
> 14 to 8 seconds.
>
> The resulting files (under build/support/gensrc/java.base and jdk.localedata)
> have different contents; map iteration order depends on the insertion order,
> and the insertion order of the new implementation is different from the
> original.
> The files generated before and after this change have the same size.

make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java line 146:

> 144:         // generic reduction of duplicated values
> 145:         Map<String, Object> newMap = new HashMap<>(map);
> 146:         Map<BundleEntryValue, BundleEntryValue> dedup = new HashMap<>(map.size());

A LinkedHashMap could be used here to retain the iteration order, or a TreeMap if some deterministic order is desirable.

make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java line 157:

> 155:                 fmt = new Formatter();
> 156:             }
> 157:             String metaVal = oldEntry.metaKey();

The new instanceof pattern matching could be used to avoid the cast below.

make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java line 270:

> 268:             if (value instanceof String s) {
> 269:                 return s.equals(entry.value);
> 270:             } else if (!(entry.value instanceof String[])) {

This could be rewritten to use an instanceof pattern and save a cast.

-------------

PR: https://git.openjdk.org/jdk/pull/9243
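
To illustrate the two suggestions above, here is a minimal, self-contained sketch. It is not the code from ResourceBundleGenerator: the class DedupSketch, the record Value and the method dedup are hypothetical names made up for this example, and the sketch only shares one canonical instance among equal values instead of emitting meta keys. It assumes Java 16 or later (records and instanceof patterns).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; names and structure are assumptions, not the PR's code.
final class DedupSketch {

    // Hypothetical value wrapper with content-based equality for String[].
    record Value(Object value) {
        @Override
        public boolean equals(Object o) {
            // The pattern binds 'other', so no explicit cast is needed.
            if (!(o instanceof Value other)) {
                return false;
            }
            if (value instanceof String[] a && other.value instanceof String[] b) {
                return Arrays.equals(a, b);
            }
            return Objects.equals(value, other.value);
        }

        @Override
        public int hashCode() {
            return value instanceof String[] a ? Arrays.hashCode(a) : Objects.hashCode(value);
        }
    }

    // Single pass over the entries: expected O(n) with a hash-based map.
    // LinkedHashMap keeps insertion order, so the result iterates predictably.
    static Map<String, Value> dedup(Map<String, Value> map) {
        Map<Value, Value> canonical = new LinkedHashMap<>();
        Map<String, Value> result = new LinkedHashMap<>();
        for (Map.Entry<String, Value> e : map.entrySet()) {
            // putIfAbsent returns the earlier equal value, or null if this is the first.
            Value first = canonical.putIfAbsent(e.getValue(), e.getValue());
            result.put(e.getKey(), first != null ? first : e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Value> in = new LinkedHashMap<>();
        in.put("a", new Value(new String[] {"x", "y"}));
        in.put("b", new Value("z"));
        in.put("c", new Value(new String[] {"x", "y"}));
        Map<String, Value> out = dedup(in);
        System.out.println(out.get("a") == out.get("c")); // prints "true"
        System.out.println(List.copyOf(out.keySet()));    // prints "[a, b, c]"
    }
}
```

With a plain HashMap in place of the two LinkedHashMap instances, the deduplication would still work, but the iteration order of the result would no longer follow the insertion order.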