Re: RFR: 8288979: Improve CLDRConverter run time [v3]

2022-06-23 Thread Naoto Sato
On Thu, 23 Jun 2022 08:53:37 GMT, Daniel Jeliński wrote: >> This PR improves the performance of deduplication done by >> ResourceBundleGenerator. >> >> The original implementation compared every pair of values, requiring O(n^2) >> time. The new implementation uses a HashMap to find duplicates,

Re: RFR: 8288979: Improve CLDRConverter run time [v3]

2022-06-23 Thread Roger Riggs
On Thu, 23 Jun 2022 08:53:37 GMT, Daniel Jeliński wrote: >> This PR improves the performance of deduplication done by >> ResourceBundleGenerator. >> >> The original implementation compared every pair of values, requiring O(n^2) >> time. The new implementation uses a HashMap to find duplicates,

Re: RFR: 8288979: Improve CLDRConverter run time [v3]

2022-06-23 Thread Daniel Jeliński
On Thu, 23 Jun 2022 08:56:08 GMT, Daniel Jeliński wrote: >> BTW, this can utilize the new `HashMap.newHashMap()`, although I don't think >> resizing would be occurring in this case. > >> once this fix makes it to the repository, the build will be reproducible > > Yes, we always produce the same

Re: RFR: 8288979: Improve CLDRConverter run time [v3]

2022-06-23 Thread Daniel Jeliński
On Wed, 22 Jun 2022 21:45:25 GMT, Naoto Sato wrote: >> IIUC, once this fix makes it to the repository, the build will be >> reproducible. Making it to be sorted is a welcome enhancement (I compare the >> generated bundles manually from time to time), but it may be costly so it >> could defy th

Re: RFR: 8288979: Improve CLDRConverter run time [v3]

2022-06-23 Thread Daniel Jeliński
> This PR improves the performance of deduplication done by > ResourceBundleGenerator. > > The original implementation compared every pair of values, requiring O(n^2) > time. The new implementation uses a HashMap to find duplicates, trading off > some extra memory consumption for O(n) computati

Re: RFR: 8288979: Improve CLDRConverter run time [v2]

2022-06-22 Thread Naoto Sato
On Wed, 22 Jun 2022 17:57:44 GMT, Naoto Sato wrote: >> A stable order is useful when comparing between builds (by a human). >> It also supports the goal of reproducible builds. >> @naotoj What do you think? > > IIUC, once this fix makes it to the repository, the build will be > reproducible. M

Re: RFR: 8288979: Improve CLDRConverter run time [v2]

2022-06-22 Thread Naoto Sato
On Wed, 22 Jun 2022 17:30:48 GMT, Daniel Jeliński wrote: >> This PR improves the performance of deduplication done by >> ResourceBundleGenerator. >> >> The original implementation compared every pair of values, requiring O(n^2) >> time. The new implementation uses a HashMap to find duplicates,

Re: RFR: 8288979: Improve CLDRConverter run time [v2]

2022-06-22 Thread Naoto Sato
On Wed, 22 Jun 2022 17:27:11 GMT, Roger Riggs wrote: >> True. Which raises the question: do we need any arbitrary order? The >> original code used a hashmap too. It preserved the original order only when >> no duplicates were detected. > > A stable order is useful when comparing between builds

Re: RFR: 8288979: Improve CLDRConverter run time [v2]

2022-06-22 Thread Roger Riggs
On Wed, 22 Jun 2022 17:07:11 GMT, Daniel Jeliński wrote: >> make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java >> line 146: >> >>> 144: // generic reduction of duplicated values >>> 145: Map newMap = new HashMap<>(map); >>> 146: Map d

Re: RFR: 8288979: Improve CLDRConverter run time [v2]

2022-06-22 Thread Daniel Jeliński
> This PR improves the performance of deduplication done by > ResourceBundleGenerator. > > The original implementation compared every pair of values, requiring O(n^2) > time. The new implementation uses a HashMap to find duplicates, trading off > some extra memory consumption for O(n) computati

Re: RFR: 8288979: Improve CLDRConverter run time

2022-06-22 Thread Daniel Jeliński
On Wed, 22 Jun 2022 16:11:33 GMT, Roger Riggs wrote: >> This PR improves the performance of deduplication done by >> ResourceBundleGenerator. >> >> The original implementation compared every pair of values, requiring O(n^2) >> time. The new implementation uses a HashMap to find duplicates, tra

Re: RFR: 8288979: Improve CLDRConverter run time

2022-06-22 Thread Roger Riggs
On Wed, 22 Jun 2022 14:24:05 GMT, Daniel Jeliński wrote: > This PR improves the performance of deduplication done by > ResourceBundleGenerator. > > The original implementation compared every pair of values, requiring O(n^2) > time. The new implementation uses a HashMap to find duplicates, trad

RFR: 8288979: Improve CLDRConverter run time

2022-06-22 Thread Daniel Jeliński
This PR improves the performance of deduplication done by ResourceBundleGenerator. The original implementation compared every pair of values, requiring O(n^2) time. The new implementation uses a HashMap to find duplicates, trading off some extra memory consumption for O(n) computational complex
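
For readers skimming the thread, the change described in the PR text (replacing pairwise comparison of values with a HashMap lookup) can be sketched roughly as below. This is only an illustration, not the code from ResourceBundleGenerator: the class name `DedupSketch`, the method names, and the `#alias:` marker are all hypothetical, and the sketch assumes values with value-based `equals`/`hashCode` (plain strings here). A `LinkedHashMap` is used for the result so that output order follows input order, which touches on the stable-order question raised in the review.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class DedupSketch {
    // Hypothetical marker meaning "same value as this other key".
    static final String ALIAS_PREFIX = "#alias:";

    // Pairwise version: each entry is compared with all earlier entries, O(n^2).
    static Map<String, Object> dedupQuadratic(Map<String, Object> map) {
        List<Map.Entry<String, Object>> entries = new ArrayList<>(map.entrySet());
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < entries.size(); i++) {
            Map.Entry<String, Object> cur = entries.get(i);
            Object out = cur.getValue();
            for (int j = 0; j < i; j++) {
                if (Objects.equals(entries.get(j).getValue(), cur.getValue())) {
                    out = ALIAS_PREFIX + entries.get(j).getKey();
                    break;
                }
            }
            result.put(cur.getKey(), out);
        }
        return result;
    }

    // Hash-based version: one pass, expected O(n) time, at the cost of an
    // extra map from each value to the first key that carried it.
    static Map<String, Object> dedupHashed(Map<String, Object> map) {
        Map<Object, String> firstKeyByValue = new HashMap<>();
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : map.entrySet()) {
            // putIfAbsent returns null when this value has not been seen yet.
            String firstKey = firstKeyByValue.putIfAbsent(e.getValue(), e.getKey());
            result.put(e.getKey(), firstKey == null ? e.getValue() : ALIAS_PREFIX + firstKey);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("a", "x");
        input.put("b", "y");
        input.put("c", "x"); // duplicate of the value under "a"
        System.out.println(dedupHashed(input));
    }
}
```

Both methods should yield the same map for the same input; the hash-based one simply avoids the inner loop. If the real values are arrays, they would need wrapping (for example in a `List`) before being used as map keys, since Java arrays compare by identity.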