On Tue, 27 Apr 2021 22:47:06 GMT, Coleen Phillimore <cole...@openjdk.org> wrote:

>> Please review this change to the String Deduplication facility.  It is being
>> changed to use OopStorage to hold weak references to relevant objects,
>> rather than bespoke data structures requiring dedicated processing phases by
>> supporting GCs.
>> 
>> (The Shenandoah update was contributed by Zhengyu Gu.)
>> 
>> This change significantly simplifies the interface between a given GC and
>> the String Deduplication facility, which should make it much easier for
>> other GCs to opt in.  However, this change does not alter the set of GCs
>> providing support; currently only G1 and Shenandoah support string
>> deduplication.  Adding support by other GCs will be in followup RFEs.
>> 
>> Reviewing via the diffs might not be all that useful for some parts, as
>> several files have been essentially completely replaced, and a number of
>> files have been added or eliminated.  The full webrev might be a better
>> place to look.
>> 
>> The major changes are in gc/shared/stringdedup/* and in the supporting
>> collectors, but there are also some smaller changes in other places, most
>> notably classfile/{stringTable,javaClasses}.
>> 
>> This change is additionally labeled for review by core-libs, although there
>> are no source changes there.  This change injects a byte field of bits into
>> java.lang.String, using one of the previously unused padding bytes in that
>> class.  (There were two unused bytes, now only one.)
>> 
>> Testing:
>> mach5 tier1-2 with and without -XX:+UseStringDeduplication
>> 
>> Locally (linux-x64) ran all of the existing tests that use string
>> deduplication with both G1 and Shenandoah.  Note that
>> TestStringDeduplicationInterned.java is disabled for shenandoah, as it
>> currently fails, for reasons I haven't figured out but suspect are test
>> related.
>> 
>> - gc/stringdedup/   -- these used to be in gc/g1
>> - runtime/cds/SharedStringsDedup.java
>> - runtime/cds/appcds/cacheObject/DifferentHeapSizes.java
>> - runtime/cds/appcds/sharedStrings/*
>> 
>> shenandoah-only:
>> - gc/shenandoah/jvmti/TestHeapDump.java
>> - gc/shenandoah/TestStringDedup.java
>> - gc/shenandoah/TestStringDedupStress.java
>> 
>> Performance tested baseline, baseline + stringdedup, new with stringdedup,
>> finding no significant differences.
>
> src/hotspot/share/classfile/stringTable.cpp line 784:
> 
>> 782:   SharedStringIterator iter(f);
>> 783:   _shared_table.iterate(&iter);
>> 784: }
> 
> So with this change (somewhere below) do you no longer deduplicate strings 
> from the shared string table? ie
> 
> // The CDS archive does not include the string deduplication table. Only the 
> string
> // table is saved in the archive. The shared strings from CDS archive need to 
> be
> // added to the string deduplication table before deduplication occurs. That 
> is
> // done in the beginning of the StringDedupThread (see 
> StringDedupThread::do_deduplication()).
> void StringDedupThread::deduplicate_shared_strings(StringDedupStat* stat) {
>   StringDedupSharedClosure sharedStringDedup(stat);
>   StringTable::shared_oops_do(&sharedStringDedup);
> }
> Asking @iklam to have a look then.

If I understand correctly, we no longer need to add the entire set of shared 
strings into the dedup table.


+// As a space optimization, when shared StringTable entries exist the shared
+// part of the StringTable is also used as a source for byte arrays.  This
+// permits deduplication of strings against those shared entries without
+// recording them in this table too.
+class StringDedup::Table : AllStatic {

...

+void StringDedup::Table::deduplicate(oop java_string) {
+  assert(java_lang_String::is_instance(java_string), "precondition");
+  _cur_stat.inc_inspected();
+  if ((StringTable::shared_entry_count() > 0) &&
+      try_deduplicate_shared(java_string)) {
+    return;                     // Done if deduplicated against shared 
StringTable.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3662

Reply via email to