Re: a quick question about String
I don't recommend obtaining new String objects by calling the constructor
unless you have specific requirements, such as decoding a byte array with a
charset, which is the most frequent use case:

    new String(Files.readAllBytes(path), StandardCharsets.UTF_8)

String literals in Java source files, like "abc", are compiled to ldc (load
constant) instructions (as opposed to the new instruction followed by a
constructor call, which is what new String() compiles to). ldc provides an
already interned String object, meaning all "abc" references in Java code
yield the same String object.

You can safely use == if you know both String objects you compare are
interned. For example, method names from Method::getName are interned as
well, so InvocationHandler implementations in the JDK, such as the one in
MethodHandleProxies, just compare them directly (significantly faster than
equals when the strings don't match), like

    (Object)name == "getWrapperInstanceTarget"

This demo proves the constant identity of "load constant" String objects. In
practice, synchronizing on arbitrary String objects is definitely a bad idea,
especially in libraries, just like replacing System.out at random, as it may
cause conflicts.

> Are you saying that new would always create a new object but the GC might
> merge multiple instances of String into a single instance?

About the JVM's string optimizations: the JVM may make different String
objects with the same content share a backing array in order to reduce memory
use (which is what the original poster, Alan, was wondering about). This
optimization does not alter the identity of the String objects involved and
thus has no visible effect for the user.

Another point about (the slowness of) interning: the intern table is a hash
table, but unlike JDK HashMaps, which switch a bucket to a red-black tree once
it gets too large, the intern table uses linked-list buckets and can be quite
inefficient when there are too many strings.
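A minimal sketch of the identity rules described above. The class and method names (`StringIdentityDemo`, `decodeUtf8`, `literal`, `fresh`) are invented for this example and do not come from the JDK; it only illustrates that a literal, a constructed String, and an interned copy are all equal in content, while only the literal and the interned copy share identity.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; all names here are illustrative only.
class StringIdentityDemo {

    // Decoding bytes with an explicit charset: a legitimate use of a String constructor.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // A literal is loaded with ldc and refers to the interned instance.
    static String literal() {
        return "abc";
    }

    // new String(...) is guaranteed to create a distinct object.
    static String fresh() {
        return new String(new char[] {'a', 'b', 'c'});
    }

    public static void main(String[] args) {
        String a = literal();
        String b = fresh();
        System.out.println(a.equals(b));      // true: same content
        System.out.println(a == b);           // false: different identity
        System.out.println(a == b.intern());  // true: intern() returns the canonical instance
        System.out.println(decodeUtf8("abc".getBytes(StandardCharsets.UTF_8)).equals(a)); // true
    }
}
```

The `==` comparisons are deliberate here; in ordinary code `equals` remains the safe default unless both operands are known to be interned.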
On Fri, Dec 24, 2021 at 10:39 AM Xeno Amess wrote:
>
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
>
> class SyncDemo1 {
>
>     static volatile int count;
>
>     public static void add() {
>         // Locks on the interned literal "abc": every call shares one monitor.
>         synchronized (Demo.getString1()) {
>             System.out.println("count1 : " + (count++));
>         }
>     }
>
> }
>
> class SyncDemo2 {
>
>     static volatile int count;
>
>     public static void add() {
>         // Locks on a fresh String each call: no mutual exclusion at all.
>         synchronized (Demo.getString2()) {
>             System.out.println("count2 : " + (count++));
>         }
>     }
>
> }
>
> public class Demo {
>
>     public static String getString1() {
>         String str = "abc";
>         return str;
>     }
>
>     public static String getString2() {
>         char[] data = {'a', 'b', 'c'};
>         String str = new String(data);
>         return str;
>     }
>
>     public static void main(String[] args) throws InterruptedException {
>         System.out.println("test1");
>         ExecutorService executorService1 = Executors.newFixedThreadPool(20);
>         for (int i = 0; i < 1000; i++) {
>             executorService1.submit(SyncDemo1::add);
>         }
>         // Let the JVM exit once the submitted tasks have run.
>         executorService1.shutdown();
>
>         ExecutorService executorService2 = Executors.newFixedThreadPool(20);
>         for (int i = 0; i < 1000; i++) {
>             executorService2.submit(SyncDemo2::add);
>         }
>         executorService2.shutdown();
>     }
>
> }
>
> Run the code I sent you on a multi-core machine.
> count1 stays sequential, but count2 does not.
> That is what I said several hours ago: it never should, as any Object can
> be used as a lock. And String is a kind of Object.
>
> On Sat, Dec 25, 2021 at 00:29, Brian Goetz wrote:
>
> > As the language currently stands, new means new; it is guaranteed to
> > create a new identity. (When Valhalla comes along, instances of certain
> > classes will be identity-free, so the meaning of new will change somewhat,
> > but String seems likely to stay as it is.)
> >
> > The language is allowed (in some cases required) to intern string
> > _literals_, so the expression
> >
> >     "foo" == "foo"
> >
> > can be true. That’s OK because no one said “new”.
> >
> > In your example, the two instances would not be ==, but would be .equals.
Re: a quick question about String
Thanks. That makes sense.

Speaking of Valhalla, how is that coming along? Should I start reading about
it now, or would it be better to wait?

  Alan

> On Dec 24, 2021, at 8:29 AM, Brian Goetz wrote:
>
> As the language currently stands, new means new; it is guaranteed to create a
> new identity. (When Valhalla comes along, instances of certain classes will
> be identity-free, so the meaning of new will change somewhat, but String
> seems likely to stay as it is.)
>
> The language is allowed (in some cases required) to intern string _literals_,
> so the expression
>
>     "foo" == "foo"
>
> can be true. That’s OK because no one said “new”.
>
> In your example, the two instances would not be ==, but would be .equals.
> But since “equivalent” is not a precise term, we can have an angels-on-pins
> debate about whether they are indeed equivalent. IMO equivalent here means
> “for practical purposes”; locking on an arbitrary string which you did not
> allocate is a silly thing to do, so it is reasonable that the doc opted for a
> common-sense interpretation of equivalent, rather than a more precise one.
>
>> On Dec 24, 2021, at 11:12 AM, Alan Snyder wrote:
>>
>> Just when I thought the answer was simple, now it seems more complex.
>>
>> Are you saying that new would always create a new object but the GC might
>> merge multiple instances of String into a single instance?
>>
>> Also, if new String() always creates a new instance, then it seems that this
>> statement (from String.java) is not quite right:
>>
>> Because String objects are immutable they can be shared. For example:
>>
>>     String str = "abc";
>>
>> is equivalent to:
>>
>>     char data[] = {'a', 'b', 'c'};
>>     String str = new String(data);
>>
>>> On Dec 23, 2021, at 2:55 PM, Simon Roberts wrote:
>>>
>>> I think there are two things at stake here, one is that as I understand it,
>>> "new means new", in every case.
Re: a quick question about String
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SyncDemo1 {

    static volatile int count;

    public static void add() {
        // Locks on the interned literal "abc": every call shares one monitor.
        synchronized (Demo.getString1()) {
            System.out.println("count1 : " + (count++));
        }
    }

}

class SyncDemo2 {

    static volatile int count;

    public static void add() {
        // Locks on a fresh String each call: no mutual exclusion at all.
        synchronized (Demo.getString2()) {
            System.out.println("count2 : " + (count++));
        }
    }

}

public class Demo {

    public static String getString1() {
        String str = "abc";
        return str;
    }

    public static String getString2() {
        char[] data = {'a', 'b', 'c'};
        String str = new String(data);
        return str;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("test1");
        ExecutorService executorService1 = Executors.newFixedThreadPool(20);
        for (int i = 0; i < 1000; i++) {
            executorService1.submit(SyncDemo1::add);
        }
        // Let the JVM exit once the submitted tasks have run.
        executorService1.shutdown();

        ExecutorService executorService2 = Executors.newFixedThreadPool(20);
        for (int i = 0; i < 1000; i++) {
            executorService2.submit(SyncDemo2::add);
        }
        executorService2.shutdown();
    }

}

Run the code I sent you on a multi-core machine.
count1 stays sequential, but count2 does not.
That is what I said several hours ago: it never should, as any Object can be
used as a lock. And String is a kind of Object.

On Sat, Dec 25, 2021 at 00:29, Brian Goetz wrote:

> As the language currently stands, new means new; it is guaranteed to
> create a new identity. (When Valhalla comes along, instances of certain
> classes will be identity-free, so the meaning of new will change somewhat,
> but String seems likely to stay as it is.)
>
> The language is allowed (in some cases required) to intern string
> _literals_, so the expression
>
>     "foo" == "foo"
>
> can be true. That’s OK because no one said “new”.
>
> In your example, the two instances would not be ==, but would be .equals.
> But since “equivalent” is not a precise term, we can have an angels-on-pins
> debate about whether they are indeed equivalent.
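One common way to sidestep the hazard the demo exposes is to synchronize on a private object that no other code can reach, rather than on a String. The following is only a sketch under that assumption; `SafeCounter` and its members are hypothetical names and are not part of the demo in this thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical class for illustration; not part of the original demo.
class SafeCounter {

    // Private lock: no other code can obtain this reference, unlike the literal "abc".
    private final Object lock = new Object();
    private int count;

    void add() {
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Runs `tasks` increments on a pool of `threads` threads and returns the final count.
    static int run(int threads, int tasks) {
        SafeCounter counter = new SafeCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.submit(counter::add);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(20, 1000)); // 1000: every increment is counted
    }
}
```

Because the lock object is private and freshly allocated, unrelated code that happens to synchronize on an equal or interned String cannot contend with it.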
Re: a quick question about String
As the language currently stands, new means new; it is guaranteed to create
a new identity. (When Valhalla comes along, instances of certain classes will
be identity-free, so the meaning of new will change somewhat, but String
seems likely to stay as it is.)

The language is allowed (in some cases required) to intern string _literals_,
so the expression

    "foo" == "foo"

can be true. That’s OK because no one said “new”.

In your example, the two instances would not be ==, but would be .equals.
But since “equivalent” is not a precise term, we can have an angels-on-pins
debate about whether they are indeed equivalent. IMO equivalent here means
“for practical purposes”; locking on an arbitrary string which you did not
allocate is a silly thing to do, so it is reasonable that the doc opted for a
common-sense interpretation of equivalent, rather than a more precise one.

> On Dec 24, 2021, at 11:12 AM, Alan Snyder wrote:
>
> Just when I thought the answer was simple, now it seems more complex.
>
> Are you saying that new would always create a new object but the GC might
> merge multiple instances of String into a single instance?
>
> Also, if new String() always creates a new instance, then it seems that this
> statement (from String.java) is not quite right:
>
> Because String objects are immutable they can be shared. For example:
>
>     String str = "abc";
>
> is equivalent to:
>
>     char data[] = {'a', 'b', 'c'};
>     String str = new String(data);
>
>> On Dec 23, 2021, at 2:55 PM, Simon Roberts wrote:
>>
>> I think there are two things at stake here, one is that as I understand it,
>> "new means new", in every case. This is at least partly why the
>> constructors on soon-to-be value objects are deprecated; they become
>> meaningless. The other is that if the presumption is that we should
>> always intern new Strings, I must disagree. Pooling takes time and memory
>> to manage, and if there are very few duplicates, it's a waste of both.
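The "in some cases required" part can be illustrated with compile-time constant expressions, which the Java Language Specification treats like literals, whereas concatenation with a non-constant operand creates a new object at run time. A small sketch, with the class name `LiteralInterning` and its methods chosen only for this example:

```java
// Hypothetical demo class; names are illustrative only.
class LiteralInterning {

    static final String FO = "fo"; // constant variable

    // "fo" + "o" and FO + "o" are constant expressions, so the compiler folds them into "foo".
    static boolean constantExpressionIsInterned() {
        return ("fo" + "o") == "foo" && (FO + "o") == "foo";
    }

    // Concatenation with a non-constant operand happens at run time and yields a new object.
    static boolean runtimeConcatIsInterned(String prefix) {
        return (prefix + "o") == "foo";
    }

    public static void main(String[] args) {
        System.out.println(constantExpressionIsInterned());  // true
        System.out.println(runtimeConcatIsInterned("fo"));    // false
        System.out.println(("fo".concat("o")).equals("foo")); // true: content still matches
    }
}
```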
Re: a quick question about String
Just when I thought the answer was simple, now it seems more complex.

Are you saying that new would always create a new object but the GC might
merge multiple instances of String into a single instance?

Also, if new String() always creates a new instance, then it seems that this
statement (from String.java) is not quite right:

Because String objects are immutable they can be shared. For example:

    String str = "abc";

is equivalent to:

    char data[] = {'a', 'b', 'c'};
    String str = new String(data);

> On Dec 23, 2021, at 2:55 PM, Simon Roberts wrote:
>
> I think there are two things at stake here, one is that as I understand it,
> "new means new", in every case. This is at least partly why the
> constructors on soon-to-be value objects are deprecated; they become
> meaningless. The other is that if the presumption is that we should
> always intern new Strings, I must disagree. Pooling takes time and memory
> to manage, and if there are very few duplicates, it's a waste of both. I
> believe it should be up to the programmer to decide if this is appropriate
> in their situation. Of course, the GC system seems to be capable of
> stepping in in some incarnations, which adds something of a counterexample,
> but that is, if I recall, configurable.
>
> On Thu, Dec 23, 2021 at 2:53 PM Xeno Amess wrote:
>
>> It never should, as any Object can be used as a lock.
>>
>> XenoAmess
>>
>> From: core-libs-dev on behalf of Bernd Eckenfels
>> Sent: Friday, December 24, 2021 5:51:55 AM
>> To: alan Snyder ; core-libs-dev <core-libs-dev@openjdk.java.net>
>> Subject: Re: a quick question about String
>>
>> new String() always creates a new instance.
>>
>> Regards,
>> Bernd
>> --
>> http://bernd.eckenfels.net
>>
>> From: core-libs-dev on behalf of Alan Snyder
>> Sent: Thursday, December 23, 2021 6:59:18 PM
>> To: core-libs-dev
>> Subject: a quick question about String
>>
>> Do the public constructors of String actually do what their documentation
>> says (allocate a new instance), or is there some kind of compiler magic
>> that might avoid allocation?
>
> --
> Simon Roberts
> (303) 249 3613
Re: RFR: 8272746: ZipFile can't open big file (NegativeArraySizeException) [v2]
On Thu, 23 Dec 2021 16:42:50 GMT, Alan Bateman wrote:

>> Masanori Yano has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   8272746: ZipFile can't open big file (NegativeArraySizeException)
>
> src/java.base/share/classes/java/util/zip/ZipFile.java line 1501:
>
>> 1499:             // read in the CEN and END
>> 1500:             if (end.cenlen + ENDHDR >= Integer.MAX_VALUE) {
>> 1501:                 zerror("invalid END header (too large central directory size)");
>
> This check looks correct. It might be a bit clearer to say that "central directory size too large" rather than "too large central directory size".
>
> The bug report says that JDK 8 and the native zip handle these zip files, were you able to check that?

I will correct the message as you pointed out.

I checked with JDK 8 and the Linux unzip command and found no exceptions or errors. Since JDK 9, the change for [JDK-8145260](https://bugs.openjdk.java.net/browse/JDK-8145260) has significantly changed the way zip files are handled, which is why the result differs from JDK 8.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6927
Re: RFR: 8272746: ZipFile can't open big file (NegativeArraySizeException) [v2]
> Could you please review the fix for JDK-8272746?
> Since the array index is of type int, an overflow occurs when the value of
> end.cenlen is too large because there are too many entries.
> Fixing the problem fundamentally would require reading the CEN from the file
> in parts, but that approach would need extensive changes and would degrade
> performance.
> In practical terms, the size of the central directory rarely grows that
> large. So, I fixed it to check the size of the central directory and throw an
> exception if it is too large.

Masanori Yano has updated the pull request incrementally with one additional commit since the last revision:

  8272746: ZipFile can't open big file (NegativeArraySizeException)

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6927/files
  - new: https://git.openjdk.java.net/jdk/pull/6927/files/a85ef0f5..c54c50eb

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6927&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6927&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6927.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6927/head:pull/6927

PR: https://git.openjdk.java.net/jdk/pull/6927
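The overflow and the guard described above can be sketched in isolation. This is not the actual ZipFile code: `CenSizeCheck`, `uncheckedBufferSize`, `checkedCenBufferSize`, and the sample sizes are assumptions made for this example, and `ENDHDR` is taken to be 22, the size of the END header in bytes.

```java
import java.util.zip.ZipException;

// Hypothetical stand-alone sketch; not the actual java.util.zip.ZipFile implementation.
class CenSizeCheck {

    static final int ENDHDR = 22; // END header size in bytes (assumed here)

    // What an unchecked narrowing cast does to a too-large size: it wraps around.
    static int uncheckedBufferSize(long cenlen) {
        return (int) (cenlen + ENDHDR);
    }

    // Guard in the spirit of the fix: reject sizes that do not fit in an int array length.
    static int checkedCenBufferSize(long cenlen) throws ZipException {
        if (cenlen + ENDHDR >= Integer.MAX_VALUE) {
            throw new ZipException("invalid END header (central directory size too large)");
        }
        return (int) (cenlen + ENDHDR);
    }

    public static void main(String[] args) throws ZipException {
        long huge = Integer.MAX_VALUE; // sample central directory size that is too large
        // A negative length like this is what leads to NegativeArraySizeException on allocation.
        System.out.println(uncheckedBufferSize(huge) < 0); // true
        System.out.println(checkedCenBufferSize(1000L));   // 1022
        try {
            checkedCenBufferSize(huge);
        } catch (ZipException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The sum is computed in `long` arithmetic before the comparison, so the check itself cannot overflow.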