Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 6/17/18 1:50 AM, Peter Levart wrote: It's a little strange that the generator function is used to construct an empty array (at least in the default method, but overrides will likely do the same if they want to avoid pre-zeroing overhead) in order to just extract the array's type from it. Someone could reasonably expect the provided function to be called with the exact length of needed array. The Collection.toArray(T[]) at least gives user an option to actually fully use the provided array, if user provides the correct length. This is actually what we're trying to avoid. The toArray(T[]) API has to deal with the cases where the array isn't the right length, and it reallocs or inserts an extra null in cases where it isn't. To avoid this, people do coll.toArray(new MyClass[coll.size()]) which turns out to be an anti-pattern. We could try to teach people to write coll.toArray(new MyClass[0]) instead, and this works, but it's quite non-obvious. ("Why do I need to create a zero-length array first?") (#include Tagir's comment about caching zero-length instances) Instead we want to direct people to write coll.toArray(MyClass[]::new) which creates an array of the right type and requested length. The argument about using (and re-using) a method so that a method reference can be passed to the method without any unchecked casts is equally true for any of the 3 alternatives shown - the method itself might need unchecked casts, but its usage not: static List[] array(int len) static Class> elementType() static Class[]> arrayType() In principle all of these are possible. I don't see these as equal, though. It's quite common to have to create arrays of a generic type, either inline with unchecked casts, or possibly refactored into a method. I very rarely see Class objects of generic types. But I can see that you want to align the API with Stream.toArray, while still providing the optimal implementation. It's just that the API doesn't fit the usecase. The function approach makes more sense in the Stream API where it is explicitly explained that it may be used to construct arrays needed to hold intermediary results of partitioned parallel execution too, but in Collection API it is usually the case to just provide a copy of the underlying data structure in the most optimal way (without pre-zeroing overhead) and for that purpose, 2nd and 3rd alternatives are a better fit. Sure, the Stream API has additional uses for holding intermediate results. That doesn't imply that Collection.toArray(generator) doesn't meet its use case (which I described above). I also don't see how class type tokens are a better fit. A type token is "natural" if you're thinking of implementing it in terms of Arrays.copyOf() -- which it is right now, but that's an implementation detail. Suppose Stream took a different approach and used the 2nd or 3rd approach (which is universal). Would then Collection API also get the same method? I'm not sure where this is headed. I'm pretty sure we considered using type tokens for Stream.toArray() and rejected them in favor of toArray(generator). If there had been some reason to use a type token instead, maybe we would have used them, in which case we'd consider modifying Collection.toArray() would take a type token as well. But I'm not aware of such a reason, so It might have been the case in the past when Java generics were introduced, that class literals like List.class would just confuse users, because most aspects of such type token would be erased and there were fears that enabling them might limit options for reified generics in the future. But long years have passed and java programmers have generally become acquainted with Java generics and erasure to the point where it doesn't confuse them any more, while reifying Java generics has been explored further in Valhalla to the point where it has become obvious that erasure of reference types is here to stay. Java could enable use of class literals like List.class without fear that such literals would make users write buggy code or that such uses would limit options for Java in the future. Quite the contrary, they would enable users to write type-safer code than what they can write today. In the light of future Java (valhalla specialized generics), where such class literals make even more sense, enabling generic class literals could be viewed as a stepping stone that has some purpose in the short term too. Again, I'm not sure where this is headed. I certainly would not propose changing the language to allow generic class literals in order to support the present use case (creating an array from a collection). As a longer term proposal, generic class literals might or might not be worthwhile on their own merits. It's certainly an irritant not having them; I could imagine some uses for them. But there are probably some downsides (such as unsoundness) that will need to be thought through. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
+1 from me. > On Jun 14, 2018, at 9:02 PM, Stuart Marks wrote: > > Hi all, > > I wanted to restart review of the changeset for bug JDK-8060192 [1]. The > latest webrev is here: [2]. > > The previous review was from December 2017: [3]. The only difference between > the current changeset and the previous one is the removal of the overriding > implementations (see explanation below). Much of the discussion in the > previous review thread was around performance and the shape of the API. > > # Performance > > I previously had done some benchmarking on this and reported the results: > [4]. (I've recently re-done the benchmarks and conclusions are essentially > the same.) > > The upshot is that implementations that use Arrays.copyOf() are the fastest, > probably because it's an intrinsic. It can avoid zero-filling the freshly > allocated array because it knows the entire array contents will be > overwritten. This is true regardless of what the public API looks like. The > default implementation calls generator.apply(0) to get a zero-length array > and then simply calls toArray(T[]). This goes through the Arrays.copyOf() > fast path, so it's essentially the same speed as toArray(new T[0]). > > The overrides I had previously provided in specific implementation classes > like ArrayList actually are slower, because the allocation of the array is > done separately from filling it. This necessitates the extra zero-filling > step. Thus, I've removed the overrides. > > # API Shape > > There was some discussion about whether it would be preferable to have a > Class parameter as a type token for the array component type or the array > type itself, instead of using an IntFunction generator. Most of this boils > down to what people are comfortable with. However, there are a few points > that make the generator function approach preferable. > > One point in favor of using a generator is that Stream already has a similar > toArray(generator) method. > > Comparing this to the type token alternatives, for simple tasks like > converting Collection to String[], things are about equal: > >toArray(String[]::new) >toArray(String.class) >toArray(String[].class) > > However, things are hairier if the element type of the collection is generic, > such as Collection>. The result type is a generic array > List[]. Using class literals or array constructor references will > necessitate using an unchecked cast, because none of these can be used to > represent a generic type. > > However, it's possible to use a method reference to provide a properly > generic IntFunction. This would enable the toArray(generator) method to be > used without any unchecked casts. (The method it refers to might need have an > unchecked cast within it, though.) Such a method could also be reused for the > corresponding Stream.toArray(generator) method. > > For these reasons I'd like to proceed with adding toArray(generator) API. > > Thanks, > > s'marks > > > [1] https://bugs.openjdk.java.net/browse/JDK-8060192 > > [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/ > > [3] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html > > [4] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html >
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Tagir Valeev wrote: If you are going to create long-living array which is likely to be empty, it's good to deduplicate them to spare the heap space. This can be easily done via existing toArray method like collection.toArray(MyClass.EMPTY_ARRAY) assuming we have an empty array constant. We use this pattern very often in IntelliJ IDEA source code (e. g. think of method which returns an array of Java member annotations; often there are none). New generator methods is much less convenient for this: toArray(s -> s == 0?MyClass.EMPTY_ARRAY:new MyClass[s]). I'm actually missing a method accepting an empty array instead of generator in the Stream API. And I don't think that new collection method is useful. The existing one is perfect. Rémi Forax wrote: we can expect the VM to not allocate the array if once inlined the code is new MyClass[0].getClass() if it's not the case, i think it's a better idea to tweak the VM rather than try to change an API based on a pattern that should not be necessary. I think the two of you are talking about different things. Tagir is concerned about *long-lived* zero-length arrays, whereas Rémi is talking about the possibility of short-circuiting the allocation of a zero-length array if it's replaced by a nonzero-length array and thus has an extremely short life. Tagir, if your use case is that you know you are creating lots of long-lived zero-length arrays, and you want to deduplicate them, then sure, using toArray(MyClass.EMPTY_ARRAY) is a fine thing to do. There are a bunch of assumptions here about the longevity and frequency of creation of such arrays. Having a shared empty array might indeed be the right thing for the IntelliJ IDEA code base, but that doesn't mean it's true in general. The toArray(generator) might be perfectly fine for many code bases even if it isn't suitable for the one you have in mind. You also mention the lack of a T[] Stream.toArray(T[]) method. This would seem to help the case of sharing a zero-length array, but Collection.toArray(T[]) also has odd semantics that we didn't want to replicate in streams. The point of that method is to *reuse* the argument array. The odd semantics are that, when the argument array is longer than necessary, null is added after the last written element. Instead of replicating the semantics of Collection.toArray(T[]) on Stream, we ended up with Stream.toArray(generator) instead. Now maybe the Stream.toArray() overloads aren't sufficient for what you want to do, in which case you might want to propose something. But that sounds like a different discussion. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Hi Rémi, On 6/15/18 12:26 AM, Remi Forax wrote: The overrides I had previously provided in specific implementation classes like ArrayList actually are slower, because the allocation of the array is done separately from filling it. This necessitates the extra zero-filling step. Thus, I've removed the overrides. for ArrayList, you may use a code like this, T[] toArray(IntFunction generator) { return (T[]) Arrays.copyOf(elementData, size, generator.apply(0).getClass()); } so you win only one comparison (yeah !), which can be perfectly predicted, so you should not see any perf difference :) True, this will probably work better than the previous code (allocate correct size, fill with System.arraycopy) but it doesn't seem likely to be any faster than the default method. List.class or List[].class do not work either. I think they can be made to work (see other sub-thread with Peter Levart) but I don't see any advantages going in that direction. For these reasons I'd like to proceed with adding toArray(generator) API. so thumb up for me ! Great! s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 06/17/18 10:50, Peter Levart wrote: The argument about using (and re-using) a method so that its method reference can be passed to the toArray method without any unchecked casts is equally true for any of the 3 alternatives shown - the method itself might need unchecked casts, but its usage not: static List[] array(int len) static Class> elementType() static Class[]> arrayType() For that purpose, I would rather get rid of the need to create a method at all. It might have been the case in the past when Java generics were introduced, that class literals like List.class would just confuse users, because most aspects of such type token would be erased and there were fears that enabling them might limit options for reified generics in the future. But long years have passed and java programmers have generally become acquainted with Java generics and erasure to the point where it doesn't confuse them any more, while reifying Java generics has been explored further in Valhalla to the point where it has become obvious that erasure of reference types is here to stay. Java could enable use of class literals like List.class without fear that such literals would make users write buggy code or that such uses would limit options for Java in the future. Quite the contrary, they would enable users to write type-safer code than what they can write today. In the light of future Java (valhalla specialized generics), where such class literals make even more sense, enabling generic class literals could be viewed as a stepping stone that has some purpose in the short term too. So what do you think? Regards, Peter
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Hi Tagir, - Mail original - > De: "Tagir Valeev" > À: "Stuart Marks" > Cc: "core-libs-dev" > Envoyé: Samedi 16 Juin 2018 06:01:35 > Objet: Re: RFR(s): 8060192: Add default method Collection.toArray(generator) > Hello! > > If you are going to create long-living array which is likely to be empty, > it's good to deduplicate them to spare the heap space. This can be easily > done via existing toArray method like > collection.toArray(MyClass.EMPTY_ARRAY) assuming we have an empty array > constant. We use this pattern very often in IntelliJ IDEA source code (e. > g. think of method which returns an array of Java member annotations; often > there are none). New generator methods is much less convenient for this: > toArray(s -> s == 0?MyClass.EMPTY_ARRAY:new MyClass[s]). I'm actually > missing a method accepting an empty array instead of generator in the > Stream API. And I don't think that new collection method is useful. The > existing one is perfect. > > With best regards, > Tagir Valeev we can expect the VM to not allocate the array if once inlined the code is new MyClass[0].getClass() if it's not the case, i think it's a better idea to tweak the VM rather than try to change an API based on a pattern that should not be necessary. Rémi > > пт, 15 июня 2018 г., 8:03 Stuart Marks : > >> Hi all, >> >> I wanted to restart review of the changeset for bug JDK-8060192 [1]. The >> latest >> webrev is here: [2]. >> >> The previous review was from December 2017: [3]. The only difference >> between the >> current changeset and the previous one is the removal of the overriding >> implementations (see explanation below). Much of the discussion in the >> previous >> review thread was around performance and the shape of the API. >> >> # Performance >> >> I previously had done some benchmarking on this and reported the results: >> [4]. >> (I've recently re-done the benchmarks and conclusions are essentially the >> same.) >> >> The upshot is that implementations that use Arrays.copyOf() are the >> fastest, >> probably because it's an intrinsic. It can avoid zero-filling the freshly >> allocated array because it knows the entire array contents will be >> overwritten. >> This is true regardless of what the public API looks like. The default >> implementation calls generator.apply(0) to get a zero-length array and >> then >> simply calls toArray(T[]). This goes through the Arrays.copyOf() fast >> path, so >> it's essentially the same speed as toArray(new T[0]). >> >> The overrides I had previously provided in specific implementation classes >> like >> ArrayList actually are slower, because the allocation of the array is done >> separately from filling it. This necessitates the extra zero-filling step. >> Thus, >> I've removed the overrides. >> >> # API Shape >> >> There was some discussion about whether it would be preferable to have a >> Class >> parameter as a type token for the array component type or the array type >> itself, >> instead of using an IntFunction generator. Most of this boils down to what >> people are comfortable with. However, there are a few points that make the >> generator function approach preferable. >> >> One point in favor of using a generator is that Stream already has a >> similar >> toArray(generator) method. >> >> Comparing this to the type token alternatives, for simple tasks like >> converting >> Collection to String[], things are about equal: >> >> toArray(String[]::new) >> toArray(String.class) >> toArray(String[].class) >> >> However, things are hairier if the element type of the collection is >> generic, >> such as Collection>. The result type is a generic array >> List[]. Using class literals or array constructor references will >> necessitate using an unchecked cast, because none of these can be used to >> represent a generic type. >> >> However, it's possible to use a method reference to provide a properly >> generic >> IntFunction. This would enable the toArray(generator) method to be used >> without >> any unchecked casts. (The method it refers to might need have an unchecked >> cast >> within it, though.) Such a method could also be reused for the >> corresponding >> Stream.toArray(generator) method. >> >> For these reasons I'd like to proceed with adding toArray(generator) API. >> >> Thanks, >> >> s'marks >> >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8060192 >> >> [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/ >> >> [3] >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html >> >> [4] >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html >>
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Hi Stuart, On 06/15/18 03:02, Stuart Marks wrote: Comparing this to the type token alternatives, for simple tasks like converting Collection to String[], things are about equal: toArray(String[]::new) toArray(String.class) toArray(String[].class) However, things are hairier if the element type of the collection is generic, such as Collection>. The result type is a generic array List[]. Using class literals or array constructor references will necessitate using an unchecked cast, because none of these can be used to represent a generic type. However, it's possible to use a method reference to provide a properly generic IntFunction. This would enable the toArray(generator) method to be used without any unchecked casts. (The method it refers to might need have an unchecked cast within it, though.) Such a method could also be reused for the corresponding Stream.toArray(generator) method. For these reasons I'd like to proceed with adding toArray(generator) API. It's a little strange that the generator function is used to construct an empty array (at least in the default method, but overrides will likely do the same if they want to avoid pre-zeroing overhead) in order to just extract the array's type from it. Someone could reasonably expect the provided function to be called with the exact length of needed array. The Collection.toArray(T[]) at least gives user an option to actually fully use the provided array, if user provides the correct length. The argument about using (and re-using) a method so that a method reference can be passed to the method without any unchecked casts is equally true for any of the 3 alternatives shown - the method itself might need unchecked casts, but its usage not: static List[] array(int len) static Class> elementType() static Class[]> arrayType() But I can see that you want to align the API with Stream.toArray, while still providing the optimal implementation. It's just that the API doesn't fit the usecase. The function approach makes more sense in the Stream API where it is explicitly explained that it may be used to construct arrays needed to hold intermediary results of partitioned parallel execution too, but in Collection API it is usually the case to just provide a copy of the underlying data structure in the most optimal way (without pre-zeroing overhead) and for that purpose, 2nd and 3rd alternatives are a better fit. Suppose Stream took a different approach and used the 2nd or 3rd approach (which is universal). Would then Collection API also get the same method? Regards, Peter
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Hello! If you are going to create long-living array which is likely to be empty, it's good to deduplicate them to spare the heap space. This can be easily done via existing toArray method like collection.toArray(MyClass.EMPTY_ARRAY) assuming we have an empty array constant. We use this pattern very often in IntelliJ IDEA source code (e. g. think of method which returns an array of Java member annotations; often there are none). New generator methods is much less convenient for this: toArray(s -> s == 0?MyClass.EMPTY_ARRAY:new MyClass[s]). I'm actually missing a method accepting an empty array instead of generator in the Stream API. And I don't think that new collection method is useful. The existing one is perfect. With best regards, Tagir Valeev пт, 15 июня 2018 г., 8:03 Stuart Marks : > Hi all, > > I wanted to restart review of the changeset for bug JDK-8060192 [1]. The > latest > webrev is here: [2]. > > The previous review was from December 2017: [3]. The only difference > between the > current changeset and the previous one is the removal of the overriding > implementations (see explanation below). Much of the discussion in the > previous > review thread was around performance and the shape of the API. > > # Performance > > I previously had done some benchmarking on this and reported the results: > [4]. > (I've recently re-done the benchmarks and conclusions are essentially the > same.) > > The upshot is that implementations that use Arrays.copyOf() are the > fastest, > probably because it's an intrinsic. It can avoid zero-filling the freshly > allocated array because it knows the entire array contents will be > overwritten. > This is true regardless of what the public API looks like. The default > implementation calls generator.apply(0) to get a zero-length array and > then > simply calls toArray(T[]). This goes through the Arrays.copyOf() fast > path, so > it's essentially the same speed as toArray(new T[0]). > > The overrides I had previously provided in specific implementation classes > like > ArrayList actually are slower, because the allocation of the array is done > separately from filling it. This necessitates the extra zero-filling step. > Thus, > I've removed the overrides. > > # API Shape > > There was some discussion about whether it would be preferable to have a > Class > parameter as a type token for the array component type or the array type > itself, > instead of using an IntFunction generator. Most of this boils down to what > people are comfortable with. However, there are a few points that make the > generator function approach preferable. > > One point in favor of using a generator is that Stream already has a > similar > toArray(generator) method. > > Comparing this to the type token alternatives, for simple tasks like > converting > Collection to String[], things are about equal: > > toArray(String[]::new) > toArray(String.class) > toArray(String[].class) > > However, things are hairier if the element type of the collection is > generic, > such as Collection>. The result type is a generic array > List[]. Using class literals or array constructor references will > necessitate using an unchecked cast, because none of these can be used to > represent a generic type. > > However, it's possible to use a method reference to provide a properly > generic > IntFunction. This would enable the toArray(generator) method to be used > without > any unchecked casts. (The method it refers to might need have an unchecked > cast > within it, though.) Such a method could also be reused for the > corresponding > Stream.toArray(generator) method. > > For these reasons I'd like to proceed with adding toArray(generator) API. > > Thanks, > > s'marks > > > [1] https://bugs.openjdk.java.net/browse/JDK-8060192 > > [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/ > > [3] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html > > [4] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html > >
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
- Mail original - > De: "Stuart Marks" > À: "core-libs-dev" > Envoyé: Vendredi 15 Juin 2018 03:02:04 > Objet: RFR(s): 8060192: Add default method Collection.toArray(generator) > Hi all, Hi Stuart, > > I wanted to restart review of the changeset for bug JDK-8060192 [1]. The > latest > webrev is here: [2]. > > The previous review was from December 2017: [3]. The only difference between > the > current changeset and the previous one is the removal of the overriding > implementations (see explanation below). Much of the discussion in the > previous > review thread was around performance and the shape of the API. > > # Performance > > I previously had done some benchmarking on this and reported the results: [4]. > (I've recently re-done the benchmarks and conclusions are essentially the > same.) > > The upshot is that implementations that use Arrays.copyOf() are the fastest, > probably because it's an intrinsic. It can avoid zero-filling the freshly > allocated array because it knows the entire array contents will be > overwritten. > This is true regardless of what the public API looks like. The default > implementation calls generator.apply(0) to get a zero-length array and then > simply calls toArray(T[]). This goes through the Arrays.copyOf() fast path, so > it's essentially the same speed as toArray(new T[0]). > > The overrides I had previously provided in specific implementation classes > like > ArrayList actually are slower, because the allocation of the array is done > separately from filling it. This necessitates the extra zero-filling step. > Thus, > I've removed the overrides. for ArrayList, you may use a code like this, T[] toArray(IntFunction generator) { return (T[]) Arrays.copyOf(elementData, size, generator.apply(0).getClass()); } so you win only one comparison (yeah !), which can be perfectly predicted, so you should not see any perf difference :) > > # API Shape > > There was some discussion about whether it would be preferable to have a Class > parameter as a type token for the array component type or the array type > itself, > instead of using an IntFunction generator. Most of this boils down to what > people are comfortable with. However, there are a few points that make the > generator function approach preferable. > > One point in favor of using a generator is that Stream already has a similar > toArray(generator) method. > > Comparing this to the type token alternatives, for simple tasks like > converting > Collection to String[], things are about equal: > > toArray(String[]::new) > toArray(String.class) > toArray(String[].class) > > However, things are hairier if the element type of the collection is generic, > such as Collection>. The result type is a generic array > List[]. Using class literals or array constructor references will > necessitate using an unchecked cast, because none of these can be used to > represent a generic type. > > However, it's possible to use a method reference to provide a properly generic > IntFunction. This would enable the toArray(generator) method to be used > without > any unchecked casts. (The method it refers to might need have an unchecked > cast > within it, though.) Such a method could also be reused for the corresponding > Stream.toArray(generator) method. List.class or List[].class do not work either. > > For these reasons I'd like to proceed with adding toArray(generator) API. > so thumb up for me ! > Thanks, > > s'marks Rémi > > > [1] https://bugs.openjdk.java.net/browse/JDK-8060192 > > [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/ > > [3] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html > > [4] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html
RFR(s): 8060192: Add default method Collection.toArray(generator)
Hi all, I wanted to restart review of the changeset for bug JDK-8060192 [1]. The latest webrev is here: [2]. The previous review was from December 2017: [3]. The only difference between the current changeset and the previous one is the removal of the overriding implementations (see explanation below). Much of the discussion in the previous review thread was around performance and the shape of the API. # Performance I previously had done some benchmarking on this and reported the results: [4]. (I've recently re-done the benchmarks and conclusions are essentially the same.) The upshot is that implementations that use Arrays.copyOf() are the fastest, probably because it's an intrinsic. It can avoid zero-filling the freshly allocated array because it knows the entire array contents will be overwritten. This is true regardless of what the public API looks like. The default implementation calls generator.apply(0) to get a zero-length array and then simply calls toArray(T[]). This goes through the Arrays.copyOf() fast path, so it's essentially the same speed as toArray(new T[0]). The overrides I had previously provided in specific implementation classes like ArrayList actually are slower, because the allocation of the array is done separately from filling it. This necessitates the extra zero-filling step. Thus, I've removed the overrides. # API Shape There was some discussion about whether it would be preferable to have a Class parameter as a type token for the array component type or the array type itself, instead of using an IntFunction generator. Most of this boils down to what people are comfortable with. However, there are a few points that make the generator function approach preferable. One point in favor of using a generator is that Stream already has a similar toArray(generator) method. Comparing this to the type token alternatives, for simple tasks like converting Collection to String[], things are about equal: toArray(String[]::new) toArray(String.class) toArray(String[].class) However, things are hairier if the element type of the collection is generic, such as Collection>. The result type is a generic array List[]. Using class literals or array constructor references will necessitate using an unchecked cast, because none of these can be used to represent a generic type. However, it's possible to use a method reference to provide a properly generic IntFunction. This would enable the toArray(generator) method to be used without any unchecked casts. (The method it refers to might need have an unchecked cast within it, though.) Such a method could also be reused for the corresponding Stream.toArray(generator) method. For these reasons I'd like to proceed with adding toArray(generator) API. Thanks, s'marks [1] https://bugs.openjdk.java.net/browse/JDK-8060192 [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/ [3] http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html [4] http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 12/7/17 5:03 PM, Martin Buchholz wrote: (I'm still trying to love this new API) The changes to the jsr166 tck are fine. I'm not convinced that the new implementation for ArrayList is progress. The current implementation of toArray(T[]) does // Make a new array of a's runtime type, but my contents: return (T[]) Arrays.copyOf(elementData, size, a.getClass()); which does not have to pay the cost of zeroing the array, so is potentially faster. (depends on whether the VM provides cheap pre-zeroed memory?! what does jmh say?) If we're not getting type safety and not getting significantly better performance, all we have is a more natural API. But toArray(IntFunction) also seems not very natural to me, and we'll have to live with the toArray(new String[0]) wart forever anyways. (But is it really so bad?) (Maybe toArray(Class) is actually more natural?) A lot of ideas flying around here. (Thanks John! :-)) But for this message let me focus on performance. I think implicitly there's some concern about the new API and whether it might suffer inherent performance disadvantages compared to the existing APIs. It certainly doesn't seem to provide any inherent advantages. I spent some time doing benchmarking more rigorously, and I was able to get similar results to those from Alexey Shipilëv's article: https://shipilev.net/blog/2016/arrays-wisdom-ancients/ That is, I was able to discern the difference between the copy-to-Object[] case, the copy to presized array case, and copy to zero-sized array case. Also added a potential new API toArray(Class) for comparison. Here are the results. For brevity, I ran only the tests with an ArrayList of size 1000: Benchmark (size) (type) Mode Cnt ScoreError Units ToArrayBench.clazz 1000 arraylist avgt 30 1423.924 ± 11.437 ns/op ToArrayBench.clazz0 1000 arraylist avgt 30 1173.442 ± 11.614 ns/op ToArrayBench.clazzDf 1000 arraylist avgt 30 1179.648 ± 17.662 ns/op ToArrayBench.lambda 1000 arraylist avgt 30 1421.910 ± 21.320 ns/op ToArrayBench.lambda0 1000 arraylist avgt 30 1168.863 ± 11.109 ns/op ToArrayBench.lambdaDf1000 arraylist avgt 30 1168.372 ± 9.512 ns/op ToArrayBench.simple 1000 arraylist avgt 30 462.371 ± 8.693 ns/op ToArrayBench.sized 1000 arraylist avgt 30 1417.483 ± 11.451 ns/op ToArrayBench.zero1000 arraylist avgt 30 1182.932 ± 27.773 ns/op Legend: clazz - ArrayList override of toArray(Class) T[] a = (T[])java.lang.reflect.Array.newInstance(Foo.class, size); System.arraycopy(elementData, 0, a, 0, size); clazz0 - ArrayList override of toArray(Class) T[] a = (T[])java.lang.reflect.Array.newInstance(clazz, 0); return (T[]) Arrays.copyOf(elementData, size, a.getClass()); classDf - Collection default method toArray(Class) toArray((T[])java.lang.reflect.Array.newInstance(clazz, 0)) lambda - ArrayList override of toArray(IntFunc) T[] a = generator.apply(size); System.arraycopy(elementData, 0, a, 0, size); lambda0 - ArrayList override of toArray(IntFunc) (T[]) Arrays.copyOf(elementData, size, generator.apply(0).getClass()) lambdaDf - Collection default method toArray(IntFunc) toArray(generator.apply(0)) simple - existing no-arg toArray() method sized - existing toArray(T[]) called with presized array coll.toArray(new Foo[coll.size()]) zero - existing toArray(T[]) called with zero-sized array coll.toArray(new Foo[0]) -- A couple observations from this. First, "lambda", the ArrayList override of toArray(IntFunc), is slower than "lambda0" or "lambdaDf", both of which create zero-length arrays and pass them elsewhere. So Martin, you're right, the override I put into ArrayList isn't buying anything. I'll take it out. Oddly enough, just inheriting the default might be sufficient. (And the same probably applies to Arrays$ArrayList as well.) Second, setting aside "simple" case (which is fast because doesn't have to do any array store checking) we see performance falling into two buckets: one that takes around 1400ns, and the other that takes around 1170ns. It turns out that the faster cases all end up calling the Arrays.copyOf() method. This is an intrinsic that can avoid the work of zeroing the array, because it allocates the array immediately before filling it with data from the source. I would have thought that allocating the array immediately prior to filling it with System.arraycopy would have done the same, but apparently not. We can see the effect of intrinsics by disabling the Arrays.copyOf intrinsic by applying the -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_copyOf options: Benchmark (size) (type) Mode Cnt ScoreError Units ToArrayBench.clazz 1000 arraylist avgt 30 1413.922 ± 9.570 ns/op ToArrayBench.clazz0 1000 arraylist avgt 30
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On Mon, Dec 11, 2017 at 8:00 PM, John Rosewrote: > I submit to you that such a factory is *not* an IntFunction, because > that only creates half of an array (or 0.01% of one), the empty > version that needs to be populated. A natural array factory API > [...] > The interface would look something like one of these: > > interface ArrayContent { int size(); T get(int i); } > interface ArrayContent { int size(); Iterator iterator(); } > interface ArrayContent { int count(); Stream stream(); } > interface ArrayContent { Spliterator elements(); } //must be SIZED Wasn't it suggested somewhat recently that arrays themselves might be retrofitted with this kind of interface? It seems like it would make sense for arrays to be able act as their own factories, bringing the idea full circle. -- - DML
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On Dec 11, 2017, at 6:00 PM, John Rosewrote: > > class Arrays { > // people who want x.toArray can also use x.copy(Arrays::make): > static T[] make(SequenceContent content) { … } Correction; that one needs an array type to know what to make: static T[] make(SequenceContent content, Class arrayClass) { … } > // ArrayList.copy uses this one internally as "myContent": > static SequenceContent slice(T[] a, int start, int end) { … } > } > class List { > // people who want x.asUnmodifiableList can also use x.copy(List::make): > default SequenceCopier make(SequenceContent content) { … } > > // external access to a sized snapshot of a list: > default SequenceContent sizedContent() { … } > } Also, sizedContent is there for completeness but doesn't carry as much weight as the other three. It might be derived from Collection::copy applied to the identity function, unless there are scoping rules on the operand to Collection::copy.
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Frankly, I think the JDK is just sloppy here. Most of the existing implementations didn't use @Override -- possibly because they were introduced before annotations existed -- and later on, whoever implemented the overrides of the Java 8 default methods decided to add @Override. Part of it is that Doug and I are not huge fans of @Override. Is the gain in safety really worth the added verbosity? If we do decide to mandate the use of @Override in all of openjdk, then the errorprone project has a bug pattern check for you, which will provide more safety yet. As I said, I don't think there's a rule that requires @Override. Thanks for verifying this. :-) I'm not proposing adding such a rule either. I agree, it adds little benefit, at least for the JDK, and it adds clutter. A quick grep reveals over 33,000 occurrences of @Override in the JDK sources. That seems like a lot, but if we were to add it to all locations where it could be applied, this might add hundreds if not thousands more occurrences. I'm not sure it's worth the effort to do either the additions or the removals that'd be necessary to make things consistent. Oh well. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On Dec 7, 2017, at 5:03 PM, Martin Buchholzwrote: > > (I'm still trying to love this new API) > > The changes to the jsr166 tck are fine. > > I'm not convinced that the new implementation for ArrayList is progress. > The current implementation of toArray(T[]) does > >// Make a new array of a's runtime type, but my contents: >return (T[]) Arrays.copyOf(elementData, size, a.getClass()); > > which does not have to pay the cost of zeroing the array, so is potentially > faster. (depends on whether the VM provides cheap pre-zeroed memory?! what > does jmh say?) The VM does not do pre-zeroing. We've done experiments over the years with that and it has never beaten pre-garbaged memory, with zero filling at creation time. The reason is that we can vectorize the zero filling at any time *and* omit zeroing at all in some important cases like copyOf. OTOH, pre-zeroing assumes there is background bandwidth available for pre-processing cache lines that are not yet in use, which isn't always true; sometimes the mutator needs all the cache bandwidth. All of which is to say that the VM is likely *never* to do pre-zeroing, unless there is some special hardware feature that does so, in bulk, in main memory. Which AFAIK doesn't exist today. This particular set of performance problems is (IMO) sensitive to the number of passes over the input and output arrays, as well as over any intermediate arrays. Zeroing isn't a separate pass in the case of copyOf, but in any two phase array creation protocol, zeroing *is* likely to require a separate pass. What do I mean by "two phase"? I mean a scheme where in phase one an empty array is created, and in phase two the array is filled with content. If the two phases are not reliably juxtaposed (as they are in copyOf) the VM must conservatively fill the empty array with zeros. Different code shapes have (inherently) different inlining characteristics. In the case of toArray(IntFunction), three things must happen to drop the zeroing pass (as in Arrays.copyOf): First, the particular IntFunction must fully inline (that's phase one), and second, the (phase two) loop that populates the new array must be juxtaposed with the inlined code of the IntFunction call, without intermediate logic that might discourage the optimizer. Third, both code shapes must use intrinsics that are recognized by the optimizer. These three conditions are carefully engineered in the body of Arrays.copyOf, but appear "emergently" ("accidentally") via the toArray(IF) API. So the optimization isn't impossible, but it is subject to more risk, and/or will require more optimizer work to support that code shape. Bottom line: Arrays.copyOf is not as arbitrary as one may think; it is designed that way because it is easy to optimize. And it covers two interesting use cases in the present discussion, which are (a) copy my internal array wholesale, and (b) make me a fresh array from a zero-length exemplar. > If we're not getting type safety and not getting significantly better > performance, all we have is a more natural API. But toArray(IntFunction) > also seems not very natural to me, Whether or not it bears on the present API discussion, I think it would be fruitful to step back and ask ourselves the big picture question here, "What is the natural shape of an array factory?" I submit to you that such a factory is *not* an IntFunction, because that only creates half of an array (or 0.01% of one), the empty version that needs to be populated. A natural array factory API will provide two pieces of data: First a size (an accurate, not a provisional one like List.size), and second a series of values to put in the array. *That* structure is the natural, might I say categorical, argument to an array factory. Also a List factory. As you can probably tell, I'm taking the value-based view of things here. (Proof by generalization: If/when we eventually create frozen arrays, and flattened arrays, and volatile arrays, and hybrid instance-with-sized-tail arrays, it will not always be an option to create the empty array in a first phase, and then fill it from the outside. Frozen arrays are the obvious counterexample, but any immutable data structure will do, as will a private object tail, even if mutable.) The interface would look something like one of these: interface ArrayContent { int size(); T get(int i); } interface ArrayContent { int size(); Iterator iterator(); } interface ArrayContent { int count(); Stream stream(); } interface ArrayContent { Spliterator elements(); } //must be SIZED That last suggests that we don't need a new API at all, that perhaps the natural input to an array factory is just a Spliterator with the SIZED characteristic, and a range check on the estimateSize() result. Or maybe a Stream. There are obvious bootstrapping issues here, but I don't think they are complex, because the source object can create a tiny Stream or Spliterator
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 12/08/2017 04:09 AM, David Lloyd wrote: Yes! It would be even better for the "toArray(Class)" case if Class had a "getArrayClass()" method which returned the Class, which would allow: return (T[]) Arrays.copyOf(elementData, size, componentType.getArrayClass()); and similar for other array-backed collections. I never understood why that method doesn't exist; in fact, am I wrong in understanding that "Array.newInstance(clazz, 0).getClass()" is effectively the only way to do this today? I think there is a reason for Arrays.copyOf() to take an array type, not the component type. If it was taking component type, you could pass in primitive types too with no compile-time type checking, since int.class is of Class static type. So why not the following: public interface Collection extends Iterable { @SuppressWarnings("unchecked") default T[] toArray(Class arrayType) { return toArray((T[]) Array.newInstance(arrayType.getComponentType(), size())); } ArrayList override could then simply be this: @Override public T[] toArray(Class arrayType) { return Arrays.copyOf(elementData, size, arrayType); } Regards, Peter On Thu, Dec 7, 2017 at 7:03 PM, Martin Buchholz wrote: (I'm still trying to love this new API) The changes to the jsr166 tck are fine. I'm not convinced that the new implementation for ArrayList is progress. The current implementation of toArray(T[]) does // Make a new array of a's runtime type, but my contents: return (T[]) Arrays.copyOf(elementData, size, a.getClass()); which does not have to pay the cost of zeroing the array, so is potentially faster. (depends on whether the VM provides cheap pre-zeroed memory?! what does jmh say?) If we're not getting type safety and not getting significantly better performance, all we have is a more natural API. But toArray(IntFunction) also seems not very natural to me, and we'll have to live with the toArray(new String[0]) wart forever anyways. (But is it really so bad?) (Maybe toArray(Class) is actually more natural?) On Thu, Dec 7, 2017 at 2:58 PM, Stuart Marks wrote: [Martin: please review changes to the JSR 166 TCK.] Another updated webrev for this changeset: http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/ This includes an override of toArray(IntFunction) in the implementation of Arrays.asList(), as requested by Tagir Valeev. Overrides are also added for various wrapper classes in j.u.Collections. Turns out there's a test test/jdk/java/util/Collections/Wrappers.java that checks to ensure that the wrappers don't inherit any default methods. (This has been a source of bugs in the past.) Significantly, this webrev also includes changes to several tests in the JSR 166 TCK. The issue is that these tests have cases for null handling, where they call coll.toArray(null) to ensure that NPE is thrown. Given that this method is now overloaded: toArray(T[]) toArray(IntFunction) passing null is now ambiguous and thus generates a compiler error. I've modified the tests to call toArray((Object[])null) which is a bit of a stopgap. I can't think of anything obviously better to do, though. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On Thu, Dec 7, 2017 at 4:47 PM, Stuart Markswrote: > > > On 12/7/17 3:50 PM, Jonathan Bluett-Duncan wrote: > > Looking over http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/ >> src/java.base/share/classes/java/util/ArrayList.java.cdiff.html, I'm >> wondering if the method `ArrayList.toArray(IntFunction)` should have an >> `@Override` to make it clear that it's overriding the default method in >> Collection. What do you think? >> > > I guess it wouldn't hurt. I had thought about adding it, but most of the > methods in ArrayList that override implementations from supertypes *don't* > have @Override. Looking more closely, it appears that the ones that > override interface default methods *do* have @Override. > > I don't think there's a rule that says that @Override should be used when > overriding interface default methods but not when overriding > implementations from a superclass or when implementing an abstract > interface method. Frankly, I think the JDK is just sloppy here. Most of the > existing implementations didn't use @Override -- possibly because they were > introduced before annotations existed -- and later on, whoever implemented > the overrides of the Java 8 default methods decided to add @Override. > Part of it is that Doug and I are not huge fans of @Override. Is the gain in safety really worth the added verbosity? If we do decide to mandate the use of @Override in all of openjdk, then the errorprone project has a bug pattern check for you, which will provide more safety yet.
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Yes! It would be even better for the "toArray(Class)" case if Class had a "getArrayClass()" method which returned the Class, which would allow: return (T[]) Arrays.copyOf(elementData, size, componentType.getArrayClass()); and similar for other array-backed collections. I never understood why that method doesn't exist; in fact, am I wrong in understanding that "Array.newInstance(clazz, 0).getClass()" is effectively the only way to do this today? On Thu, Dec 7, 2017 at 7:03 PM, Martin Buchholz wrote: > (I'm still trying to love this new API) > > The changes to the jsr166 tck are fine. > > I'm not convinced that the new implementation for ArrayList is progress. > The current implementation of toArray(T[]) does > > // Make a new array of a's runtime type, but my contents: > return (T[]) Arrays.copyOf(elementData, size, a.getClass()); > > which does not have to pay the cost of zeroing the array, so is potentially > faster. (depends on whether the VM provides cheap pre-zeroed memory?! what > does jmh say?) > > If we're not getting type safety and not getting significantly better > performance, all we have is a more natural API. But toArray(IntFunction) > also seems not very natural to me, and we'll have to live with the > toArray(new String[0]) wart forever anyways. (But is it really so bad?) > (Maybe toArray(Class) is actually more natural?) > > On Thu, Dec 7, 2017 at 2:58 PM, Stuart Marks > wrote: > >> [Martin: please review changes to the JSR 166 TCK.] >> >> Another updated webrev for this changeset: >> >> http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/ >> >> This includes an override of toArray(IntFunction) in the implementation of >> Arrays.asList(), as requested by Tagir Valeev. >> >> Overrides are also added for various wrapper classes in j.u.Collections. >> Turns out there's a test test/jdk/java/util/Collections/Wrappers.java >> that checks to ensure that the wrappers don't inherit any default methods. >> (This has been a source of bugs in the past.) >> >> Significantly, this webrev also includes changes to several tests in the >> JSR 166 TCK. The issue is that these tests have cases for null handling, >> where they call >> >> coll.toArray(null) >> >> to ensure that NPE is thrown. Given that this method is now overloaded: >> >> toArray(T[]) >> toArray(IntFunction) >> >> passing null is now ambiguous and thus generates a compiler error. I've >> modified the tests to call toArray((Object[])null) which is a bit of a >> stopgap. I can't think of anything obviously better to do, though. >> >> s'marks >> -- - DML
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
(I'm still trying to love this new API) The changes to the jsr166 tck are fine. I'm not convinced that the new implementation for ArrayList is progress. The current implementation of toArray(T[]) does // Make a new array of a's runtime type, but my contents: return (T[]) Arrays.copyOf(elementData, size, a.getClass()); which does not have to pay the cost of zeroing the array, so is potentially faster. (depends on whether the VM provides cheap pre-zeroed memory?! what does jmh say?) If we're not getting type safety and not getting significantly better performance, all we have is a more natural API. But toArray(IntFunction) also seems not very natural to me, and we'll have to live with the toArray(new String[0]) wart forever anyways. (But is it really so bad?) (Maybe toArray(Class) is actually more natural?) On Thu, Dec 7, 2017 at 2:58 PM, Stuart Markswrote: > [Martin: please review changes to the JSR 166 TCK.] > > Another updated webrev for this changeset: > > http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/ > > This includes an override of toArray(IntFunction) in the implementation of > Arrays.asList(), as requested by Tagir Valeev. > > Overrides are also added for various wrapper classes in j.u.Collections. > Turns out there's a test test/jdk/java/util/Collections/Wrappers.java > that checks to ensure that the wrappers don't inherit any default methods. > (This has been a source of bugs in the past.) > > Significantly, this webrev also includes changes to several tests in the > JSR 166 TCK. The issue is that these tests have cases for null handling, > where they call > > coll.toArray(null) > > to ensure that NPE is thrown. Given that this method is now overloaded: > > toArray(T[]) > toArray(IntFunction) > > passing null is now ambiguous and thus generates a compiler error. I've > modified the tests to call toArray((Object[])null) which is a bit of a > stopgap. I can't think of anything obviously better to do, though. > > s'marks >
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 12/7/17 3:50 PM, Jonathan Bluett-Duncan wrote: Looking over http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/src/java.base/share/classes/java/util/ArrayList.java.cdiff.html, I'm wondering if the method `ArrayList.toArray(IntFunction)` should have an `@Override` to make it clear that it's overriding the default method in Collection. What do you think? I guess it wouldn't hurt. I had thought about adding it, but most of the methods in ArrayList that override implementations from supertypes *don't* have @Override. Looking more closely, it appears that the ones that override interface default methods *do* have @Override. I don't think there's a rule that says that @Override should be used when overriding interface default methods but not when overriding implementations from a superclass or when implementing an abstract interface method. Frankly, I think the JDK is just sloppy here. Most of the existing implementations didn't use @Override -- possibly because they were introduced before annotations existed -- and later on, whoever implemented the overrides of the Java 8 default methods decided to add @Override. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Hi Stuart, Looking over http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/src/java.base/share/classes/java/util/ArrayList.java.cdiff.html, I'm wondering if the method `ArrayList.toArray(IntFunction)` should have an `@Override` to make it clear that it's overriding the default method in Collection. What do you think? Cheers, Jonathan On 7 December 2017 at 22:58, Stuart Markswrote: > [Martin: please review changes to the JSR 166 TCK.] > > Another updated webrev for this changeset: > > http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/ > > This includes an override of toArray(IntFunction) in the implementation of > Arrays.asList(), as requested by Tagir Valeev. > > Overrides are also added for various wrapper classes in j.u.Collections. > Turns out there's a test test/jdk/java/util/Collections/Wrappers.java > that checks to ensure that the wrappers don't inherit any default methods. > (This has been a source of bugs in the past.) > > Significantly, this webrev also includes changes to several tests in the > JSR 166 TCK. The issue is that these tests have cases for null handling, > where they call > > coll.toArray(null) > > to ensure that NPE is thrown. Given that this method is now overloaded: > > toArray(T[]) > toArray(IntFunction) > > passing null is now ambiguous and thus generates a compiler error. I've > modified the tests to call toArray((Object[])null) which is a bit of a > stopgap. I can't think of anything obviously better to do, though. > > s'marks >
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
[Martin: please review changes to the JSR 166 TCK.] Another updated webrev for this changeset: http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/ This includes an override of toArray(IntFunction) in the implementation of Arrays.asList(), as requested by Tagir Valeev. Overrides are also added for various wrapper classes in j.u.Collections. Turns out there's a test test/jdk/java/util/Collections/Wrappers.java that checks to ensure that the wrappers don't inherit any default methods. (This has been a source of bugs in the past.) Significantly, this webrev also includes changes to several tests in the JSR 166 TCK. The issue is that these tests have cases for null handling, where they call coll.toArray(null) to ensure that NPE is thrown. Given that this method is now overloaded: toArray(T[]) toArray(IntFunction) passing null is now ambiguous and thus generates a compiler error. I've modified the tests to call toArray((Object[])null) which is a bit of a stopgap. I can't think of anything obviously better to do, though. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 12/5/17 8:41 PM, Bernd Eckenfels wrote: Should the test also check for the case the generator returns under- and oversized arrays? Did you mean that the default implementation (and various overriding implementations) of toArray(generator) should do checks on the array that's returned from the generator? If so, I'm skeptical of the utility of adding such checks. If the array is too short, ArrayIndexOutOfBoundsException will occur, so no check is necessary. If the array is too long, it will work, but the application might have trouble ascertaining the number of elements copied. Presumably it wouldn't create a too-long array unless it can deal with this. In any case, it doesn't necessarily seem like this should be an error. The default impl looks less efficient than (new T[0]), should it really be removed as a major Javadoc example? (Or maybe add more specific implementations besides ArrayList?) Although my benchmarks were inconclusive, it doesn't seem like the extra call to the generator with zero should be a performance problem. Even if this is a problem, it will disappear as more overrides are added to the system. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 12/5/17 6:54 PM, Tagir Valeev wrote: I think, CopyOnWriteArrayList and a List returned by Arrays.asList deserve specialized implementation as well. Sure, I can add one to Array.asList right away (as part of this changeset). It's even covered by tests already. I'll work with Martin to update COWAL. Depending on how he wants to handle it, this might be done separately. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Should the test also check for the case the generator returns under- and oversized arrays? The default impl looks less efficient than (new T[0]), should it really be removed as a major Javadoc example? (Or maybe add more specific implementations besides ArrayList?) Gruss Bernd -- http://bernd.eckenfels.net From: core-libs-dev <core-libs-dev-boun...@openjdk.java.net> on behalf of Stuart Marks <stuart.ma...@oracle.com> Sent: Wednesday, December 6, 2017 2:27:44 AM To: core-libs-dev Subject: Re: RFR(s): 8060192: Add default method Collection.toArray(generator) Revised webrev based on comments from Martin Buchholz and Paul Sandoz: http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/ I've done a bit of rewriting of the @apiNote sections to clarify the intended use of the various toArray() methods. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Hello! I think, CopyOnWriteArrayList and a List returned by Arrays.asList deserve specialized implementation as well. With best regards, Tagir Valeev. On Wed, Dec 6, 2017 at 8:27 AM, Stuart Markswrote: > Revised webrev based on comments from Martin Buchholz and Paul Sandoz: > > http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/ > > I've done a bit of rewriting of the @apiNote sections to clarify the > intended use of the various toArray() methods. > > s'marks >
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
The signature of the proposed generator is public T[] toArray(IntFunctiongenerator) So I don't think that anything has really changed in this regard; T can still be unrelated to E. Right, the new method doesn't offer any safety compared to toArray(T[]). You can still get ArrayStoreException if T is incompatible with E. That's in part what led me to update the @throws ArrayStoreException text, since the previous text was basically wrong. But fundamentally the problem is unchanged. There's a related enhancement request https://bugs.openjdk.java.net/browse/JDK-7023484 add typesafe static Collections.toArray(Collection, IntFunction ) that would provide static type safety for this operation. Unclear whether it's worthwhile to add something like this, though. That said, sending in a constant String[0] is probably just as good as a generator, or better (in my naive estimation the tradeoff is a >0 comparison versus an interface method dispatch), than an i -> new String[i] unless some kind of benchmark would appear which disproves that hypothesis. I did some quick benchmarking, and it looks like String[]::new is a shade faster (heh) than new String[size], but not quite as fast as new String[0]. But take these results with a grain of salt. I was doing the benchmarking on my laptop with my entire environment running, and I was also measuring my development fastdebug build. I wasn't able to see the same difference between new String[size] and new String[0] that Shipilëv observed.[1] It would be interesting to get some rigorous benchmarks to compare these, but I haven't yet taken the time to do this. I don't think the main point of this is performance, though. I think the new preferred way to create an array of a particular type should be toArray(Foo[]::new), which is preferable to toArray(new Foo[0]) or toArray(EMPTY_FOO_ARRAY). I suppose those of us "in the know" would recognize toArray(new Foo[0]) to be idiomatic. But to understand it, you have to fully understand the semantics of toArray(T[]), and you have to have read Shipilëv's article to understand why creating an apparently wasted zero-sized array is (usually) the right thing. That seems pretty obscure. s'marks [1] https://shipilev.net/blog/2016/arrays-wisdom-ancients/
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Revised webrev based on comments from Martin Buchholz and Paul Sandoz: http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/ I've done a bit of rewriting of the @apiNote sections to clarify the intended use of the various toArray() methods. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
> On 5 Dec 2017, at 11:19, Stuart Markswrote: > > > > On 12/5/17 10:47 AM, Paul Sandoz wrote: >> 345 * Suppose {@code x} is a collection known to contain only >> strings. >> 346 * The following code can be used to dump the collection into a >> newly >> 347 * allocated array of {@code String}: >> Make it an API note? (same for toArray(T[]) > > Will do. > >> 352 * @implSpec >> 353 * The default implementation calls the generator function with zero >> 354 * and then passes the resulting array to {@link #toArray(T[])}. >> That’s reasonable. I pondered about being vague and specifying a value in >> the range if [0, size()), to allow wiggle room for using 0 or size(). > > No, I really think it has to be exactly zero. If the default were to choose > some value K < size(), the collection size could change to be smaller than K > after allocation but before the call to toArray(T[]), resulting in an array > that is unexpectedly large. Ah yes, concurrent collections. Paul. > >> 360 * @throws ArrayStoreException if the runtime type of any element >> in this >> 361 * collection is not assignable to the {@linkplain >> Class#getComponentType >> 362 * runtime component type} of array returned by the >> generator function >> s/of array returned by the generator function/of the generated array/ ? > > Will do. > > s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On 12/5/17 10:47 AM, Paul Sandoz wrote: 345 * Suppose {@code x} is a collection known to contain only strings. 346 * The following code can be used to dump the collection into a newly 347 * allocated array of {@code String}: Make it an API note? (same for toArray(T[]) Will do. 352 * @implSpec 353 * The default implementation calls the generator function with zero 354 * and then passes the resulting array to {@link #toArray(T[])}. That’s reasonable. I pondered about being vague and specifying a value in the range if [0, size()), to allow wiggle room for using 0 or size(). No, I really think it has to be exactly zero. If the default were to choose some value K < size(), the collection size could change to be smaller than K after allocation but before the call to toArray(T[]), resulting in an array that is unexpectedly large. 360 * @throws ArrayStoreException if the runtime type of any element in this 361 * collection is not assignable to the {@linkplain Class#getComponentType 362 * runtime component type} of array returned by the generator function s/of array returned by the generator function/of the generated array/ ? Will do. s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
> On 4 Dec 2017, at 19:20, Stuart Markswrote: > > Hi all, > > Please review this small enhancement to add a default method > Collection.toArray(generator) that takes a function that creates the > destination array. This is analogous to Stream.toArray(generator). > > Bug: > >https://bugs.openjdk.java.net/browse/JDK-8060192 > 345 * Suppose {@code x} is a collection known to contain only strings. 346 * The following code can be used to dump the collection into a newly 347 * allocated array of {@code String}: Make it an API note? (same for toArray(T[]) 352 * @implSpec 353 * The default implementation calls the generator function with zero 354 * and then passes the resulting array to {@link #toArray(T[])}. That’s reasonable. I pondered about being vague and specifying a value in the range if [0, size()), to allow wiggle room for using 0 or size(). 360 * @throws ArrayStoreException if the runtime type of any element in this 361 * collection is not assignable to the {@linkplain Class#getComponentType 362 * runtime component type} of array returned by the generator function s/of array returned by the generator function/of the generated array/ ? Paul. > Webrev: > >http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/ > > Thanks, > > s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On Tue, Dec 5, 2017 at 1:03 AM, Remi Foraxwrote: > Dumping an ArrayList into an array of String is fairly frequent, i > think. > > The main issue with the current API, T[] toArray(T[] array), is that T > can be unrelated to E (the type of the element in the collection) so one can > write > ArrayList list = ... > String[] array = list.toArray(new String[0]); > it may even works if the list is empty. > > It's getting worst if E and T can be a primitive type (with valhalla), > because toArray(T[]) as to support all combinations. > > So in my opinion, introducing toArray(generator) is a step in the right > direction. The signature of the proposed generator is > public T[] toArray(IntFunction generator) So I don't think that anything has really changed in this regard; T can still be unrelated to E. That said, sending in a constant String[0] is probably just as good as a generator, or better (in my naive estimation the tradeoff is a >0 comparison versus an interface method dispatch), than an i -> new String[i] unless some kind of benchmark would appear which disproves that hypothesis. -- - DML
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
+// no spec changes relative to supertype +public T[] toArray(IntFunctiongenerator) { You probably at least need @inheritDoc for the unchecked exception throws (as I've forgotten many times) There's a new javadoc option "--override-methods summary" that was recently added that we're using now. If an overriding method has no doc comment, it's simply listed in the "Methods inherited from" section at the bottom of the summary table. (It's now called "Methods declared in".) See it in action here: http://download.java.net/java/jdk10/docs/api/java/util/Properties.html Compare it to the JDK 9 version of Properties. (This suggests a big cleanup in collections, and probably many other areas, where docs were copied into subclasses for no good reason, or because they wanted to disinherit stuff that properly belongs in @implSpec.) The needToWorkAround6260652 changes ought to be in a separate changeset. OK, I can pull these out. The biggest question is whether Collection.toArray(generator) pulls its weight, especially in view of https://shipilev.net/blog/2016/arrays-wisdom-ancients. I rarely want to dump elements into a typed array. Dumping into Object[] with toArray() is just fine for me (but I'm a biased core library developer). Yeah, most collections want to use Object[]. But when you want a typed array, this is really helpful. Oh wait - Aleksey actually links to this bug! Sounds like he would be the most qualified reviewer! :-) s'marks
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
Hi Martin, - Mail original - > De: "Martin Buchholz" <marti...@google.com> > À: "Stuart Marks" <stuart.ma...@oracle.com> > Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net> > Envoyé: Mardi 5 Décembre 2017 05:26:02 > Objet: Re: RFR(s): 8060192: Add default method Collection.toArray(generator) > The needToWorkAround6260652 changes ought to be in a separate changeset. > > The biggest question is whether Collection.toArray(generator) pulls its > weight, especially in view of > https://shipilev.net/blog/2016/arrays-wisdom-ancients. > > I rarely want to dump elements into a typed array. Dumping into Object[] > with toArray() is just fine for me (but I'm a biased core library > developer). Dumping an ArrayList into an array of String is fairly frequent, i think. The main issue with the current API, T[] toArray(T[] array), is that T can be unrelated to E (the type of the element in the collection) so one can write ArrayList list = ... String[] array = list.toArray(new String[0]); it may even works if the list is empty. It's getting worst if E and T can be a primitive type (with valhalla), because toArray(T[]) as to support all combinations. So in my opinion, introducing toArray(generator) is a step in the right direction. cheers, Rémi
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
On Mon, Dec 4, 2017 at 8:26 PM, Martin Buchholzwrote: > > The biggest question is whether Collection.toArray(generator) pulls its > weight, especially in view of https://shipilev.net/blog/ > 2016/arrays-wisdom-ancients. > Oh wait - Aleksey actually links to this bug! Sounds like he would be the most qualified reviewer!
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
The needToWorkAround6260652 changes ought to be in a separate changeset. The biggest question is whether Collection.toArray(generator) pulls its weight, especially in view of https://shipilev.net/blog/2016/arrays-wisdom-ancients. I rarely want to dump elements into a typed array. Dumping into Object[] with toArray() is just fine for me (but I'm a biased core library developer).
Re: RFR(s): 8060192: Add default method Collection.toArray(generator)
+// no spec changes relative to supertype +public T[] toArray(IntFunctiongenerator) { You probably at least need @inheritDoc for the unchecked exception throws (as I've forgotten many times) On Mon, Dec 4, 2017 at 7:20 PM, Stuart Marks wrote: > Hi all, > > Please review this small enhancement to add a default method > Collection.toArray(generator) that takes a function that creates the > destination array. This is analogous to Stream.toArray(generator). > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8060192 > > Webrev: > > http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/ > > Thanks, > > s'marks >
RFR(s): 8060192: Add default method Collection.toArray(generator)
Hi all, Please review this small enhancement to add a default method Collection.toArray(generator) that takes a function that creates the destination array. This is analogous to Stream.toArray(generator). Bug: https://bugs.openjdk.java.net/browse/JDK-8060192 Webrev: http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/ Thanks, s'marks