Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-18 Thread Stuart Marks




On 6/17/18 1:50 AM, Peter Levart wrote:

It's a little strange that the generator function is used to construct an
empty array (at least in the default method, but overrides will likely do the
same if they want to avoid pre-zeroing overhead) in order to just extract the
array's type from it. Someone could reasonably expect the provided function
to be called with the exact length of needed array. The
Collection.toArray(T[]) at least gives user an option to actually fully use
the provided array, if user provides the correct length.


This is actually what we're trying to avoid. The toArray(T[]) API has to deal
with the cases where the array isn't the right length, and it reallocs or
inserts an extra null in cases where it isn't. To avoid this, people do

coll.toArray(new MyClass[coll.size()])

which turns out to be an anti-pattern. We could try to teach people to write

coll.toArray(new MyClass[0])

instead, and this works, but it's quite non-obvious. ("Why do I need to create a
zero-length array first?") (#include Tagir's comment about caching zero-length
instances) Instead we want to direct people to write

coll.toArray(MyClass[]::new)

which creates an array of the right type and requested length.


The argument about using (and re-using) a method so that a method reference
can be passed to the method without any unchecked casts is equally true for
any of the 3 alternatives shown - the method itself might need unchecked
casts, but its usage not:

static List[] array(int len) static Class>
elementType() static Class[]> arrayType()


In principle all of these are possible. I don't see these as equal, though. It's 
quite common to have to create arrays of a generic type, either inline with 
unchecked casts, or possibly refactored into a method. I very rarely see Class 
objects of generic types.



But I can see that you want to align the API with Stream.toArray, while still
 providing the optimal implementation. It's just that the API doesn't fit the
 usecase. The function approach makes more sense in the Stream API where it
is explicitly explained that it may be used to construct arrays needed to
hold intermediary results of partitioned parallel execution too, but in
Collection API it is usually the case to just provide a copy of the
underlying data structure in the most optimal way (without pre-zeroing
overhead) and for that purpose, 2nd and 3rd alternatives are a better fit.


Sure, the Stream API has additional uses for holding intermediate results. That
doesn't imply that Collection.toArray(generator) doesn't meet its use case
(which I described above).

I also don't see how class type tokens are a better fit. A type token is
"natural" if you're thinking of implementing it in terms of Arrays.copyOf() --
which it is right now, but that's an implementation detail.


Suppose Stream took a different approach and used the 2nd or 3rd approach
(which is universal). Would then Collection API also get the same method?


I'm not sure where this is headed. I'm pretty sure we considered using type
tokens for Stream.toArray() and rejected them in favor of toArray(generator). If
there had been some reason to use a type token instead, maybe we would have used
them, in which case we'd consider modifying Collection.toArray() would take a 
type token as well. But I'm not aware of such a reason, so



It might have been the case in the past when Java generics were introduced,
that class literals like List.class  would just confuse users,
because most aspects of such type token would be erased and there were fears
that enabling them might limit options for reified generics in the future.
But long years have passed and java programmers have generally become
acquainted with Java generics and erasure to the point where it doesn't
confuse them any more, while reifying Java generics has been explored further
in Valhalla to the point where it has become obvious that erasure of
reference types is here to stay.

Java could enable use of class literals like List.class without fear
that such literals would make users write buggy code or that such uses would
limit options for Java in the future. Quite the contrary, they would enable
users to write type-safer code than what they can write today. In the light
of future Java  (valhalla specialized generics), where such class literals
make even more sense, enabling generic class literals could be viewed as a
stepping stone that has some purpose in the short term too.

Again, I'm not sure where this is headed.

I certainly would not propose changing the language to allow generic class 
literals in order to support the present use case (creating an array from a 
collection).


As a longer term proposal, generic class literals might or might not be 
worthwhile on their own merits. It's certainly an irritant not having them; I 
could imagine some uses for them. But there are probably some downsides (such as 
unsoundness) that will need to be thought through.


s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-18 Thread Brian Goetz
+1 from me.

> On Jun 14, 2018, at 9:02 PM, Stuart Marks  wrote:
> 
> Hi all,
> 
> I wanted to restart review of the changeset for bug JDK-8060192 [1]. The 
> latest webrev is here: [2].
> 
> The previous review was from December 2017: [3]. The only difference between 
> the current changeset and the previous one is the removal of the overriding 
> implementations (see explanation below). Much of the discussion in the 
> previous review thread was around performance and the shape of the API.
> 
> # Performance
> 
> I previously had done some benchmarking on this and reported the results: 
> [4]. (I've recently re-done the benchmarks and conclusions are essentially 
> the same.)
> 
> The upshot is that implementations that use Arrays.copyOf() are the fastest, 
> probably because it's an intrinsic. It can avoid zero-filling the freshly 
> allocated array because it knows the entire array contents will be 
> overwritten. This is true regardless of what the public API looks like. The 
> default implementation calls generator.apply(0) to get a zero-length array 
> and then simply calls toArray(T[]). This goes through the Arrays.copyOf() 
> fast path, so it's essentially the same speed as toArray(new T[0]).
> 
> The overrides I had previously provided in specific implementation classes 
> like ArrayList actually are slower, because the allocation of the array is 
> done separately from filling it. This necessitates the extra zero-filling 
> step. Thus, I've removed the overrides.
> 
> # API Shape
> 
> There was some discussion about whether it would be preferable to have a 
> Class parameter as a type token for the array component type or the array 
> type itself, instead of using an IntFunction generator. Most of this boils 
> down to what people are comfortable with. However, there are a few points 
> that make the generator function approach preferable.
> 
> One point in favor of using a generator is that Stream already has a similar 
> toArray(generator) method.
> 
> Comparing this to the type token alternatives, for simple tasks like 
> converting Collection to String[], things are about equal:
> 
>toArray(String[]::new)
>toArray(String.class)
>toArray(String[].class)
> 
> However, things are hairier if the element type of the collection is generic, 
> such as Collection>. The result type is a generic array 
> List[]. Using class literals or array constructor references will 
> necessitate using an unchecked cast, because none of these can be used to 
> represent a generic type.
> 
> However, it's possible to use a method reference to provide a properly 
> generic IntFunction. This would enable the toArray(generator) method to be 
> used without any unchecked casts. (The method it refers to might need have an 
> unchecked cast within it, though.) Such a method could also be reused for the 
> corresponding Stream.toArray(generator) method.
> 
> For these reasons I'd like to proceed with adding toArray(generator) API.
> 
> Thanks,
> 
> s'marks
> 
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8060192
> 
> [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/
> 
> [3] 
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html
> 
> [4] 
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html
> 



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-18 Thread Stuart Marks




Tagir Valeev wrote:


If you are going to create long-living array which is likely to be empty,
it's good to deduplicate them to spare the heap space. This can be easily
done via existing toArray method like
collection.toArray(MyClass.EMPTY_ARRAY) assuming we have an empty array
constant. We use this pattern very often in IntelliJ IDEA source code (e.
g. think of method which returns an array of Java member annotations; often
there are none). New generator methods is much less convenient for this:
toArray(s -> s == 0?MyClass.EMPTY_ARRAY:new MyClass[s]). I'm actually
missing a method accepting an empty array instead of generator in the
Stream API. And I don't think that new collection method is useful. The
existing one is perfect.


Rémi Forax wrote:


we can expect the VM to not allocate the array if once inlined the code is
   new MyClass[0].getClass()

if it's not the case, i think it's a better idea to tweak the VM rather than 
try to change an API based on a pattern that should not be necessary.


I think the two of you are talking about different things. Tagir is concerned 
about *long-lived* zero-length arrays, whereas Rémi is talking about the 
possibility of short-circuiting the allocation of a zero-length array if it's 
replaced by a nonzero-length array and thus has an extremely short life.


Tagir, if your use case is that you know you are creating lots of long-lived 
zero-length arrays, and you want to deduplicate them, then sure, using 
toArray(MyClass.EMPTY_ARRAY) is a fine thing to do. There are a bunch of 
assumptions here about the longevity and frequency of creation of such arrays. 
Having a shared empty array might indeed be the right thing for the IntelliJ 
IDEA code base, but that doesn't mean it's true in general. The 
toArray(generator) might be perfectly fine for many code bases even if it isn't 
suitable for the one you have in mind.


You also mention the lack of a  T[] Stream.toArray(T[]) method. This would 
seem to help the case of sharing a zero-length array, but 
Collection.toArray(T[]) also has odd semantics that we didn't want to replicate 
in streams. The point of that method is to *reuse* the argument array. The odd 
semantics are that, when the argument array is longer than necessary, null is 
added after the last written element. Instead of replicating the semantics of 
Collection.toArray(T[]) on Stream, we ended up with Stream.toArray(generator) 
instead.


Now maybe the Stream.toArray() overloads aren't sufficient for what you want to 
do, in which case you might want to propose something. But that sounds like a 
different discussion.


s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-18 Thread Stuart Marks

Hi Rémi,

On 6/15/18 12:26 AM, Remi Forax wrote:

The overrides I had previously provided in specific implementation classes like
ArrayList actually are slower, because the allocation of the array is done
separately from filling it. This necessitates the extra zero-filling step. Thus,
I've removed the overrides.


for ArrayList, you may use a code like this,
T[] toArray(IntFunction generator) {
 return (T[]) Arrays.copyOf(elementData, size, 
generator.apply(0).getClass());
   }
so you win only one comparison (yeah !), which can be perfectly predicted, so 
you should not see any perf difference :)


True, this will probably work better than the previous code (allocate correct 
size, fill with System.arraycopy) but it doesn't seem likely to be any faster 
than the default method.



List.class or List[].class do not work either.


I think they can be made to work (see other sub-thread with Peter Levart) but I 
don't see any advantages going in that direction.



For these reasons I'd like to proceed with adding toArray(generator) API.


so thumb up for me !


Great!

s'marks



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-17 Thread Peter Levart




On 06/17/18 10:50, Peter Levart wrote:
The argument about using (and re-using) a method so that its method 
reference can be passed to the toArray method without any unchecked 
casts is equally true for any of the 3 alternatives shown - the method 
itself might need unchecked casts, but its usage not:


static List[] array(int len)
static Class> elementType()
static Class[]> arrayType()


For that purpose, I would rather get rid of the need to create a method 
at all.


It might have been the case in the past when Java generics were 
introduced, that class literals like List.class  would just 
confuse users, because most aspects of such type token would be erased 
and there were fears that enabling them might limit options for reified 
generics in the future. But long years have passed and java programmers 
have generally become acquainted with Java generics and erasure to the 
point where it doesn't confuse them any more, while reifying Java 
generics has been explored further in Valhalla to the point where it has 
become obvious that erasure of reference types is here to stay.


Java could enable use of class literals like List.class without 
fear that such literals would make users write buggy code or that such 
uses would limit options for Java in the future. Quite the contrary, 
they would enable users to write type-safer code than what they can 
write today. In the light of future Java  (valhalla specialized 
generics), where such class literals make even more sense, enabling 
generic class literals could be viewed as a stepping stone that has some 
purpose in the short term too.


So what do you think?

Regards, Peter



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-17 Thread Remi Forax
Hi Tagir,

- Mail original -
> De: "Tagir Valeev" 
> À: "Stuart Marks" 
> Cc: "core-libs-dev" 
> Envoyé: Samedi 16 Juin 2018 06:01:35
> Objet: Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

> Hello!
> 
> If you are going to create long-living array which is likely to be empty,
> it's good to deduplicate them to spare the heap space. This can be easily
> done via existing toArray method like
> collection.toArray(MyClass.EMPTY_ARRAY) assuming we have an empty array
> constant. We use this pattern very often in IntelliJ IDEA source code (e.
> g. think of method which returns an array of Java member annotations; often
> there are none). New generator methods is much less convenient for this:
> toArray(s -> s == 0?MyClass.EMPTY_ARRAY:new MyClass[s]). I'm actually
> missing a method accepting an empty array instead of generator in the
> Stream API. And I don't think that new collection method is useful. The
> existing one is perfect.
> 
> With best regards,
> Tagir Valeev

we can expect the VM to not allocate the array if once inlined the code is
  new MyClass[0].getClass()

if it's not the case, i think it's a better idea to tweak the VM rather than 
try to change an API based on a pattern that should not be necessary.

Rémi

> 
> пт, 15 июня 2018 г., 8:03 Stuart Marks :
> 
>> Hi all,
>>
>> I wanted to restart review of the changeset for bug JDK-8060192 [1]. The
>> latest
>> webrev is here: [2].
>>
>> The previous review was from December 2017: [3]. The only difference
>> between the
>> current changeset and the previous one is the removal of the overriding
>> implementations (see explanation below). Much of the discussion in the
>> previous
>> review thread was around performance and the shape of the API.
>>
>> # Performance
>>
>> I previously had done some benchmarking on this and reported the results:
>> [4].
>> (I've recently re-done the benchmarks and conclusions are essentially the
>> same.)
>>
>> The upshot is that implementations that use Arrays.copyOf() are the
>> fastest,
>> probably because it's an intrinsic. It can avoid zero-filling the freshly
>> allocated array because it knows the entire array contents will be
>> overwritten.
>> This is true regardless of what the public API looks like. The default
>> implementation calls generator.apply(0) to get a zero-length array and
>> then
>> simply calls toArray(T[]). This goes through the Arrays.copyOf() fast
>> path, so
>> it's essentially the same speed as toArray(new T[0]).
>>
>> The overrides I had previously provided in specific implementation classes
>> like
>> ArrayList actually are slower, because the allocation of the array is done
>> separately from filling it. This necessitates the extra zero-filling step.
>> Thus,
>> I've removed the overrides.
>>
>> # API Shape
>>
>> There was some discussion about whether it would be preferable to have a
>> Class
>> parameter as a type token for the array component type or the array type
>> itself,
>> instead of using an IntFunction generator. Most of this boils down to what
>> people are comfortable with. However, there are a few points that make the
>> generator function approach preferable.
>>
>> One point in favor of using a generator is that Stream already has a
>> similar
>> toArray(generator) method.
>>
>> Comparing this to the type token alternatives, for simple tasks like
>> converting
>> Collection to String[], things are about equal:
>>
>>  toArray(String[]::new)
>>  toArray(String.class)
>>  toArray(String[].class)
>>
>> However, things are hairier if the element type of the collection is
>> generic,
>> such as Collection>. The result type is a generic array
>> List[]. Using class literals or array constructor references will
>> necessitate using an unchecked cast, because none of these can be used to
>> represent a generic type.
>>
>> However, it's possible to use a method reference to provide a properly
>> generic
>> IntFunction. This would enable the toArray(generator) method to be used
>> without
>> any unchecked casts. (The method it refers to might need have an unchecked
>> cast
>> within it, though.) Such a method could also be reused for the
>> corresponding
>> Stream.toArray(generator) method.
>>
>> For these reasons I'd like to proceed with adding toArray(generator) API.
>>
>> Thanks,
>>
>> s'marks
>>
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8060192
>>
>> [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/
>>
>> [3]
>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html
>>
>> [4]
>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html
>>


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-17 Thread Peter Levart

Hi Stuart,

On 06/15/18 03:02, Stuart Marks wrote:
Comparing this to the type token alternatives, for simple tasks like 
converting Collection to String[], things are about equal:


    toArray(String[]::new)
    toArray(String.class)
    toArray(String[].class)

However, things are hairier if the element type of the collection is 
generic, such as Collection>. The result type is a 
generic array List[]. Using class literals or array 
constructor references will necessitate using an unchecked cast, 
because none of these can be used to represent a generic type.


However, it's possible to use a method reference to provide a properly 
generic IntFunction. This would enable the toArray(generator) method 
to be used without any unchecked casts. (The method it refers to might 
need have an unchecked cast within it, though.) Such a method could 
also be reused for the corresponding Stream.toArray(generator) method.


For these reasons I'd like to proceed with adding toArray(generator) API. 


It's a little strange that the generator function is used to construct 
an empty array (at least in the default method, but overrides will 
likely do the same if they want to avoid pre-zeroing overhead) in order 
to just extract the array's type from it. Someone could reasonably 
expect the provided function to be called with the exact length of 
needed array. The Collection.toArray(T[]) at least gives user an option 
to actually fully use the provided array, if user provides the correct 
length. The argument about using (and re-using) a method so that a 
method reference can be passed to the method without any unchecked casts 
is equally true for any of the 3 alternatives shown - the method itself 
might need unchecked casts, but its usage not:


static List[] array(int len)
static Class> elementType()
static Class[]> arrayType()


But I can see that you want to align the API with Stream.toArray, while 
still providing the optimal implementation. It's just that the API 
doesn't fit the usecase. The function approach makes more sense in the 
Stream API where it is explicitly explained that it may be used to 
construct arrays needed to hold intermediary results of partitioned 
parallel execution too, but in Collection API it is usually the case to 
just provide a copy of the underlying data structure in the most optimal 
way (without pre-zeroing overhead) and for that purpose, 2nd and 3rd 
alternatives are a better fit.


Suppose Stream took a different approach and used the 2nd or 3rd 
approach (which is universal). Would then Collection API also get the 
same method?



Regards, Peter



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-15 Thread Tagir Valeev
Hello!

If you are going to create long-living array which is likely to be empty,
it's good to deduplicate them to spare the heap space. This can be easily
done via existing toArray method like
collection.toArray(MyClass.EMPTY_ARRAY) assuming we have an empty array
constant. We use this pattern very often in IntelliJ IDEA source code (e.
g. think of method which returns an array of Java member annotations; often
there are none). New generator methods is much less convenient for this:
toArray(s -> s == 0?MyClass.EMPTY_ARRAY:new MyClass[s]). I'm actually
missing a method accepting an empty array instead of generator in the
Stream API. And I don't think that new collection method is useful. The
existing one is perfect.

With best regards,
Tagir Valeev

пт, 15 июня 2018 г., 8:03 Stuart Marks :

> Hi all,
>
> I wanted to restart review of the changeset for bug JDK-8060192 [1]. The
> latest
> webrev is here: [2].
>
> The previous review was from December 2017: [3]. The only difference
> between the
> current changeset and the previous one is the removal of the overriding
> implementations (see explanation below). Much of the discussion in the
> previous
> review thread was around performance and the shape of the API.
>
> # Performance
>
> I previously had done some benchmarking on this and reported the results:
> [4].
> (I've recently re-done the benchmarks and conclusions are essentially the
> same.)
>
> The upshot is that implementations that use Arrays.copyOf() are the
> fastest,
> probably because it's an intrinsic. It can avoid zero-filling the freshly
> allocated array because it knows the entire array contents will be
> overwritten.
> This is true regardless of what the public API looks like. The default
> implementation calls generator.apply(0) to get a zero-length array and
> then
> simply calls toArray(T[]). This goes through the Arrays.copyOf() fast
> path, so
> it's essentially the same speed as toArray(new T[0]).
>
> The overrides I had previously provided in specific implementation classes
> like
> ArrayList actually are slower, because the allocation of the array is done
> separately from filling it. This necessitates the extra zero-filling step.
> Thus,
> I've removed the overrides.
>
> # API Shape
>
> There was some discussion about whether it would be preferable to have a
> Class
> parameter as a type token for the array component type or the array type
> itself,
> instead of using an IntFunction generator. Most of this boils down to what
> people are comfortable with. However, there are a few points that make the
> generator function approach preferable.
>
> One point in favor of using a generator is that Stream already has a
> similar
> toArray(generator) method.
>
> Comparing this to the type token alternatives, for simple tasks like
> converting
> Collection to String[], things are about equal:
>
>  toArray(String[]::new)
>  toArray(String.class)
>  toArray(String[].class)
>
> However, things are hairier if the element type of the collection is
> generic,
> such as Collection>. The result type is a generic array
> List[]. Using class literals or array constructor references will
> necessitate using an unchecked cast, because none of these can be used to
> represent a generic type.
>
> However, it's possible to use a method reference to provide a properly
> generic
> IntFunction. This would enable the toArray(generator) method to be used
> without
> any unchecked casts. (The method it refers to might need have an unchecked
> cast
> within it, though.) Such a method could also be reused for the
> corresponding
> Stream.toArray(generator) method.
>
> For these reasons I'd like to proceed with adding toArray(generator) API.
>
> Thanks,
>
> s'marks
>
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8060192
>
> [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/
>
> [3]
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html
>
> [4]
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html
>
>


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-15 Thread Remi Forax
- Mail original -
> De: "Stuart Marks" 
> À: "core-libs-dev" 
> Envoyé: Vendredi 15 Juin 2018 03:02:04
> Objet: RFR(s): 8060192: Add default method Collection.toArray(generator)

> Hi all,

Hi Stuart,

> 
> I wanted to restart review of the changeset for bug JDK-8060192 [1]. The 
> latest
> webrev is here: [2].
> 
> The previous review was from December 2017: [3]. The only difference between 
> the
> current changeset and the previous one is the removal of the overriding
> implementations (see explanation below). Much of the discussion in the 
> previous
> review thread was around performance and the shape of the API.
> 
> # Performance
> 
> I previously had done some benchmarking on this and reported the results: [4].
> (I've recently re-done the benchmarks and conclusions are essentially the 
> same.)
> 
> The upshot is that implementations that use Arrays.copyOf() are the fastest,
> probably because it's an intrinsic. It can avoid zero-filling the freshly
> allocated array because it knows the entire array contents will be 
> overwritten.
> This is true regardless of what the public API looks like. The default
> implementation calls generator.apply(0) to get a zero-length array and then
> simply calls toArray(T[]). This goes through the Arrays.copyOf() fast path, so
> it's essentially the same speed as toArray(new T[0]).
> 
> The overrides I had previously provided in specific implementation classes 
> like
> ArrayList actually are slower, because the allocation of the array is done
> separately from filling it. This necessitates the extra zero-filling step. 
> Thus,
> I've removed the overrides.


for ArrayList, you may use a code like this,
   T[] toArray(IntFunction generator) {
return (T[]) Arrays.copyOf(elementData, size, 
generator.apply(0).getClass());
  }
so you win only one comparison (yeah !), which can be perfectly predicted, so 
you should not see any perf difference :)

> 
> # API Shape
> 
> There was some discussion about whether it would be preferable to have a Class
> parameter as a type token for the array component type or the array type 
> itself,
> instead of using an IntFunction generator. Most of this boils down to what
> people are comfortable with. However, there are a few points that make the
> generator function approach preferable.
> 
> One point in favor of using a generator is that Stream already has a similar
> toArray(generator) method.
> 
> Comparing this to the type token alternatives, for simple tasks like 
> converting
> Collection to String[], things are about equal:
> 
> toArray(String[]::new)
> toArray(String.class)
> toArray(String[].class)
> 
> However, things are hairier if the element type of the collection is generic,
> such as Collection>. The result type is a generic array
> List[]. Using class literals or array constructor references will
> necessitate using an unchecked cast, because none of these can be used to
> represent a generic type.
> 
> However, it's possible to use a method reference to provide a properly generic
> IntFunction. This would enable the toArray(generator) method to be used 
> without
> any unchecked casts. (The method it refers to might need have an unchecked 
> cast
> within it, though.) Such a method could also be reused for the corresponding
> Stream.toArray(generator) method.

List.class or List[].class do not work either.

> 
> For these reasons I'd like to proceed with adding toArray(generator) API.
> 

so thumb up for me !

> Thanks,
> 
> s'marks

Rémi

> 
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8060192
> 
> [2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/
> 
> [3]
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html
> 
> [4]
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html


RFR(s): 8060192: Add default method Collection.toArray(generator)

2018-06-14 Thread Stuart Marks

Hi all,

I wanted to restart review of the changeset for bug JDK-8060192 [1]. The latest 
webrev is here: [2].


The previous review was from December 2017: [3]. The only difference between the 
current changeset and the previous one is the removal of the overriding 
implementations (see explanation below). Much of the discussion in the previous 
review thread was around performance and the shape of the API.


# Performance

I previously had done some benchmarking on this and reported the results: [4]. 
(I've recently re-done the benchmarks and conclusions are essentially the same.)


The upshot is that implementations that use Arrays.copyOf() are the fastest, 
probably because it's an intrinsic. It can avoid zero-filling the freshly 
allocated array because it knows the entire array contents will be overwritten. 
This is true regardless of what the public API looks like. The default 
implementation calls generator.apply(0) to get a zero-length array and then 
simply calls toArray(T[]). This goes through the Arrays.copyOf() fast path, so 
it's essentially the same speed as toArray(new T[0]).


The overrides I had previously provided in specific implementation classes like 
ArrayList actually are slower, because the allocation of the array is done 
separately from filling it. This necessitates the extra zero-filling step. Thus, 
I've removed the overrides.


# API Shape

There was some discussion about whether it would be preferable to have a Class 
parameter as a type token for the array component type or the array type itself, 
instead of using an IntFunction generator. Most of this boils down to what 
people are comfortable with. However, there are a few points that make the 
generator function approach preferable.


One point in favor of using a generator is that Stream already has a similar 
toArray(generator) method.


Comparing this to the type token alternatives, for simple tasks like converting 
Collection to String[], things are about equal:


toArray(String[]::new)
toArray(String.class)
toArray(String[].class)

However, things are hairier if the element type of the collection is generic, 
such as Collection>. The result type is a generic array 
List[]. Using class literals or array constructor references will 
necessitate using an unchecked cast, because none of these can be used to 
represent a generic type.


However, it's possible to use a method reference to provide a properly generic 
IntFunction. This would enable the toArray(generator) method to be used without 
any unchecked casts. (The method it refers to might need have an unchecked cast 
within it, though.) Such a method could also be reused for the corresponding 
Stream.toArray(generator) method.


For these reasons I'd like to proceed with adding toArray(generator) API.

Thanks,

s'marks


[1] https://bugs.openjdk.java.net/browse/JDK-8060192

[2] http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.3/

[3] 
http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050325.html

[4] 
http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050585.html



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-12 Thread Stuart Marks



On 12/7/17 5:03 PM, Martin Buchholz wrote:

(I'm still trying to love this new API)

The changes to the jsr166 tck are fine.

I'm not convinced that the new implementation for ArrayList is progress.  The 
current implementation of toArray(T[]) does


             // Make a new array of a's runtime type, but my contents:
             return (T[]) Arrays.copyOf(elementData, size, a.getClass());

which does not have to pay the cost of zeroing the array, so is potentially 
faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what does 
jmh say?)


If we're not getting type safety and not getting significantly better 
performance, all we have is a more natural API.  But toArray(IntFunction) also 
seems not very natural to me, and we'll have to live with the toArray(new 
String[0]) wart forever anyways.  (But is it really so bad?)  (Maybe 
toArray(Class) is actually more natural?)


A lot of ideas flying around here. (Thanks John! :-)) But for this message let 
me focus on performance.


I think implicitly there's some concern about the new API and whether it might 
suffer inherent performance disadvantages compared to the existing APIs. It 
certainly doesn't seem to provide any inherent advantages. I spent some time 
doing benchmarking more rigorously, and I was able to get similar results to 
those from Alexey Shipilëv's article:


https://shipilev.net/blog/2016/arrays-wisdom-ancients/

That is, I was able to discern the difference between the copy-to-Object[] case, 
the copy to presized array case, and copy to zero-sized array case. Also added a 
potential new API toArray(Class) for comparison.


Here are the results. For brevity, I ran only the tests with an ArrayList of 
size 1000:


Benchmark  (size) (type)  Mode  Cnt ScoreError  Units
ToArrayBench.clazz   1000  arraylist  avgt   30  1423.924 ± 11.437  ns/op
ToArrayBench.clazz0  1000  arraylist  avgt   30  1173.442 ± 11.614  ns/op
ToArrayBench.clazzDf 1000  arraylist  avgt   30  1179.648 ± 17.662  ns/op
ToArrayBench.lambda  1000  arraylist  avgt   30  1421.910 ± 21.320  ns/op
ToArrayBench.lambda0 1000  arraylist  avgt   30  1168.863 ± 11.109  ns/op
ToArrayBench.lambdaDf1000  arraylist  avgt   30  1168.372 ±  9.512  ns/op
ToArrayBench.simple  1000  arraylist  avgt   30   462.371 ±  8.693  ns/op
ToArrayBench.sized   1000  arraylist  avgt   30  1417.483 ± 11.451  ns/op
ToArrayBench.zero1000  arraylist  avgt   30  1182.932 ± 27.773  ns/op

Legend:

clazz - ArrayList override of toArray(Class)

T[] a = (T[])java.lang.reflect.Array.newInstance(Foo.class, size);
System.arraycopy(elementData, 0, a, 0, size);

clazz0 - ArrayList override of toArray(Class)

T[] a = (T[])java.lang.reflect.Array.newInstance(clazz, 0);
return (T[]) Arrays.copyOf(elementData, size, a.getClass());

classDf - Collection default method toArray(Class)

toArray((T[])java.lang.reflect.Array.newInstance(clazz, 0))

lambda - ArrayList override of toArray(IntFunc)

T[] a = generator.apply(size);
System.arraycopy(elementData, 0, a, 0, size);

lambda0 - ArrayList override of toArray(IntFunc)

(T[]) Arrays.copyOf(elementData, size, generator.apply(0).getClass())

lambdaDf - Collection default method toArray(IntFunc)

toArray(generator.apply(0))

simple - existing no-arg toArray() method

sized - existing toArray(T[]) called with presized array

coll.toArray(new Foo[coll.size()])

zero - existing toArray(T[]) called with zero-sized array

coll.toArray(new Foo[0])

--

A couple observations from this. First, "lambda", the ArrayList override of 
toArray(IntFunc), is slower than "lambda0" or "lambdaDf", both of which create 
zero-length arrays and pass them elsewhere. So Martin, you're right, the 
override I put into ArrayList isn't buying anything. I'll take it out. Oddly 
enough, just inheriting the default might be sufficient.


(And the same probably applies to Arrays$ArrayList as well.)

Second, setting aside "simple" case (which is fast because doesn't have to do 
any array store checking) we see performance falling into two buckets: one that 
takes around 1400ns, and the other that takes around 1170ns. It turns out that 
the faster cases all end up calling the Arrays.copyOf() method. This is an 
intrinsic that can avoid the work of zeroing the array, because it allocates the 
array immediately before filling it with data from the source.


I would have thought that allocating the array immediately prior to filling it 
with System.arraycopy would have done the same, but apparently not.


We can see the effect of intrinsics by disabling the Arrays.copyOf intrinsic by 
applying the -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_copyOf options:


Benchmark  (size) (type)  Mode  Cnt ScoreError  Units
ToArrayBench.clazz   1000  arraylist  avgt   30  1413.922 ±  9.570  ns/op
ToArrayBench.clazz0  1000  arraylist  avgt   30  

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-12 Thread David Lloyd
On Mon, Dec 11, 2017 at 8:00 PM, John Rose  wrote:
> I submit to you that such a factory is *not* an IntFunction, because
> that only creates half of an array (or 0.01% of one), the empty
> version that needs to be populated.  A natural array factory API
> [...]
> The interface would look something like one of these:
>
> interface ArrayContent { int size(); T get(int i); }
> interface ArrayContent { int size(); Iterator iterator(); }
> interface ArrayContent { int count(); Stream stream(); }
> interface ArrayContent { Spliterator elements(); } //must be SIZED

Wasn't it suggested somewhat recently that arrays themselves might be
retrofitted with this kind of interface?  It seems like it would make
sense for arrays to be able act as their own factories, bringing the
idea full circle.

-- 
- DML


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-11 Thread John Rose
On Dec 11, 2017, at 6:00 PM, John Rose  wrote:
> 
> class Arrays {
>  // people who want x.toArray can also use x.copy(Arrays::make):
>  static  T[] make(SequenceContent content) { … }

Correction; that one needs an array type to know what to make:

static  T[] make(SequenceContent content,
   Class arrayClass) { … }


> // ArrayList.copy uses this one internally as "myContent":
>  static  SequenceContent slice(T[] a, int start, int end) { … }
> }
> class List {
>  // people who want x.asUnmodifiableList can also use x.copy(List::make):
>  default SequenceCopier make(SequenceContent content) { … }
> 
>  // external access to a sized snapshot of a list:
>  default SequenceContent sizedContent() { … }
> }

Also, sizedContent is there for completeness but doesn't carry
as much weight as the other three.  It might be derived from
Collection::copy applied to the identity function, unless there
are scoping rules on the operand to Collection::copy.



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-11 Thread Stuart Marks




Frankly, I think the JDK is just sloppy here. Most of the existing
implementations didn't use @Override -- possibly because they were
introduced before annotations existed -- and later on, whoever implemented
the overrides of the Java 8 default methods decided to add @Override.

Part of it is that Doug and I are not huge fans of @Override.  Is the gain in 
safety really worth the added verbosity?


If we do decide to mandate the use of @Override in all of openjdk, then the 
errorprone project has a bug pattern check for you, which will provide more 
safety yet.


As I said, I don't think there's a rule that requires @Override. Thanks for 
verifying this. :-)


I'm not proposing adding such a rule either. I agree, it adds little benefit, at 
least for the JDK, and it adds clutter.


A quick grep reveals over 33,000 occurrences of @Override in the JDK sources. 
That seems like a lot, but if we were to add it to all locations where it could 
be applied, this might add hundreds if not thousands more occurrences. I'm not 
sure it's worth the effort to do either the additions or the removals that'd be 
necessary to make things consistent. Oh well.


s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-11 Thread John Rose
On Dec 7, 2017, at 5:03 PM, Martin Buchholz  wrote:
> 
> (I'm still trying to love this new API)
> 
> The changes to the jsr166 tck are fine.
> 
> I'm not convinced that the new implementation for ArrayList is progress.
> The current implementation of toArray(T[]) does
> 
>// Make a new array of a's runtime type, but my contents:
>return (T[]) Arrays.copyOf(elementData, size, a.getClass());
> 
> which does not have to pay the cost of zeroing the array, so is potentially
> faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what
> does jmh say?)

The VM does not do pre-zeroing.  We've done experiments over the years
with that and it has never beaten pre-garbaged memory, with zero filling
at creation time.  The reason is that we can vectorize the zero filling at
any time *and* omit zeroing at all in some important cases like copyOf.
OTOH, pre-zeroing assumes there is background bandwidth available
for pre-processing cache lines that are not yet in use, which isn't always
true; sometimes the mutator needs all the cache bandwidth.  All of which
is to say that the VM is likely *never* to do pre-zeroing, unless there is
some special hardware feature that does so, in bulk, in main memory.
Which AFAIK doesn't exist today.

This particular set of performance problems is (IMO) sensitive to the
number of passes over the input and output arrays, as well as over
any intermediate arrays.  Zeroing isn't a separate pass in the case
of copyOf, but in any two phase array creation protocol, zeroing *is*
likely to require a separate pass.  What do I mean by "two phase"?
I mean a scheme where in phase one an empty array is created,
and in phase two the array is filled with content.  If the two phases
are not reliably juxtaposed (as they are in copyOf) the VM must
conservatively fill the empty array with zeros.  Different code shapes
have (inherently) different inlining characteristics.  In the case of
toArray(IntFunction), three things must happen to drop the zeroing
pass (as in Arrays.copyOf):  First, the particular IntFunction must
fully inline (that's phase one), and second, the (phase two) loop
that populates the new array must be juxtaposed with the inlined
code of the IntFunction call, without intermediate logic that might
discourage the optimizer.  Third, both code shapes must use
intrinsics that are recognized by the optimizer.  These three
conditions are carefully engineered in the body of Arrays.copyOf,
but appear "emergently" ("accidentally") via the toArray(IF) API.
So the optimization isn't impossible, but it is subject to more risk,
and/or will require more optimizer work to support that code shape.

Bottom line:  Arrays.copyOf is not as arbitrary as one may think;
it is designed that way because it is easy to optimize.  And it covers
two interesting use cases in the present discussion, which are
(a) copy my internal array wholesale, and (b) make me a fresh
array from a zero-length exemplar.

> If we're not getting type safety and not getting significantly better
> performance, all we have is a more natural API.  But toArray(IntFunction)
> also seems not very natural to me,

Whether or not it bears on the present API discussion, I think it
would be fruitful to step back and ask ourselves the big picture
question here, "What is the natural shape of an array factory?"

I submit to you that such a factory is *not* an IntFunction, because
that only creates half of an array (or 0.01% of one), the empty
version that needs to be populated.  A natural array factory API
will provide two pieces of data:  First a size (an accurate, not a
provisional one like List.size), and second a series of values to
put in the array.  *That* structure is the natural, might I say
categorical, argument to an array factory.  Also a List factory.
As you can probably tell, I'm taking the value-based view of
things here.

(Proof by generalization:  If/when we eventually create frozen
arrays, and flattened arrays, and volatile arrays, and hybrid
instance-with-sized-tail arrays, it will not always be an option
to create the empty array in a first phase, and then fill it from
the outside.  Frozen arrays are the obvious counterexample,
but any immutable data structure will do, as will a private
object tail, even if mutable.)

The interface would look something like one of these:

interface ArrayContent { int size(); T get(int i); }
interface ArrayContent { int size(); Iterator iterator(); }
interface ArrayContent { int count(); Stream stream(); }
interface ArrayContent { Spliterator elements(); } //must be SIZED

That last suggests that we don't need a new API at all,
that perhaps the natural input to an array factory is just
a Spliterator with the SIZED characteristic, and a range
check on the estimateSize() result.  Or maybe a Stream.

There are obvious bootstrapping issues here, but I don't
think they are complex, because the source object can
create a tiny Stream or Spliterator 

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-08 Thread Peter Levart



On 12/08/2017 04:09 AM, David Lloyd wrote:

Yes!  It would be even better for the "toArray(Class)" case if
Class had a "getArrayClass()" method which returned the Class,
which would allow:

   return (T[]) Arrays.copyOf(elementData, size, componentType.getArrayClass());

and similar for other array-backed collections.  I never understood
why that method doesn't exist; in fact, am I wrong in understanding
that "Array.newInstance(clazz, 0).getClass()" is effectively the only
way to do this today?


I think there is a reason for Arrays.copyOf() to take an array type, not 
the component type. If it was taking component type, you could pass in 
primitive types too with no compile-time type checking, since int.class 
is of Class static type. So why not the following:


public interface Collection extends Iterable {

    @SuppressWarnings("unchecked")
    default  T[] toArray(Class arrayType) {
    return toArray((T[]) 
Array.newInstance(arrayType.getComponentType(), size()));

    }


ArrayList override could then simply be this:

    @Override
    public  T[] toArray(Class arrayType) {
    return Arrays.copyOf(elementData, size, arrayType);
    }


Regards, Peter



On Thu, Dec 7, 2017 at 7:03 PM, Martin Buchholz  wrote:

(I'm still trying to love this new API)

The changes to the jsr166 tck are fine.

I'm not convinced that the new implementation for ArrayList is progress.
The current implementation of toArray(T[]) does

 // Make a new array of a's runtime type, but my contents:
 return (T[]) Arrays.copyOf(elementData, size, a.getClass());

which does not have to pay the cost of zeroing the array, so is potentially
faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what
does jmh say?)

If we're not getting type safety and not getting significantly better
performance, all we have is a more natural API.  But toArray(IntFunction)
also seems not very natural to me, and we'll have to live with the
toArray(new String[0]) wart forever anyways.  (But is it really so bad?)
  (Maybe toArray(Class) is actually more natural?)

On Thu, Dec 7, 2017 at 2:58 PM, Stuart Marks 
wrote:


[Martin: please review changes to the JSR 166 TCK.]

Another updated webrev for this changeset:

 http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/

This includes an override of toArray(IntFunction) in the implementation of
Arrays.asList(), as requested by Tagir Valeev.

Overrides are also added for various wrapper classes in j.u.Collections.
Turns out there's a test test/jdk/java/util/Collections/Wrappers.java
that checks to ensure that the wrappers don't inherit any default methods.
(This has been a source of bugs in the past.)

Significantly, this webrev also includes changes to several tests in the
JSR 166 TCK. The issue is that these tests have cases for null handling,
where they call

 coll.toArray(null)

to ensure that NPE is thrown. Given that this method is now overloaded:

 toArray(T[])
 toArray(IntFunction)

passing null is now ambiguous and thus generates a compiler error. I've
modified the tests to call toArray((Object[])null) which is a bit of a
stopgap. I can't think of anything obviously better to do, though.

s'marks








Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-07 Thread Martin Buchholz
On Thu, Dec 7, 2017 at 4:47 PM, Stuart Marks 
wrote:

>
>
> On 12/7/17 3:50 PM, Jonathan Bluett-Duncan wrote:
>
> Looking over http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/
>> src/java.base/share/classes/java/util/ArrayList.java.cdiff.html, I'm
>> wondering if the method `ArrayList.toArray(IntFunction)` should have an
>> `@Override` to make it clear that it's overriding the default method in
>> Collection. What do you think?
>>
>
> I guess it wouldn't hurt. I had thought about adding it, but most of the
> methods in ArrayList that override implementations from supertypes *don't*
> have @Override. Looking more closely, it appears that the ones that
> override interface default methods *do* have @Override.
>
> I don't think there's a rule that says that @Override should be used when
> overriding interface default methods but not when overriding
> implementations from a superclass or when implementing an abstract
> interface method. Frankly, I think the JDK is just sloppy here. Most of the
> existing implementations didn't use @Override -- possibly because they were
> introduced before annotations existed -- and later on, whoever implemented
> the overrides of the Java 8 default methods decided to add @Override.
>

Part of it is that Doug and I are not huge fans of @Override.  Is the gain
in safety really worth the added verbosity?

If we do decide to mandate the use of @Override in all of openjdk, then the
errorprone project has a bug pattern check for you, which will provide more
safety yet.


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-07 Thread David Lloyd
Yes!  It would be even better for the "toArray(Class)" case if
Class had a "getArrayClass()" method which returned the Class,
which would allow:

  return (T[]) Arrays.copyOf(elementData, size, componentType.getArrayClass());

and similar for other array-backed collections.  I never understood
why that method doesn't exist; in fact, am I wrong in understanding
that "Array.newInstance(clazz, 0).getClass()" is effectively the only
way to do this today?

On Thu, Dec 7, 2017 at 7:03 PM, Martin Buchholz  wrote:
> (I'm still trying to love this new API)
>
> The changes to the jsr166 tck are fine.
>
> I'm not convinced that the new implementation for ArrayList is progress.
> The current implementation of toArray(T[]) does
>
> // Make a new array of a's runtime type, but my contents:
> return (T[]) Arrays.copyOf(elementData, size, a.getClass());
>
> which does not have to pay the cost of zeroing the array, so is potentially
> faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what
> does jmh say?)
>
> If we're not getting type safety and not getting significantly better
> performance, all we have is a more natural API.  But toArray(IntFunction)
> also seems not very natural to me, and we'll have to live with the
> toArray(new String[0]) wart forever anyways.  (But is it really so bad?)
>  (Maybe toArray(Class) is actually more natural?)
>
> On Thu, Dec 7, 2017 at 2:58 PM, Stuart Marks 
> wrote:
>
>> [Martin: please review changes to the JSR 166 TCK.]
>>
>> Another updated webrev for this changeset:
>>
>> http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/
>>
>> This includes an override of toArray(IntFunction) in the implementation of
>> Arrays.asList(), as requested by Tagir Valeev.
>>
>> Overrides are also added for various wrapper classes in j.u.Collections.
>> Turns out there's a test test/jdk/java/util/Collections/Wrappers.java
>> that checks to ensure that the wrappers don't inherit any default methods.
>> (This has been a source of bugs in the past.)
>>
>> Significantly, this webrev also includes changes to several tests in the
>> JSR 166 TCK. The issue is that these tests have cases for null handling,
>> where they call
>>
>> coll.toArray(null)
>>
>> to ensure that NPE is thrown. Given that this method is now overloaded:
>>
>> toArray(T[])
>> toArray(IntFunction)
>>
>> passing null is now ambiguous and thus generates a compiler error. I've
>> modified the tests to call toArray((Object[])null) which is a bit of a
>> stopgap. I can't think of anything obviously better to do, though.
>>
>> s'marks
>>



-- 
- DML


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-07 Thread Martin Buchholz
(I'm still trying to love this new API)

The changes to the jsr166 tck are fine.

I'm not convinced that the new implementation for ArrayList is progress.
The current implementation of toArray(T[]) does

// Make a new array of a's runtime type, but my contents:
return (T[]) Arrays.copyOf(elementData, size, a.getClass());

which does not have to pay the cost of zeroing the array, so is potentially
faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what
does jmh say?)

If we're not getting type safety and not getting significantly better
performance, all we have is a more natural API.  But toArray(IntFunction)
also seems not very natural to me, and we'll have to live with the
toArray(new String[0]) wart forever anyways.  (But is it really so bad?)
 (Maybe toArray(Class) is actually more natural?)

On Thu, Dec 7, 2017 at 2:58 PM, Stuart Marks 
wrote:

> [Martin: please review changes to the JSR 166 TCK.]
>
> Another updated webrev for this changeset:
>
> http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/
>
> This includes an override of toArray(IntFunction) in the implementation of
> Arrays.asList(), as requested by Tagir Valeev.
>
> Overrides are also added for various wrapper classes in j.u.Collections.
> Turns out there's a test test/jdk/java/util/Collections/Wrappers.java
> that checks to ensure that the wrappers don't inherit any default methods.
> (This has been a source of bugs in the past.)
>
> Significantly, this webrev also includes changes to several tests in the
> JSR 166 TCK. The issue is that these tests have cases for null handling,
> where they call
>
> coll.toArray(null)
>
> to ensure that NPE is thrown. Given that this method is now overloaded:
>
> toArray(T[])
> toArray(IntFunction)
>
> passing null is now ambiguous and thus generates a compiler error. I've
> modified the tests to call toArray((Object[])null) which is a bit of a
> stopgap. I can't think of anything obviously better to do, though.
>
> s'marks
>


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-07 Thread Stuart Marks



On 12/7/17 3:50 PM, Jonathan Bluett-Duncan wrote:

Looking over 
http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/src/java.base/share/classes/java/util/ArrayList.java.cdiff.html, 
I'm wondering if the method `ArrayList.toArray(IntFunction)` should have an 
`@Override` to make it clear that it's overriding the default method in 
Collection. What do you think?


I guess it wouldn't hurt. I had thought about adding it, but most of the methods 
in ArrayList that override implementations from supertypes *don't* have 
@Override. Looking more closely, it appears that the ones that override 
interface default methods *do* have @Override.


I don't think there's a rule that says that @Override should be used when 
overriding interface default methods but not when overriding implementations 
from a superclass or when implementing an abstract interface method. Frankly, I 
think the JDK is just sloppy here. Most of the existing implementations didn't 
use @Override -- possibly because they were introduced before annotations 
existed -- and later on, whoever implemented the overrides of the Java 8 default 
methods decided to add @Override.


s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-07 Thread Jonathan Bluett-Duncan
Hi Stuart,

Looking over
http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/src/java.base/share/classes/java/util/ArrayList.java.cdiff.html,
I'm wondering if the method `ArrayList.toArray(IntFunction)` should have an
`@Override` to make it clear that it's overriding the default method in
Collection. What do you think?

Cheers,
Jonathan

On 7 December 2017 at 22:58, Stuart Marks  wrote:

> [Martin: please review changes to the JSR 166 TCK.]
>
> Another updated webrev for this changeset:
>
> http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/
>
> This includes an override of toArray(IntFunction) in the implementation of
> Arrays.asList(), as requested by Tagir Valeev.
>
> Overrides are also added for various wrapper classes in j.u.Collections.
> Turns out there's a test test/jdk/java/util/Collections/Wrappers.java
> that checks to ensure that the wrappers don't inherit any default methods.
> (This has been a source of bugs in the past.)
>
> Significantly, this webrev also includes changes to several tests in the
> JSR 166 TCK. The issue is that these tests have cases for null handling,
> where they call
>
> coll.toArray(null)
>
> to ensure that NPE is thrown. Given that this method is now overloaded:
>
> toArray(T[])
> toArray(IntFunction)
>
> passing null is now ambiguous and thus generates a compiler error. I've
> modified the tests to call toArray((Object[])null) which is a bit of a
> stopgap. I can't think of anything obviously better to do, though.
>
> s'marks
>


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-07 Thread Stuart Marks

[Martin: please review changes to the JSR 166 TCK.]

Another updated webrev for this changeset:

http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/

This includes an override of toArray(IntFunction) in the implementation of 
Arrays.asList(), as requested by Tagir Valeev.


Overrides are also added for various wrapper classes in j.u.Collections. Turns 
out there's a test test/jdk/java/util/Collections/Wrappers.java that checks to 
ensure that the wrappers don't inherit any default methods. (This has been a 
source of bugs in the past.)


Significantly, this webrev also includes changes to several tests in the JSR 166 
TCK. The issue is that these tests have cases for null handling, where they call


coll.toArray(null)

to ensure that NPE is thrown. Given that this method is now overloaded:

toArray(T[])
toArray(IntFunction)

passing null is now ambiguous and thus generates a compiler error. I've modified 
the tests to call toArray((Object[])null) which is a bit of a stopgap. I can't 
think of anything obviously better to do, though.


s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-06 Thread Stuart Marks



On 12/5/17 8:41 PM, Bernd Eckenfels wrote:

Should the test also check for the case the generator returns under- and 
oversized arrays?


Did you mean that the default implementation (and various overriding 
implementations) of toArray(generator) should do checks on the array that's 
returned from the generator? If so, I'm skeptical of the utility of adding such 
checks.


If the array is too short, ArrayIndexOutOfBoundsException will occur, so no 
check is necessary. If the array is too long, it will work, but the application 
might have trouble ascertaining the number of elements copied. Presumably it 
wouldn't create a too-long array unless it can deal with this. In any case, it 
doesn't necessarily seem like this should be an error.



The default impl looks less efficient than (new T[0]), should it really be 
removed as a major Javadoc example? (Or maybe add more specific implementations 
besides ArrayList?)


Although my benchmarks were inconclusive, it doesn't seem like the extra call to 
the generator with zero should be a performance problem. Even if this is a 
problem, it will disappear as more overrides are added to the system.


s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-06 Thread Stuart Marks

On 12/5/17 6:54 PM, Tagir Valeev wrote:

I think, CopyOnWriteArrayList and a List returned by Arrays.asList
deserve specialized implementation as well.


Sure, I can add one to Array.asList right away (as part of this changeset). It's 
even covered by tests already.


I'll work with Martin to update COWAL. Depending on how he wants to handle it, 
this might be done separately.


s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread Bernd Eckenfels
Should the test also check for the case the generator returns under- and 
oversized arrays?

The default impl looks less efficient than (new T[0]), should it really be 
removed as a major Javadoc example? (Or maybe add more specific implementations 
besides ArrayList?)

Gruss
Bernd
--
http://bernd.eckenfels.net

From: core-libs-dev <core-libs-dev-boun...@openjdk.java.net> on behalf of 
Stuart Marks <stuart.ma...@oracle.com>
Sent: Wednesday, December 6, 2017 2:27:44 AM
To: core-libs-dev
Subject: Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Revised webrev based on comments from Martin Buchholz and Paul Sandoz:

 http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/

I've done a bit of rewriting of the @apiNote sections to clarify the intended
use of the various toArray() methods.

s'marks



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread Tagir Valeev
Hello!

I think, CopyOnWriteArrayList and a List returned by Arrays.asList
deserve specialized implementation as well.

With best regards,
Tagir Valeev.

On Wed, Dec 6, 2017 at 8:27 AM, Stuart Marks  wrote:
> Revised webrev based on comments from Martin Buchholz and Paul Sandoz:
>
> http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/
>
> I've done a bit of rewriting of the @apiNote sections to clarify the
> intended use of the various toArray() methods.
>
> s'marks
>


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread Stuart Marks




The signature of the proposed generator is


public  T[] toArray(IntFunction generator)


So I don't think that anything has really changed in this regard; T
can still be unrelated to E.


Right, the new method doesn't offer any safety compared to toArray(T[]). You can 
still get ArrayStoreException if T is incompatible with E. That's in part what 
led me to update the @throws ArrayStoreException text, since the previous text 
was basically wrong. But fundamentally the problem is unchanged.


There's a related enhancement request

https://bugs.openjdk.java.net/browse/JDK-7023484
add typesafe static Collections.toArray(Collection, 
IntFunction)


that would provide static type safety for this operation. Unclear whether it's 
worthwhile to add something like this, though.



That said, sending in a constant String[0] is probably just as good as
a generator, or better (in my naive estimation the tradeoff is a >0
comparison versus an interface method dispatch), than an i -> new
String[i] unless some kind of benchmark would appear which disproves
that hypothesis.


I did some quick benchmarking, and it looks like String[]::new is a shade faster 
(heh) than new String[size], but not quite as fast as new String[0]. But take 
these results with a grain of salt. I was doing the benchmarking on my laptop 
with my entire environment running, and I was also measuring my development 
fastdebug build. I wasn't able to see the same difference between new 
String[size] and new String[0] that Shipilëv observed.[1] It would be 
interesting to get some rigorous benchmarks to compare these, but I haven't yet 
taken the time to do this.


I don't think the main point of this is performance, though. I think the new 
preferred way to create an array of a particular type should be 
toArray(Foo[]::new), which is preferable to toArray(new Foo[0]) or 
toArray(EMPTY_FOO_ARRAY). I suppose those of us "in the know" would recognize 
toArray(new Foo[0]) to be idiomatic. But to understand it, you have to fully 
understand the semantics of toArray(T[]), and you have to have read Shipilëv's 
article to understand why creating an apparently wasted zero-sized array is 
(usually) the right thing. That seems pretty obscure.


s'marks


[1] https://shipilev.net/blog/2016/arrays-wisdom-ancients/


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread Stuart Marks

Revised webrev based on comments from Martin Buchholz and Paul Sandoz:

http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/

I've done a bit of rewriting of the @apiNote sections to clarify the intended 
use of the various toArray() methods.


s'marks



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread Paul Sandoz


> On 5 Dec 2017, at 11:19, Stuart Marks  wrote:
> 
> 
> 
> On 12/5/17 10:47 AM, Paul Sandoz wrote:
>>  345  * Suppose {@code x} is a collection known to contain only 
>> strings.
>>  346  * The following code can be used to dump the collection into a 
>> newly
>>  347  * allocated array of {@code String}:
>> Make it an API note? (same for toArray(T[])
> 
> Will do.
> 
>>  352  * @implSpec
>>  353  * The default implementation calls the generator function with zero
>>  354  * and then passes the resulting array to {@link #toArray(T[])}.
>> That’s reasonable. I pondered about being vague and specifying a value in 
>> the range if [0, size()), to allow wiggle room for using 0 or size().
> 
> No, I really think it has to be exactly zero. If the default were to choose 
> some value K < size(), the collection size could change to be smaller than K 
> after allocation but before the call to toArray(T[]), resulting in an array 
> that is unexpectedly large.

Ah yes, concurrent collections.

Paul.

> 
>>  360  * @throws ArrayStoreException if the runtime type of any element 
>> in this
>>  361  * collection is not assignable to the {@linkplain 
>> Class#getComponentType
>>  362  * runtime component type} of array returned by the 
>> generator function
>> s/of array returned by the generator function/of the generated array/ ?
> 
> Will do.
> 
> s'marks



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread Stuart Marks



On 12/5/17 10:47 AM, Paul Sandoz wrote:


  345  * Suppose {@code x} is a collection known to contain only strings.
  346  * The following code can be used to dump the collection into a newly
  347  * allocated array of {@code String}:

Make it an API note? (same for toArray(T[])


Will do.


  352  * @implSpec
  353  * The default implementation calls the generator function with zero
  354  * and then passes the resulting array to {@link #toArray(T[])}.

That’s reasonable. I pondered about being vague and specifying a value in the 
range if [0, size()), to allow wiggle room for using 0 or size().


No, I really think it has to be exactly zero. If the default were to choose some 
value K < size(), the collection size could change to be smaller than K after 
allocation but before the call to toArray(T[]), resulting in an array that is 
unexpectedly large.



  360  * @throws ArrayStoreException if the runtime type of any element in 
this
  361  * collection is not assignable to the {@linkplain 
Class#getComponentType
  362  * runtime component type} of array returned by the generator 
function

s/of array returned by the generator function/of the generated array/ ?


Will do.

s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread Paul Sandoz


> On 4 Dec 2017, at 19:20, Stuart Marks  wrote:
> 
> Hi all,
> 
> Please review this small enhancement to add a default method 
> Collection.toArray(generator) that takes a function that creates the 
> destination array. This is analogous to Stream.toArray(generator).
> 
> Bug:
> 
>https://bugs.openjdk.java.net/browse/JDK-8060192
> 

 345  * Suppose {@code x} is a collection known to contain only strings.
 346  * The following code can be used to dump the collection into a newly
 347  * allocated array of {@code String}:

Make it an API note? (same for toArray(T[])


 352  * @implSpec
 353  * The default implementation calls the generator function with zero
 354  * and then passes the resulting array to {@link #toArray(T[])}.

That’s reasonable. I pondered about being vague and specifying a value in the 
range if [0, size()), to allow wiggle room for using 0 or size().


 360  * @throws ArrayStoreException if the runtime type of any element in 
this
 361  * collection is not assignable to the {@linkplain 
Class#getComponentType
 362  * runtime component type} of array returned by the generator 
function

s/of array returned by the generator function/of the generated array/ ?

Paul.

> Webrev:
> 
>http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/
> 
> Thanks,
> 
> s'marks



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-05 Thread David Lloyd
On Tue, Dec 5, 2017 at 1:03 AM, Remi Forax  wrote:
> Dumping an ArrayList into an array of String is fairly frequent, i 
> think.
>
> The main issue with the current API,  T[] toArray(T[] array), is that T 
> can be unrelated to E (the type of the element in the collection) so one can 
> write
>   ArrayList list = ...
>   String[] array = list.toArray(new String[0]);
> it may even works if the list is empty.
>
> It's getting worst if E and T can be a primitive type (with valhalla), 
> because toArray(T[]) as to support all combinations.
>
> So in my opinion, introducing toArray(generator) is a step in the right 
> direction.

The signature of the proposed generator is

> public  T[] toArray(IntFunction generator)

So I don't think that anything has really changed in this regard; T
can still be unrelated to E.

That said, sending in a constant String[0] is probably just as good as
a generator, or better (in my naive estimation the tradeoff is a >0
comparison versus an interface method dispatch), than an i -> new
String[i] unless some kind of benchmark would appear which disproves
that hypothesis.
-- 
- DML


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-04 Thread Stuart Marks
 
+// no spec changes relative to supertype

+public  T[] toArray(IntFunction generator) {

You probably at least need @inheritDoc for the unchecked exception throws (as 
I've forgotten many times)



There's a new javadoc option "--override-methods summary" that was recently 
added that we're using now. If an overriding method has no doc comment, it's 
simply listed in the "Methods inherited from" section at the bottom of the 
summary table. (It's now called "Methods declared in".)


See it in action here:

http://download.java.net/java/jdk10/docs/api/java/util/Properties.html

Compare it to the JDK 9 version of Properties.

(This suggests a big cleanup in collections, and probably many other areas, 
where docs were copied into subclasses for no good reason, or because they 
wanted to disinherit stuff that properly belongs in @implSpec.)



The needToWorkAround6260652 changes ought to be in a separate changeset.


OK, I can pull these out.


The biggest question is whether Collection.toArray(generator) pulls its weight, 
especially in view of https://shipilev.net/blog/2016/arrays-wisdom-ancients.

I rarely want to dump elements into a typed array.  Dumping into Object[] with 
toArray() is just fine for me (but I'm a biased core library developer).


Yeah, most collections want to use Object[]. But when you want a typed array, 
this is really helpful.


Oh wait - Aleksey actually links to this bug!  Sounds like he would be the most 
qualified reviewer!


:-)

s'marks


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-04 Thread Remi Forax
Hi Martin,

- Mail original -
> De: "Martin Buchholz" <marti...@google.com>
> À: "Stuart Marks" <stuart.ma...@oracle.com>
> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
> Envoyé: Mardi 5 Décembre 2017 05:26:02
> Objet: Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

> The needToWorkAround6260652 changes ought to be in a separate changeset.
> 
> The biggest question is whether Collection.toArray(generator) pulls its
> weight, especially in view of
> https://shipilev.net/blog/2016/arrays-wisdom-ancients.
> 
> I rarely want to dump elements into a typed array.  Dumping into Object[]
> with toArray() is just fine for me (but I'm a biased core library
> developer).

Dumping an ArrayList into an array of String is fairly frequent, i 
think.

The main issue with the current API,  T[] toArray(T[] array), is that T can 
be unrelated to E (the type of the element in the collection) so one can write
  ArrayList list = ...
  String[] array = list.toArray(new String[0]); 
it may even works if the list is empty.

It's getting worst if E and T can be a primitive type (with valhalla), because 
toArray(T[]) as to support all combinations.

So in my opinion, introducing toArray(generator) is a step in the right 
direction.

cheers,
Rémi


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-04 Thread Martin Buchholz
On Mon, Dec 4, 2017 at 8:26 PM, Martin Buchholz  wrote:

>
> The biggest question is whether Collection.toArray(generator) pulls its
> weight, especially in view of https://shipilev.net/blog/
> 2016/arrays-wisdom-ancients.
>

Oh wait - Aleksey actually links to this bug!  Sounds like he would be the
most qualified reviewer!


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-04 Thread Martin Buchholz
The needToWorkAround6260652 changes ought to be in a separate changeset.

The biggest question is whether Collection.toArray(generator) pulls its
weight, especially in view of
https://shipilev.net/blog/2016/arrays-wisdom-ancients.

I rarely want to dump elements into a typed array.  Dumping into Object[]
with toArray() is just fine for me (but I'm a biased core library
developer).


Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-04 Thread Martin Buchholz
+// no spec changes relative to supertype
+public  T[] toArray(IntFunction generator) {

You probably at least need @inheritDoc for the unchecked exception throws
(as I've forgotten many times)

On Mon, Dec 4, 2017 at 7:20 PM, Stuart Marks 
wrote:

> Hi all,
>
> Please review this small enhancement to add a default method
> Collection.toArray(generator) that takes a function that creates the
> destination array. This is analogous to Stream.toArray(generator).
>
> Bug:
>
> https://bugs.openjdk.java.net/browse/JDK-8060192
>
> Webrev:
>
> http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/
>
> Thanks,
>
> s'marks
>


RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-04 Thread Stuart Marks

Hi all,

Please review this small enhancement to add a default method 
Collection.toArray(generator) that takes a function that creates the destination 
array. This is analogous to Stream.toArray(generator).


Bug:

https://bugs.openjdk.java.net/browse/JDK-8060192

Webrev:

http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/

Thanks,

s'marks