This is an interesting problem.

> > for (; i < hi; ++i) {
> >         @SuppressWarnings("unchecked")
> >         E e = lst.get(i);
> >         action.accept(e);
> >         checkForComodification(mc);
> > }
> > return;
>
> For better performance ArrayList.spliterator() does not check for 
> comodification on every iteration. I guess, here it's also possible to do 
> this. Note that as lst.get() and especially action.accept() will unlikely to 
> be inlined by JIT as such call sites are usually polymorphic in real 
> applications. As a consequence, checkForComodification cannot be optimized to 
> eliminate field load and instanceof check, so these operations will be 
> performed on every iteration. forEachRemaining() implementation avoids this 
> by storing field into local variable lst, but checkForComodification uses 
> field, not local.

The challenge here is that, unless we check comodification for every iteration, 
list.get(i) can throw IndexOutOfBoundsException when commodification (e.g. 
remove()) happens. That's what I faced in SpliteratorLateBindingFailFastTest. 
ArrayList can avoid this by leveraging its backing array. We don't have access 
to underlying data structure for generic List/AbstractList, so it's challenging 
to take similar approach. Appreciate any advice to address this!  

As for JIT optimization, would it help if checkForComodification() method takes 
a list as a parameter intead of always using field list, and we pass local 
variable lst within forEachRemaining() method?
        
Thanks,
Hiroshi

-----Original Message-----
From: Tagir F. Valeev [mailto:amae...@gmail.com] 
Sent: Tuesday, May 31, 2016 1:45 PM
To: Ito, Hiroshi [Tech]; Paul Sandoz
Cc: Raab, Donald [Tech]; core-libs-dev@openjdk.java.net; Chan, Sunny [Tech]
Subject: Re: RFR: Implement RandomAccess spliterator

Hello!

(disclaimer: I'm not an official reviewer)

> ((AbstractList) lst).modCount;

Raw-type casts should be replaced with AbstractList<?> to avoid warning.

> public RandomAccessSpliterator<E> trySplit()

Covariant return type seems to be unnecessary here as this spliterator is not 
public and covariant type is not used anywhere. On the other hand it generates 
a bridge method increasing class file size by ~100 bytes. I would prefer 
"public Spliterator<E> trySplit()" here.

> RandomAccessSpliterator(List<E> list) {
>     this(list, 0, -1, list instanceof AbstractList ? ((AbstractList) 
> list).modCount : 0);
>     assert list instanceof RandomAccess; }

I don't see why modCount is requested here from original list. Anyways it will 
be reinitialized in getFence(). It seems that using "this(list, 0, -1, 0)" 
would be just as fine.

> for (; i < hi; ++i) {
>         @SuppressWarnings("unchecked")
>         E e = lst.get(i);
>         action.accept(e);
>         checkForComodification(mc);
> }
> return;

For better performance ArrayList.spliterator() does not check for 
comodification on every iteration. I guess, here it's also possible to do this. 
Note that as lst.get() and especially action.accept() will unlikely to be 
inlined by JIT as such call sites are usually polymorphic in real applications. 
As a consequence, checkForComodification cannot be optimized to eliminate field 
load and instanceof check, so these operations will be performed on every 
iteration. forEachRemaining() implementation avoids this by storing field into 
local variable lst, but checkForComodification uses field, not local.

Also note that AbstractList.subList().spliterator() should probably also be 
redefined (I guess, Paul already mentioned this). It should be linked to the 
original List, not to the subList to check comodification against the original 
list and also to reduce the indirection in .get() calls.

I wonder if it's really necessary to go down to the List.spliterator(). 
Probably it would be ok to leave
List.spliterator() as is, but redefine AbstractList.spliterator() only (so only 
implementations derived from AbstractList will benefit, but most of real-life 
lists extend AbstractList). This would eliminate those instanceof checks.

With best regards,
Tagir Valeev.


IH> Hi Paul,

IH> Thank you very much for your feedback, and apologize for the delay in 
response..

IH> I have incorporated your comments in the attached patch. Please 
IH> kindly take a look and let me know if you have further feedback.

IH> - Modified @implSpec to add a description in case the list implements 
RandomAccess.
IH> - Added modification checking in RandomAccessSpliterator, only when 
IH> the list implements AbstractList.

IH> Sorry for the confusion about SubList. It was all about the 
IH> concurrent modification checking, so adding the change above pretty 
IH> much addressed my issue earlier.

IH> Thanks,
IH> Hiroshi

IH> -----Original Message-----
IH> From: Paul Sandoz [mailto:paul.san...@oracle.com]
IH> Sent: Thursday, May 12, 2016 8:46 PM
IH> To: Ito, Hiroshi [Tech]
IH> Cc: core-libs-dev@openjdk.java.net; Chan, Sunny [Tech]; Raab, Donald 
IH> [Tech]
IH> Subject: Re: RFR: Implement RandomAccess spliterator

IH> Hi Hiroshi,

IH> This is a good example of what seems like a small feature and yet 
IH> there are some unexpected complexities :-)


IH> We will need to refine the implementation specification of 
IH> List.spliterator, which currently states:

IH> * @implSpec
IH> * The default implementation creates a
IH> * <em><a href="Spliterator.html#binding">late-binding</a></em> 
IH> spliterator
IH> * from the list's {@code Iterator}.  The spliterator inherits the
IH> * <em>fail-fast</em> properties of the list's iterator.


IH> Since the existing implementation is derived from the iterator:

IH>   @Override
IH>   default Spliterator<E> spliterator() {
IH>       return Spliterators.spliterator(this, Spliterator.ORDERED);
IH>   }

IH> concurrent modification checking will occur. The 
IH> RandomAccessSpliterator should also support modification checking, 
IH> which i think requires it be an inner class to check co-mod state.


IH> I am struggling to understand the points you make about the 
IH> spliterator of a sub-list of a Vector being required to be an 
IH> iterator-based implementation. Since AbstractList.SubList can access 
IH> a Vector's elements through List.get/set why cannot RandomAccessSpliterator?

IH> If we want to support random access spliterators on sub-lists i 
IH> think we would anyway need to override the spliterator method to 
IH> check for concurrent modification (as is the case of the iterator method).

IH> Paul.

>> On 11 May 2016, at 11:25, Ito, Hiroshi <hiroshi....@gs.com> wrote:
>> 
>> Hi,
>> 
>> Please kindly review the attached patch for RandomAccessSpliterator 
>> implementation.
>> 
>> Currently default implementation of spliterator is an IteratorSpliterator 
>> which is not optimal for RandomAccess data structures (besides ArrayList and 
>> Vector). This patch is to provide a default RandomAccessSpliterator 
>> implementation for RandomAccess data structure.
>> 
>> The figures below demonstrate the performance difference before and after 
>> the change. Note the significant performance improvement in test 
>> SelectTest.parallel_lazy_streams_gsc (parallel streams performance test for 
>> non-JDK Lists that implement RandomAccess but don't yet implement their own 
>> spliterators).
>> 
>> Benchmark code: 
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_goldm
>> ansachs_gs-2Dcollections_blob_master_jmh-2Dtests_src_main_java_com_gs
>> _collections_impl_jmh_SelectTest.java&d=CwIDAw&c=7563p3e2zaQw0AB1wrFV
>> gyagb2IE5rTZOYPxLxfZlX4&r=VPf6N-bfLPGZK6y8EINpNH-yzIbvHVyPq7pd9K2ZmWU
>> &m=xwrjbGjxiPZsTHK8ertu4z2d_OlvBThrjZLWNGPXVeY&s=DFU5c5DqPw66D7n_0iYd
>> 9nef2a3g5kzNnOtEnsWZ2YY&e=
>> 
>> With IteratorSpliterator as default:
>> 
>> Benchmark                              Mode  Cnt    Score    Error  Units
>> SelectTest.parallel_lazy_jdk          thrpt   20  172.671 ± 14.231  ops/s
>> SelectTest.parallel_lazy_streams_gsc  thrpt   20   20.662 ±  0.493  ops/s
>> SelectTest.serial_lazy_jdk            thrpt   20   89.384 ±  4.431  ops/s
>> SelectTest.serial_lazy_streams_gsc    thrpt   20   80.831 ±  7.875  ops/s
>> 
>> With RandomAccessSpliterator as default:
>> 
>> Benchmark                              Mode  Cnt    Score    Error  Units
>> SelectTest.parallel_lazy_jdk          thrpt   20  174.190 ± 16.870  ops/s
>> SelectTest.parallel_lazy_streams_gsc  thrpt   20  180.740 ±  9.594  ops/s
>> SelectTest.serial_lazy_jdk            thrpt   20   85.719 ±  2.414  ops/s
>> SelectTest.serial_lazy_streams_gsc    thrpt   20   78.760 ±  1.029  ops/s
>> 
>> Majority of the patch is contributed by Paul Sandoz and he should be 
>> credited in the Contributed-by field.
>> 
>> Along with this patch submission, we have a question for SubList spliterator 
>> implementation that we retained old behavior for now (i.e. return 
>> IteratorSpliterator, refer to RandomAccessSubList#spliterator()). We have 
>> found that Vector's subList is wrapped by RandomAccessSubList, that is 
>> RandomAccess but not a Vector anymore, and it expects to use 
>> IteratorSpliterator. We were not sure what's the best approach to 
>> distinguish Vector from other RandomAccess data structure within 
>> RandomAccessSublist, so we kept it return IteratorSpliterator for now.
>> 
>> One approach could be to introduce VectorSubList that returns 
>> IteratorSpliterator (or an implementation similar to VectorSpliterator?). 
>> Then we could revert the spliterator() override in RandomAccessSublist.
>> 
>> What would be your suggestion to handle this?
>> 
>> Depending on your suggestion, we might fix the subList spliterator in this 
>> patch, or submit a separate patch if the amount of the change is significant.
>> 
>> Thanks,
>> Hiroshi
>> <RandomAccessSpliterator.patch.txt>

Reply via email to