Since you replied to my message rather than to David's, I assume you had seen 
the [issue] I linked.

Masquerading an arbitrary binary search as a binary search on a list -- a 
random access list, as Louis noted correctly -- can do something for you:

        class Element implements Comparable<Element> { /* ... */ }

        class MyList extends AbstractList<Element>
                implements java.util.RandomAccess {

            @Override
            public int size() {
                return Integer.MAX_VALUE;
            }

            @Override
            public Element get(int index) {
                return /* arbitrary computation */
            }
        }

        int idx = Collections.binarySearch(new MyList().subList(0, 1024),
                new Element( /* ... */ ));

But it's a trick which can do only so much; for instance:

  - it cannot help you find a lower or an upper bound (i.e. the left-most or 
right-most occurrence of a non-unique key),
  - it is hard-to-read and error-prone in its own way

[issue]: https://bugs.openjdk.org/browse/JDK-8326330

> On 15 May 2024, at 23:19, Mateusz Romanowski <romanowski.mate...@gmail.com> 
> wrote:
> 
> Hi,
> I would say it is not worth any effort.
> 
> One can easily write an ad-hoc *local* adapter extending 
> `java.util.AbstractList`..
> .. and immediately invoke existing `java.util.Collections::binarySearch` 
> method.
> 
> Cheers,
> MR
> 
> On Thu, Apr 25, 2024 at 9:09 PM Pavel Rappo <pavel.ra...@oracle.com> wrote:
> 
> 
> > On 25 Apr 2024, at 19:41, David Lloyd <david.ll...@redhat.com> wrote:
> > 
> > The JDK contains a few collection- and array-oriented implementations of 
> > binary search. For the most part, they all work the same way: search the 
> > target "thing" by index using the obvious bisection strategy, returning 
> > either /index/ or /-index-1/ depending on whether it was found at the end 
> > of the search.
> > 
> > However, if you're doing a binary search on a thing that is not a list or 
> > an array, you have two choices: try to make the thing you're searching on 
> > implement List (often infeasible) or write your own binary search.
> > 
> > I'm really tired of writing my own binary search. I've probably done it 50 
> > times by now, each one using a slightly different access and/or compare 
> > strategy, and every time is a new chance to typo something or get something 
> > backwards, leading to irritating debugging sessions and higher dentist 
> > bills.
> 
> Can we safely say that it sets your teeth on edge?
> 
> > It got me to thinking that it wouldn't be too hard to make a "general" 
> > binary search which can search on anything, so that's what I did. I was 
> > thinking that it might be interesting to try adding this into the JDK 
> > somehow.
> > 
> > This implementation is more or less what I now copy & paste to different 
> > projects at the moment:
> > 
> >     public static <C, T> int binarySearch(C collection, int start, int end, 
> > T key, Comparator<? super T> cmp, IntGetter<C, T> getter) {
> >         int low = start;
> >         int high = end - 1;
> >         while (low <= high) {
> >             int mid = low + high >>> 1;
> >             int res = cmp.compare(getter.get(collection, mid), key);
> >             if (res < 0) {
> >                 low = mid + 1;
> >             } else if (res > 0) {
> >                 high = mid - 1;
> >             } else {
> >                 return mid;
> >             }
> >         }
> >         return -low - 1;
> >     }
> >     // (Plus variations for `Comparable` keys and long indices)
> > 
> > A key problem with this approach is that in the JDK, there is no 
> > `ObjIntFunction<T, R>` or `ObjLongFunction<T, R>` which would be necessary 
> > to implement the "getter" portion of the algorithm (despite the existence 
> > of the analogous `ObjIntConsumer<T>` and `ObjLongConsumer<T>`). So, I also 
> > have to copy/paste a `IntGetter<T, R>`/`LongGetter<T, R>` as well, which is 
> > annoying.
> > 
> > A slightly less-good approach is for the general `binarySearch` method to 
> > accept a `IntFunction<T>`/`LongFunction<T>`, and drop the `C collection` 
> > parameter, like this:
> > 
> >     public static <T> int binarySearch(int start, int end, T key, 
> > Comparator<? super T> cmp, IntFunction<T> getter) { ... }
> > 
> > In this case, we can use the function types that the JDK already provides, 
> > but we would very likely have to also create a capturing lambda (e.g. 
> > `myList::get` instead of `List::get`). Maybe this isn't that bad of a 
> > compromise.
> > 
> > It would be possible to replace the existing `binarySearch` implementations 
> > with delegating calls to a generalized implementation. For `Collections`, 
> > the indexed version uses `List::get` and the iterator version uses a helper 
> > lambda to move the iterator and get the result. For arrays, a lambda would 
> > be provided which gets the corresponding array element. If the 
> > non-capturing variant is used, then (on paper at least) this version should 
> > perform similarly to the existing implementations, with less code needed 
> > overall. However, if a capturing lambda is required (due to the 
> > aforementioned lack of `ObjXFunction`), then this could be slightly 
> > worse-performing than the existing implementation due to the construction 
> > (and maybe dispatch) overhead of the lambda. Some real-world benchmarks 
> > would have to be written with various-sized data sets.
> > 
> > It would also be possible to produce primitive variations which operate on 
> > int, float, long, and double values, using existing functions if capturing 
> > is deemed "OK". It is also possible to produce a variation which uses a 
> > `long` for the index, for huge data sets (for example, very large mapped 
> > files using `MemorySegment`).
> > 
> > Also unclear is: where would it live? `Collections`? Somewhere else?
> > 
> > Any thoughts/opinions would be appreciated (even if they are along the 
> > lines of "it's not worth the effort"). Particularly, any insight would be 
> > appreciated as to whether or not this kind of hypothetical enhancement 
> > would warrant a JEP (I wouldn't think so, but I'm no expert at such 
> > assessments).
> > 
> > --
> > - DML • he/him
> 
> Have a look at this recently filed issue: 
> https://bugs.openjdk.org/browse/JDK-8326330
> 
> -Pavel
> 
> 

Reply via email to