[ 
https://issues.apache.org/jira/browse/LUCENE-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182276#comment-16182276
 ] 

Uwe Schindler commented on LUCENE-7966:
---------------------------------------

Hi [~mikemccand],
I did some more testing and did not figure out a bug in the MR jar that could 
explain this. I did the following:

I created a simple JAVA class file:

{code:java}
import org.apache.lucene.util.BytesRef;

public class Test {
  public static void main(String... args) {
    BytesRef r1 = new BytesRef(new byte[20], 0, 30);
    BytesRef r2 = new BytesRef(new byte[20], 1, 10);
    r1.compareTo(r2);
  }
}
{code}

Of course this code is buggy, but that was the idea here. If I run this with 
Java 8, I get following output:

{noformat}
$ java -cp lucene-core-8.0.0-SNAPSHOT.jar;. Test
Exception in thread "main" java.lang.IndexOutOfBoundsException: Range [0, 30) 
out-of-bounds for length 20
        at 
org.apache.lucene.future.FutureArrays.checkFromToIndex(FutureArrays.java:35)
        at 
org.apache.lucene.future.FutureArrays.compareUnsigned(FutureArrays.java:62)
        at org.apache.lucene.util.BytesRef.compareTo(BytesRef.java:165)
        at Test.main(Test.java:7)
{noformat}

If I run the same with Java 9, I get the following output:

{noformat}
$ java -cp lucene-core-8.0.0-SNAPSHOT.jar;. Test
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Array 
index out of range: 30
        at java.base/java.util.Arrays.rangeCheck(Arrays.java:122)
        at java.base/java.util.Arrays.compareUnsigned(Arrays.java:6101)
        at org.apache.lucene.util.BytesRef.compareTo(BytesRef.java:165)
        at Test.main(Test.java:7)
{noformat}

As we see, BytesRef class is using Java9's native java.util.Arrays class 
instead our own FutureArrays.

You can also verify this with {{javap}}:

{noformat}
$ javap --multi-release 9 -cp lucene-core-8.0.0-SNAPSHOT.jar -p -c 
org.apache.lucene.util.BytesRef  | grep Arrays
      34: invokestatic  #76                 // Method 
java/util/Arrays.equals:([BII[BII)Z
      34: invokestatic  #136                // Method 
java/util/Arrays.compareUnsigned:([BII[BII)I
      26: invokestatic  #143                // Method 
java/util/Arrays.copyOfRange:([BII)[B

$ javap --multi-release 8 -cp lucene-core-8.0.0-SNAPSHOT.jar -p -c 
org.apache.lucene.util.BytesRef  | grep Arrays
      34: invokestatic  #15                 // Method 
org/apache/lucene/future/FutureArrays.equals:([BII[BII)Z
      34: invokestatic  #29                 // Method 
org/apache/lucene/future/FutureArrays.compareUnsigned:([BII[BII)I
      26: invokestatic  #31                 // Method 
java/util/Arrays.copyOfRange:([BII)[B
{noformat}

So the setup is correct.

Can you maybe also check with the above "Test" program that the JAR file you 
are using behaves correctly?

Nevertheless and unrelated to Mike's problems seing an improvement here, we 
need to improve the whole thing:
- Our tests are currently always running against our own 
FutureArrays/FutureObjects variants, because we run tests against the file 
system based class path and not the JAR file. So Java 9 won't see our patched 
classes while running tests. I am curretly thinking about how to improve this.
- We should maybe add a smoke tester check using my above "buggy Test class" to 
figure out that our MR JAR works. We should also improve Smoker to use Java 9 
in addition to Java 8 (like we did it at Java 7 times).
- We should try check our MR JAR. Currently if you mix up FutureArrays or 
FutureObjects to have different method signatures, it would fail in Java 9. So 
we have to ensure the exported methods in our own impls match the original.

I am still waiting for [~jpountz] to see his benchmark results with the new MR 
JAR.

> build mr-jar and use some java 9 methods if available
> -----------------------------------------------------
>
>                 Key: LUCENE-7966
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7966
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: general/build
>            Reporter: Robert Muir
>         Attachments: LUCENE-7966.patch, LUCENE-7966.patch, LUCENE-7966.patch, 
> LUCENE-7966.patch, LUCENE-7966.patch
>
>
> See background: http://openjdk.java.net/jeps/238
> It would be nice to use some of the newer array methods and range checking 
> methods in java 9 for example, without waiting for lucene 10 or something. If 
> we build an MR-jar, we can start migrating our code to use java 9 methods 
> right now, it will use optimized methods from java 9 when thats available, 
> otherwise fall back to java 8 code.  
> This patch adds:
> {code}
> Objects.checkIndex(int,int)
> Objects.checkFromToIndex(int,int,int)
> Objects.checkFromIndexSize(int,int,int)
> Arrays.mismatch(byte[],int,int,byte[],int,int)
> Arrays.compareUnsigned(byte[],int,int,byte[],int,int)
> Arrays.equal(byte[],int,int,byte[],int,int)
> // did not add char/int/long/short/etc but of course its possible if needed
> {code}
> It sets these up in {{org.apache.lucene.future}} as 1-1 mappings to java 
> methods. This way, we can simply directly replace call sites with java 9 
> methods when java 9 is a minimum. Simple 1-1 mappings mean also that we only 
> have to worry about testing that our java 8 fallback methods work.
> I found that many of the current byte array methods today are willy-nilly and 
> very lenient for example, passing invalid offsets at times and relying on 
> compare methods not throwing exceptions, etc. I fixed all the instances in 
> core/codecs but have not looked at the problems with AnalyzingSuggester. Also 
> SimpleText still uses a silly method in ArrayUtil in similar crazy way, have 
> not removed that one yet.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to