On 1/12/13 12:37 AM, Ulf Zibis wrote:
On 11.01.2013 23:53, Christian Thalinger wrote:
But you guys noticed that sentence in the initial review request, right?

"Move encoding loop into separate method for which VM will use
intrinsic on x86."

Just wanted to make sure ;-)

Good question Christian!

This is how it looks to me:
1) The bug synopsis is unspecific about the intrinsic, so ...
2) the mentioned first sentence could be one of many solutions.
3) bugs.sun.com/bugdatabase/view_bug.do?bug_id=6896617 ==> This bug is
not available.

I opened it; it should show up in a few days.

4) What specific operation should be done by the intrinsic, i.e. is
there a fixed API for that method ???

When C2 (the server JIT compiler in the JVM) compiles the encode methods, it will replace the new method encodeArray() (matched by signature) with hand-optimized assembler code that uses the latest processor instructions. I will send the Hotspot changes soon. So it has nothing to do with the interpreter or the bytecode sequence.
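
For readers without access to the webrev, here is a minimal sketch of the kind of loop method C2 could match by signature. The class name, the parameter names and the single-byte (c <= 0xFF) mapping are assumptions for illustration only; the real method is the one in the webrev.

```java
public final class EncodeArraySketch {

    // Hypothetical stand-in for the encodeArray() discussed in this thread.
    // Copies chars to bytes until a char does not fit into one byte and
    // returns the number of chars that were copied.
    public static int encodeArray(char[] sa, int sp, byte[] da, int dp, int len) {
        int i = 0;
        for (; i < len; i++) {
            char c = sa[sp + i];
            if (c > 0xFF) {
                break; // not encodable in one byte; the caller deals with it
            }
            da[dp + i] = (byte) c;
        }
        return i;
    }

    public static void main(String[] args) {
        char[] src = {'a', 'b', (char) 0x20AC, 'c'};
        byte[] dst = new byte[src.length];
        int n = encodeArray(src, 0, dst, 0, src.length);
        System.out.println("encoded " + n + " chars"); // prints "encoded 2 chars"
    }
}
```

A static method that takes only arrays and ints and returns one int keeps the contract simple: the stub reads and writes the arrays and hands back a count.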

5) Can an intrinsic write back more than 1 value (see my hack via int[]
p) ?
6) Vladimir's webrev shows an integer as return type for that method,
I've added a variant with boolean return type, and the code from my last
approach could be transformed to a method with Object return type.

Here is the latest webrev; I added caching of the arrayOffset() call results:

http://cr.openjdk.java.net/~kvn/6896617_jdk/webrev.01

I tested it with the java nio regression/verification tests. I am done with the Java part and will not accept any more changes unless someone finds a bug in it.
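
As a rough illustration of the two points above (caching arrayOffset() and an int return type), the caller could look something like the sketch below. It is not the code from the webrev; the class name, the boolean result and the inline single-byte loop are assumptions made to keep the example self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

public final class ArrayLoopSketch {

    // Hypothetical loop body with the same shape as encodeArray():
    // returns how many chars were copied before hitting one above 0xFF.
    static int encodeArray(char[] sa, int sp, byte[] da, int dp, int len) {
        int i = 0;
        while (i < len && sa[sp + i] <= 0xFF) {
            da[dp + i] = (byte) sa[sp + i];
            i++;
        }
        return i;
    }

    // Works on array-backed buffers only. Returns true if the whole
    // source was consumed.
    public static boolean encodeArrayLoop(CharBuffer src, ByteBuffer dst) {
        char[] sa = src.array();
        int soff = src.arrayOffset();      // read once, reused below
        int sp = soff + src.position();
        int sl = soff + src.limit();

        byte[] da = dst.array();
        int doff = dst.arrayOffset();      // read once, reused below
        int dp = doff + dst.position();
        int dl = doff + dst.limit();

        int len = Math.min(sl - sp, dl - dp);
        int n = encodeArray(sa, sp, da, dp, len);

        // One returned count is enough to advance both positions.
        sp += n;
        dp += n;
        src.position(sp - soff);
        dst.position(dp - doff);
        return sp == sl;
    }
}
```

With this shape the method itself does not need to write back more than one value, because source and destination always advance by the same amount.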


... so waiting for Vladimir's feedback :-[
(especially on performance/hsdis results)

Performance on x86 was tested with the following code (the whole test will be in the Hotspot changes):

        ba = CharBuffer.wrap(a);
        bb = ByteBuffer.wrap(b);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) {
            ba.clear(); bb.clear();
            enc_res = enc_res && enc.encode(ba, bb, true).isUnderflow();
        }
        long end = System.currentTimeMillis();

1 - current java code
2 - new encodeArray() with loop but without intrinsic (JIT compiled code)
3 - using assembler intrinsic for encodeArray() on cpu without SSE4.2
4 - using assembler intrinsic on cpu with SSE4.2
5 - using assembler intrinsic on cpu with AVX2

variant:            1      2      3     4     5
size:    1 time:   40     34     28    28    28
size:    7 time:   47     40     33    33    34
size:    8 time:   51     41     33    28    29
size:   16 time:   58     45     37    29    29
size:   32 time:   72     56     44    30    29
size:   64 time:  103     71     62    32    31
size:  128 time:  160    105     89    36    33
size:  256 time:  284    178    141    42    37
size:  512 time:  514    317    246    61    50
size: 1024 time:  987    599    458    89    68
size: 2048 time: 1930   1150    853   145   114
size: 4096 time: 3820   2283   1645   264   207
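
Until the full test is posted with the Hotspot changes, a self-contained version of the loop above might look like the sketch below. The ISO-8859-1 encoder, the fill character, the class name and the list of sizes in main() are assumptions; the snippet above does not show how a, b and enc are set up.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public final class EncodeBench {

    // Runs the timed loop from the mail for one array size and
    // returns the elapsed wall-clock time in milliseconds.
    public static long run(int size, int iterations) {
        char[] a = new char[size];
        byte[] b = new byte[size];
        Arrays.fill(a, 'x');               // assumed input: plain ASCII

        // Assumed charset; the snippet in the mail does not name one.
        CharsetEncoder enc = StandardCharsets.ISO_8859_1.newEncoder();
        CharBuffer ba = CharBuffer.wrap(a);
        ByteBuffer bb = ByteBuffer.wrap(b);

        boolean encRes = true;
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            ba.clear();
            bb.clear();
            encRes = encRes && enc.encode(ba, bb, true).isUnderflow();
        }
        long end = System.currentTimeMillis();

        if (!encRes) {
            throw new IllegalStateException("encode() did not end in underflow");
        }
        return end - start;
    }

    public static void main(String[] args) {
        int[] sizes = {1, 7, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096};
        for (int size : sizes) {
            System.out.println("size: " + size + " time: " + run(size, 1000000));
        }
    }
}
```

This sketch has no separate warm-up phase, so the first sizes include JIT compilation time, and the numbers it prints are not comparable to the table above.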


Thanks,
Vladimir


(Can someone push the bug to the public?)

-Ulf
