Re: Unsafe: efficiently comparing two byte arrays

2014-02-27 Thread Paul Sandoz

On Feb 26, 2014, at 6:21 PM, Martin Buchholz  wrote:

> Every once in  a while I see an attempt to introduce a "general purpose" 
> unsafe array class, but efforts stall due to:

Yes, in general this is "arrays 2.0" stuff, which is tricky especially since on 
and off-heap seem to have conflicting requirements (the latter might require 
some form of explicit layout where as for the former it would be better to let 
hotspot work out what to do with better GC mechanisms).


Mostly in the context of just comparing byte arrays:

> - it's not obvious whether unaligned access to e.g. 8 bytes via long is even 
> possible or more efficient than just reading 8 bytes
> - endianness is an issue

In Guava's implementation there is an optimization for big endian natural byte 
order and a clever trick otherwise:

for (int i = 0; i < minWords * Longs.BYTES; i += Longs.BYTES) {
  long lw = theUnsafe.getLong(left, BYTE_ARRAY_BASE_OFFSET + (long) i);
  long rw = theUnsafe.getLong(right, BYTE_ARRAY_BASE_OFFSET + (long) i);
  if (lw != rw) {
if (BIG_ENDIAN) {
  return UnsignedLongs.compare(lw, rw);
}

/*
 * We want to compare only the first index where left[index] != 
right[index].
 * This corresponds to the least significant nonzero byte in lw ^ 
rw, since lw
 * and rw are little-endian.  Long.numberOfTrailingZeros(diff) 
tells us the least
 * significant nonzero bit, and zeroing out the first three bits of 
L.nTZ gives us the
 * shift to get that least significant nonzero byte.
 */
int n = Long.numberOfTrailingZeros(lw ^ rw) & ~0x7;
return (int) (((lw >>> n) & UNSIGNED_MASK) - ((rw >>> n) & 
UNSIGNED_MASK));
  }
}

> - for best performance, you also want to elide those pesky array bound checks 
> (but hotspot can do a better job of that)
> 

No such checks are provided by Unsafe :-)

For loops a check can often be hoisted out (i think hotspot already can do 
that). Recently there have been some optimisations implemented for "(i & 
(a.length - 1))" patterns (e.g. see HashMap). But there are always gonna be 
cases where it cannot fully remove such checks and people reach for Unsafe to 
save some cycles.


> I think it's worth doing, but it's harder than it looks, and probably needs 
> help from the VM via intrinsics.
> 

> ---
> 
> lexicographicalComparator saves a lot of cycles.
> 

Yes, in this context we can provide such methods and over time they could be 
modified to benefit from any future improvements.

Just trying to whittle down small use-cases as to why Unsafe is currently used 
today.

Paul.


Re: Unsafe: efficiently comparing two byte arrays

2014-02-27 Thread Florian Weimer

On 02/27/2014 11:37 AM, Paul Sandoz wrote:


 /*
  * We want to compare only the first index where left[index] != 
right[index].
  * This corresponds to the least significant nonzero byte in lw ^ 
rw, since lw
  * and rw are little-endian.  Long.numberOfTrailingZeros(diff) 
tells us the least
  * significant nonzero bit, and zeroing out the first three bits 
of L.nTZ gives us the
  * shift to get that least significant nonzero byte.
  */
 int n = Long.numberOfTrailingZeros(lw ^ rw) & ~0x7;
 return (int) (((lw >>> n) & UNSIGNED_MASK) - ((rw >>> n) & 
UNSIGNED_MASK));


This really depends on the microarchitecture, but for many current CPUs, 
conversion to big endian with Long::reverseBytes(long) will be faster 
(assuming that it's intrinsified).


This is not just a performance optimization, it can also be used to plug 
an alleged timing oracle:




--
Florian Weimer / Red Hat Product Security Team


Re: Unsafe: efficiently comparing two byte arrays

2014-02-27 Thread Andrew Haley
Hi,

On 02/26/2014 03:42 PM, Paul Sandoz wrote:

> A common reason why Unsafe is used is to more efficiently compare two 
> unsigned byte arrays, viewing those byte arrays as long arrays.

Why not simply provide ByteBuffer.forArray(byte[]) ?  Then we'd have all the
ByteBuffer intrinsics for put() and get() operations.

Andrew.


Re: Unsafe: efficiently comparing two byte arrays

2014-02-27 Thread Paul Sandoz

On Feb 27, 2014, at 3:52 PM, Andrew Haley  wrote:

> Hi,
> 
> On 02/26/2014 03:42 PM, Paul Sandoz wrote:
> 
>> A common reason why Unsafe is used is to more efficiently compare two 
>> unsigned byte arrays, viewing those byte arrays as long arrays.
> 
> Why not simply provide ByteBuffer.forArray(byte[]) ?  Then we'd have all the
> ByteBuffer intrinsics for put() and get() operations.
> 

Good point, even if it does require the creation of two ByteBuffer instances.

An implementation of Arrays.compare could do that if things were suitably 
intrinsified, which i am not sure is the case as of today.

I will do some measurements and report back.

Paul.



Re: Unsafe: efficiently comparing two byte arrays

2014-02-27 Thread Andrew Haley
On 02/27/2014 03:05 PM, Paul Sandoz wrote:
> 
> On Feb 27, 2014, at 3:52 PM, Andrew Haley  wrote:
> 
>> Hi,
>> 
>> On 02/26/2014 03:42 PM, Paul Sandoz wrote:
>> 
>>> A common reason why Unsafe is used is to more efficiently compare two 
>>> unsigned byte arrays, viewing those byte arrays as long arrays.
>> 
>> Why not simply provide ByteBuffer.forArray(byte[]) ?  Then we'd have all the 
>> ByteBuffer intrinsics for put() and get() operations.
> 
> Good point, even if it does require the creation of two ByteBuffer instances.
> 
> An implementation of Arrays.compare could do that if things were suitably 
> intrinsified, which i am not sure is the case as of today.
> 
> I will do some measurements and report back.

Unless something has been done in HotSpot pretty recently, the
ByteBuffer intrinsics are not as performat as they might be, so the
results might be a little underwhelming.

Andrew.


Re: Unsafe: efficiently comparing two byte arrays

2014-02-27 Thread Paul Sandoz

On Feb 27, 2014, at 4:12 PM, Andrew Haley  wrote:

> On 02/27/2014 03:05 PM, Paul Sandoz wrote:
>> 
>> On Feb 27, 2014, at 3:52 PM, Andrew Haley  wrote:
>> 
>>> Hi,
>>> 
>>> On 02/26/2014 03:42 PM, Paul Sandoz wrote:
>>> 
 A common reason why Unsafe is used is to more efficiently compare two 
 unsigned byte arrays, viewing those byte arrays as long arrays.
>>> 
>>> Why not simply provide ByteBuffer.forArray(byte[]) ?  Then we'd have all 
>>> the ByteBuffer intrinsics for put() and get() operations.
>> 
>> Good point, even if it does require the creation of two ByteBuffer instances.
>> 
>> An implementation of Arrays.compare could do that if things were suitably 
>> intrinsified, which i am not sure is the case as of today.
>> 
>> I will do some measurements and report back.
> 
> Unless something has been done in HotSpot pretty recently, the
> ByteBuffer intrinsics are not as performat as they might be, so the
> results might be a little underwhelming.
> 

Yes.

For this benchmark:

  http://cr.openjdk.java.net/~psandoz/unsafe/ByteArrayCompareTest.java

Here are results for comparing two byte arrays of size N containing the same 
content, on my mac book:

  http://cr.openjdk.java.net/~psandoz/unsafe/byte-array-compare-results.txt

The unsafe test is about ~3/4 x faster than the safe test. And the buffer test 
is slower than the safe test.

I also included a variant of the unsafe test that uses Long.reverseBytes; the 
results are similar to the unsafe test. I believe reverseBytes is intrinsified 
as is numberOfTrailingZeros (i tried to disable it with 
-XX:DisableIntrinsic=_reverseBytes_l but it did not make any great different to 
the results, which makes me think i configured the option incorrectly).

Paul.


Re: Unsafe: efficiently comparing two byte arrays

2014-02-28 Thread Aleksey Shipilev
On 03/01/2014 02:29 AM, Martin Buchholz wrote:
> Are word-size reads in ByteBuffer actually intrinsified?  I could find no
> evidence for that.  Perhaps this is an important missing optimization for
> hotspot to make?

It is a missing optimization, and we will get to that:
 https://bugs.openjdk.java.net/browse/JDK-8026049

-Aleksey.



Re: Unsafe: efficiently comparing two byte arrays

2014-02-28 Thread Andrew Haley
On 02/28/2014 10:29 PM, Martin Buchholz wrote:
> Are word-size reads in ByteBuffer actually intrinsified?

No.

Andrew.



Re: Unsafe: efficiently comparing two byte arrays

2014-03-01 Thread Ulf Zibis

Am 28.02.2014 23:29, schrieb Martin Buchholz:

Are word-size reads in ByteBuffer actually intrinsified?  I could find no
evidence for that.  Perhaps this is an important missing optimization for
hotspot to make?

I see DirectByteBuffer, but not garden-variety ByteBuffer, using Unsafe.
share/classes/java/nio/Direct-X-Buffer.java.template

Probably should be investigated using disassembly of jitted code, which I
have never done.


See also:
https://bugs.openjdk.java.net/browse/JDK-6914113

-Ulf


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Paul Sandoz

On Feb 28, 2014, at 11:29 PM, Martin Buchholz  wrote:

> Are word-size reads in ByteBuffer actually intrinsified?  I could find no 
> evidence for that.  Perhaps this is an important missing optimization for 
> hotspot to make?
> 
> I see DirectByteBuffer, but not garden-variety ByteBuffer, using Unsafe.
> share/classes/java/nio/Direct-X-Buffer.java.template
> 

Yes, no choice :-) I see it only does it when unaligned access is supported:

private long getLong(long a) {
if (unaligned) {
long x = unsafe.getLong(a);
return (nativeByteOrder ? x : Bits.swap(x));
}
return Bits.getLong(a, bigEndian);
}

I suppose it could detect if the address + position also aligns, but ideally 
that should be hoisted out of any loop so the unaligned pre/post bytes are 
handled separately to the aligned sequence viewed as longs, which most likely 
means explicit code for the comparison use-case.

Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
Arrays.compareUnsigned.

An explosion of methods on Arrays for all types (plus to/from versions) would 
be annoying.

Paul.


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Andrew Haley
On 03/13/2014 11:57 AM, Paul Sandoz wrote:
> Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
> Arrays.compareUnsigned.
> 
> An explosion of methods on Arrays for all types (plus to/from versions) would 
> be annoying.

Surely it's a no brainer?  All we have to do is add the intrinsics for
ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
I don't see the problem.

Andrew.



Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Paul Sandoz
On Mar 13, 2014, at 1:19 PM, Andrew Haley  wrote:
> On 03/13/2014 11:57 AM, Paul Sandoz wrote:
>> Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
>> Arrays.compareUnsigned.
>> 
>> An explosion of methods on Arrays for all types (plus to/from versions) 
>> would be annoying.
> 
> Surely it's a no brainer?  All we have to do is add the intrinsics for
> ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
> I don't see the problem.
> 

For byte[] comparison, I don't think optimal long access is sufficient on it's 
own due to alignment issues. It would be nice if we can avoid burdening users 
with alignment issues to ensure the main comparison loop is efficient.

A quick solution is to leverage Unsafe within a ByteBuffer.compareUnsigned 
method. In fact i believe Unsafe could be used (as Aleksey says in [1]) for 
put/get as well, which i presume is likely to be much easier/quicker to 
implement. Might be a good first step until a more superior intrinsics solution 
is implemented?

Paul.

[1] https://bugs.openjdk.java.net/browse/JDK-8026049


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Andrew Haley
On 03/13/2014 12:49 PM, Paul Sandoz wrote:
> On Mar 13, 2014, at 1:19 PM, Andrew Haley  wrote:
>> On 03/13/2014 11:57 AM, Paul Sandoz wrote:
>>> Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
>>> Arrays.compareUnsigned.
>>>
>>> An explosion of methods on Arrays for all types (plus to/from versions) 
>>> would be annoying.
>>
>> Surely it's a no brainer?  All we have to do is add the intrinsics for
>> ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
>> I don't see the problem.
> 
> For byte[] comparison, I don't think optimal long access is
> sufficient on it's own due to alignment issues. It would be nice if
> we can avoid burdening users with alignment issues to ensure the
> main comparison loop is efficient.

But alignment issues are a feature of the hardware, not of software.
The problem right now is that ByteBuffers use a byte-by-byte
comparison of longs, even when it isn't needed, because there aren't
the intrinsics.

> A quick solution is to leverage Unsafe within a
> ByteBuffer.compareUnsigned method. In fact i believe Unsafe could be
> used (as Aleksey says in [1]) for put/get as well, which i presume
> is likely to be much easier/quicker to implement. Might be a good
> first step until a more superior intrinsics solution is implemented?

I still don't get it, sorry.  What can Unsafe do that ByteBuffer
intrinsics can't do?

Andrew.


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread David M. Lloyd

On 03/13/2014 07:19 AM, Andrew Haley wrote:

On 03/13/2014 11:57 AM, Paul Sandoz wrote:

Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
Arrays.compareUnsigned.

An explosion of methods on Arrays for all types (plus to/from versions) would 
be annoying.


Surely it's a no brainer?  All we have to do is add the intrinsics for
ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
I don't see the problem.


Why such a hangup on ByteBuffers and Arrays?  I think it'd be super 
useful to simply have methods on Long, Integer, etc:


public static void getBytesBE(long orig, byte[] dst, int off)
public static void getBytesLE(long orig, byte[] dst, int off)

public static int compareBE(long orig, byte[] other, int off)
public static int compareLE(long orig, byte[] other, int off)

public static long getLongFromBytesBE(byte[] src, int off)
public static long getLongFromBytesLE(byte[] src, int off)

...

It makes sense as there are already some byte- and bit-ish methods on 
there (reversing byte order for example, or finding population count, 
etc.).  If these were intrinsic, ByteBuffer could use these methods (as 
could Arrays.fill(...) and friends, if one was so inclined).  Then 
Data*Stream could use them as well perhaps.


--
- DML


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Paul Sandoz

On Mar 13, 2014, at 1:57 PM, Andrew Haley  wrote:

> On 03/13/2014 12:49 PM, Paul Sandoz wrote:
>> On Mar 13, 2014, at 1:19 PM, Andrew Haley  wrote:
>>> On 03/13/2014 11:57 AM, Paul Sandoz wrote:
 Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
 Arrays.compareUnsigned.
 
 An explosion of methods on Arrays for all types (plus to/from versions) 
 would be annoying.
>>> 
>>> Surely it's a no brainer?  All we have to do is add the intrinsics for
>>> ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
>>> I don't see the problem.
>> 
>> For byte[] comparison, I don't think optimal long access is
>> sufficient on it's own due to alignment issues. It would be nice if
>> we can avoid burdening users with alignment issues to ensure the
>> main comparison loop is efficient.
> 
> But alignment issues are a feature of the hardware, not of software.
> The problem right now is that ByteBuffers use a byte-by-byte
> comparison of longs, even when it isn't needed, because there aren't
> the intrinsics.
> 

I was considering the following case:

byte[] b1 = ...
byte[] b2 = ...
// compare from offset
ByteBuffer left = ByteBuffer.wrap(b1, 1, b1.length - 1);
ByteBuffer right = ByteBuffer.wrap(b2, 1, b2.length - 1);
int r = compare(left, right);

static int compare(ByteBuffer left, ByteBuffer right) {
int minLength = Math.min(left.remaining(), right.remaining());
int minWords = minLength / 8;

for (int i = 0; i < minWords * 8; i += 8) {
long lw = left.getLong(); // Inefficient when unaligned access is 
not supported by hardware?
long rw = right.getLong(); // Are bounds checks hoisted out? i 
presume so
if (lw != rw) {
return Long.compareUnsigned(lw, rw);
}
}

for (int i = minWords * 8; i < minLength; i++) {
int result = Byte.toUnsignedInt(left.get()) - 
Byte.toUnsignedInt(right.get());
if (result != 0) {
return result;
}
}

return left.remaining() - right.remaining();
}

Certain Apache code (e.g. see Cassandra) supports comparing byte[] arrays with 
offsets and i am not sure that code is portable [1].

I was wondering in the case where the hardware does not support unaligned 
access it might be possible to sill optimize by reading longs at aligned 
positions and doing some some extra bit twiddling.


>> A quick solution is to leverage Unsafe within a
>> ByteBuffer.compareUnsigned method. In fact i believe Unsafe could be
>> used (as Aleksey says in [1]) for put/get as well, which i presume
>> is likely to be much easier/quicker to implement. Might be a good
>> first step until a more superior intrinsics solution is implemented?
> 
> I still don't get it, sorry.  What can Unsafe do that ByteBuffer
> intrinsics can't do?
> 

We can arguably implement it faster [2] and i presume it will be simpler too.

Paul.

[1] 
https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob;f=src/java/org/apache/cassandra/utils/FastByteComparisons.java;h=4be6cd42b3035055c2b6102718d2cf49b73669f1;hb=HEAD#l184

[2] Well, i argue it is since i know next to nothing about how to implement an 
intrinsic :-)


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Andrew Haley
On 03/13/2014 02:19 PM, Paul Sandoz wrote:
> 
> On Mar 13, 2014, at 1:57 PM, Andrew Haley  wrote:
> 
>> On 03/13/2014 12:49 PM, Paul Sandoz wrote:
>>> On Mar 13, 2014, at 1:19 PM, Andrew Haley  wrote:
 On 03/13/2014 11:57 AM, Paul Sandoz wrote:
> Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
> Arrays.compareUnsigned.
>
> An explosion of methods on Arrays for all types (plus to/from versions) 
> would be annoying.

 Surely it's a no brainer?  All we have to do is add the intrinsics for
 ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
 I don't see the problem.
>>>
>>> For byte[] comparison, I don't think optimal long access is
>>> sufficient on it's own due to alignment issues. It would be nice if
>>> we can avoid burdening users with alignment issues to ensure the
>>> main comparison loop is efficient.
>>
>> But alignment issues are a feature of the hardware, not of software.
>> The problem right now is that ByteBuffers use a byte-by-byte
>> comparison of longs, even when it isn't needed, because there aren't
>> the intrinsics.
> 
> I was considering the following case:
> 
> byte[] b1 = ...
> byte[] b2 = ...
> // compare from offset
> ByteBuffer left = ByteBuffer.wrap(b1, 1, b1.length - 1);
> ByteBuffer right = ByteBuffer.wrap(b2, 1, b2.length - 1);
> int r = compare(left, right);
> 
> static int compare(ByteBuffer left, ByteBuffer right) {
> int minLength = Math.min(left.remaining(), right.remaining());
> int minWords = minLength / 8;
> 
> for (int i = 0; i < minWords * 8; i += 8) {
> long lw = left.getLong(); // Inefficient when unaligned access is 
> not supported by hardware?

Possibly, but HotSpot can do an awful lot with specialization when
offsets are known.

> long rw = right.getLong(); // Are bounds checks hoisted out? i 
> presume so

Yes.

> if (lw != rw) {
> return Long.compareUnsigned(lw, rw);
> }
> }
> 
> for (int i = minWords * 8; i < minLength; i++) {
> int result = Byte.toUnsignedInt(left.get()) - 
> Byte.toUnsignedInt(right.get());
> if (result != 0) {
> return result;
> }
> }
> 
> return left.remaining() - right.remaining();
> }
> 
> Certain Apache code (e.g. see Cassandra) supports comparing byte[]
> arrays with offsets and i am not sure that code is portable [1].
> 
> I was wondering in the case where the hardware does not support
> unaligned access it might be possible to sill optimize by reading
> longs at aligned positions and doing some some extra bit twiddling.

Probably, but given that the offsets of two strings don't have to be
the same, it can get very messy.

>>> A quick solution is to leverage Unsafe within a
>>> ByteBuffer.compareUnsigned method. In fact i believe Unsafe could be
>>> used (as Aleksey says in [1]) for put/get as well, which i presume
>>> is likely to be much easier/quicker to implement. Might be a good
>>> first step until a more superior intrinsics solution is implemented?
>>
>> I still don't get it, sorry.  What can Unsafe do that ByteBuffer
>> intrinsics can't do?
> 
> We can arguably implement it faster [2]

I doubt that very much.  In fact, I would say that it is almost
certainly not true.  HotSpot usually knows, for example, when both
offsets are zero and can generate better code for machines with strict
alignment.

And also, Unsafe has the same speed problems with unaligned data
that an intrinsic would have.

> and i presume it will be simpler too.

And that.

But the key point is this: we want intrinsics for ByteBuffers anyway.

Andrew.


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Andrew Haley
On 03/13/2014 02:26 PM, David M. Lloyd wrote:
> On 03/13/2014 07:19 AM, Andrew Haley wrote:
>> On 03/13/2014 11:57 AM, Paul Sandoz wrote:
>>> Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add 
>>> Arrays.compareUnsigned.
>>>
>>> An explosion of methods on Arrays for all types (plus to/from versions) 
>>> would be annoying.
>>
>> Surely it's a no brainer?  All we have to do is add the intrinsics for
>> ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
>> I don't see the problem.
> 
> Why such a hangup on ByteBuffers and Arrays?  I think it'd be super 
> useful to simply have methods on Long, Integer, etc:
> 
> public static void getBytesBE(long orig, byte[] dst, int off)
> public static void getBytesLE(long orig, byte[] dst, int off)
> 
> public static int compareBE(long orig, byte[] other, int off)
> public static int compareLE(long orig, byte[] other, int off)
> 
> public static long getLongFromBytesBE(byte[] src, int off)
> public static long getLongFromBytesLE(byte[] src, int off)

I take your point, but this is a whole lot of new library work, both
in implementation and specification, whereas we already have this in
ByteBuffers, albeit with a suboptimal implementation.

Andrew.




Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Paul Sandoz
On Mar 13, 2014, at 5:26 PM, Andrew Haley  wrote:
> 
 A quick solution is to leverage Unsafe within a
 ByteBuffer.compareUnsigned method. In fact i believe Unsafe could be
 used (as Aleksey says in [1]) for put/get as well, which i presume
 is likely to be much easier/quicker to implement. Might be a good
 first step until a more superior intrinsics solution is implemented?
>>> 
>>> I still don't get it, sorry.  What can Unsafe do that ByteBuffer
>>> intrinsics can't do?
>> 
>> We can arguably implement it faster [2]
> 
> I doubt that very much.  In fact, I would say that it is almost
> certainly not true.  HotSpot usually knows, for example, when both
> offsets are zero and can generate better code for machines with strict
> alignment.
> 
> And also, Unsafe has the same speed problems with unaligned data
> that an intrinsic would have.
> 

I meant i can *implement* a comparison solution using Unsafe faster than i can 
implement a more general solution using intrinsics. 

I presume intrinsics might be of great advantage if SIMD instructions can be 
leveraged? if not would it be reasonable to assume that for common 0 offset 
case the two solutions might be similar in terms of performance?


>> and i presume it will be simpler too.
> 
> And that.
> 
> But the key point is this: we want intrinsics for ByteBuffers anyway.
> 

Yeah, but i think that could take longer and there is also the Arrays 2.0 work 
to consider, the scope is considerably larger than providing an efficient 
comparison method for byte[] arrays. 

It is possible to tackle this simple use-case with an API tweak 
(ByteBuffer.compareUnsigned) and revisit the implementation if/when instincts 
for arrays get into 9. That is a small but useful step in the right direction.

Paul.


Re: Unsafe: efficiently comparing two byte arrays

2014-03-13 Thread Andrew Haley
On 03/13/2014 05:02 PM, Paul Sandoz wrote:
> On Mar 13, 2014, at 5:26 PM, Andrew Haley  wrote:
>>
> A quick solution is to leverage Unsafe within a
> ByteBuffer.compareUnsigned method. In fact i believe Unsafe could be
> used (as Aleksey says in [1]) for put/get as well, which i presume
> is likely to be much easier/quicker to implement. Might be a good
> first step until a more superior intrinsics solution is implemented?

 I still don't get it, sorry.  What can Unsafe do that ByteBuffer
 intrinsics can't do?
>>>
>>> We can arguably implement it faster [2]
>>
>> I doubt that very much.  In fact, I would say that it is almost
>> certainly not true.  HotSpot usually knows, for example, when both
>> offsets are zero and can generate better code for machines with strict
>> alignment.
>>
>> And also, Unsafe has the same speed problems with unaligned data
>> that an intrinsic would have.
> 
> I meant i can *implement* a comparison solution using Unsafe faster
> than i can implement a more general solution using intrinsics.

I see.  Yes, what you said is clear, now I read it again.  :-)

> I presume intrinsics might be of great advantage if SIMD
> instructions can be leveraged? if not would it be reasonable to
> assume that for common 0 offset case the two solutions might be
> similar in terms of performance?

That's hard to say for sure.  It might well be so, but I suspect
you'll hit memor bandwidth limits in both cases.

> Yeah, but i think that could take longer and there is also the
> Arrays 2.0 work to consider, the scope is considerably larger than
> providing an efficient comparison method for byte[] arrays.

OK, so I think I understand now: you can do what you need without
having to drag in HotSpot engineers.  But Unsafe only works really
efficiently when it is intrinsified.

> It is possible to tackle this simple use-case with an API tweak
> (ByteBuffer.compareUnsigned) and revisit the implementation if/when
> instincts for arrays get into 9. That is a small but useful step in
> the right direction.

Ok, fair point.

Andrew.