Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-10 Thread Mikael Vidstedt



On 2016-02-10 02:03, Paul Sandoz wrote:

On 10 Feb 2016, at 04:42, Mikael Vidstedt  wrote:


Can I please get a quick review of these updated webrevs:

hotspot: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/hotspot/webrev/
jdk: http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/jdk/webrev/

incremental webrevs:

hotspot: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/hotspot/webrev/
jdk: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/jdk/webrev/


+1

I agree with David on the JavaDoc, but that could be followed up with any 
future changes, including potentially the removal of wrapping methods in 
Bits.java, since buffers any way use Unsafe the wrappers now appear to offer 
little value.


I'm planning on cleaning up Unsafe.java pretty significantly in a 
separate change, so I will add the relevant javadocs as part of that. 
I'll also file a separate enhancement to remove the Bits wrappers.


Thanks,
Mikael



Paul.



Changes:

* Added asserts in copy.cpp/conjoint_swap
* Correctness: Moved offset sign checks to only be performed if corresponding 
base object is null, and added corresponding tests

I'm about to make additional changes in this same area, so unless this last 
change is horribly broken I'm planning on pushing this and doing any additional 
cleanup in the upcoming change(s).

Cheers,
Mikael





Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-10 Thread Paul Sandoz

> On 10 Feb 2016, at 04:42, Mikael Vidstedt  wrote:
> 
> 
> Can I please get a quick review of these updated webrevs:
> 
> hotspot: 
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/hotspot/webrev/
> jdk: http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/jdk/webrev/
> 
> incremental webrevs:
> 
> hotspot: 
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/hotspot/webrev/
> jdk: 
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/jdk/webrev/
> 

+1

I agree with David on the JavaDoc, but that could be followed up with any 
future changes, including potentially the removal of wrapping methods in 
Bits.java, since buffers any way use Unsafe the wrappers now appear to offer 
little value.

Paul.


> Changes:
> 
> * Added asserts in copy.cpp/conjoint_swap
> * Correctness: Moved offset sign checks to only be performed if corresponding 
> base object is null, and added corresponding tests
> 
> I'm about to make additional changes in this same area, so unless this last 
> change is horribly broken I'm planning on pushing this and doing any 
> additional cleanup in the upcoming change(s).
> 
> Cheers,
> Mikael
> 


Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-09 Thread Mikael Vidstedt


Can I please get a quick review of these updated webrevs:

hotspot: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/hotspot/webrev/
jdk: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/jdk/webrev/


incremental webrevs:

hotspot: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/hotspot/webrev/
jdk: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/jdk/webrev/


Changes:

* Added asserts in copy.cpp/conjoint_swap
* Correctness: Moved offset sign checks to only be performed if 
corresponding base object is null, and added corresponding tests


I'm about to make additional changes in this same area, so unless this 
last change is horribly broken I'm planning on pushing this and doing 
any additional cleanup in the upcoming change(s).


Cheers,
Mikael


On 2016-02-07 16:14, David Holmes wrote:

On 6/02/2016 8:21 AM, Mikael Vidstedt wrote:


I fully agree that moving the arguments checking up to Java makes more
sense, and I've prepared new webrevs which do exactly that, including
changes to address the other feedback from David, John and others:


Shouldn't the lowest-level do_conjoint_swap routines at least check 
preconditions with asserts to catch the cases where the calling Java 
code has failed to do the right thing? The other Copy methods seem to 
do this.


David
-


hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/hotspot/webrev/ 



jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/jdk/webrev/

Incremental webrevs for your convenience:

hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/hotspot/webrev/ 



jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/jdk/webrev/ 





I have done some benchmarking of this code and for large copies (16MB+)
this outperforms the old Bits.c implementation by *30-100%* depending on
platform and exact element sizes! For smaller copies the additional
checks which are now performed hurt performance on client VMs (80-90% of
old impl), but with the server VMs I see performance on par with, or in
most cases 5-10% better than the old implementation. There's a
potentially statistically significant regression of ~3-4% for
elemSize=2, but for now I'm going to declare success. There's certainly
room for further improvements here, but this should at least do for
addressing the original problem.


I filed https://bugs.openjdk.java.net/browse/JDK-8149159 for moving the
checks for Unsafe.copyMemory to Java, and will work on that next. I also
filed https://bugs.openjdk.java.net/browse/JDK-8149162 to cover the
potential renaming of the Bits methods to have more informative names.
Finally, I filed https://bugs.openjdk.java.net/browse/JDK-8149163 to
look at improving the behavior of Unsafe.addressSize(), after having
spent too much time trying to understand why the performance of the new
U.copySwapMemory Java checks wasn't quite living up to my expectations
(spoiler alert: Unsafe.addressSize() is not intrinsified, so will always
result in a call into the VM/unsafe.cpp).


Finally, I - too - would like to see the copy-swap logic moved into
Java, and as I mentioned I played around with that first before I
decided to do the native implementation to address the immediate
problem. Looking forward to what you find Paul!

Cheers,
Mikael

On 2016-02-05 05:00, Paul Sandoz wrote:

Hi,

Nice use of C++ templates :-)

Overall looks good.

I too would prefer if we could move the argument checking out, perhaps
even to the point of requiring callers do that rather than providing
another method, for example for Buffer i think the arguments are known
to be valid? I think in either case it is important to improve the
documentation on the method stating the constraints on arguments,
atomicity guarantees etc.

I have a hunch that for the particular case of copying-with-swap for
buffers i could get this to work work efficiently using Unsafe (three
separate methods for each unit type of 2, 4 and 8 bytes), since IIUC
the range is bounded to be less than Integer.MAX_VALUE so an int loop
rather than a long loop can be used and therefore safe points checks
will not be placed within the loop.

However, i think what you have done is more generally applicable and
could be made intrinsic. It would be a nice at some future point if it
could be made a pure Java implementation and intrinsified where
appropriate.

—

John, regarding array mismatch there were issues with the efficiency
of the unrolled loops with Unsafe access. (Since the loops were int
bases there were no issues with safe point checks.) Roland recently
fixed that so now code is generated that is competitive with direct
array accesses. We drop into the stub intrinsic and leverage 128bits
or 256bits where supported. Interestingly it seems the unrolled loop
using Unsafe is now slightly faster than the stub using 128bit
registers. I don’t know if that is due to unluckly alignment, and/or
the stub needs to do 

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-09 Thread David Holmes

On 10/02/2016 1:42 PM, Mikael Vidstedt wrote:


Can I please get a quick review of these updated webrevs:


In terms of the incremental changes this looks fine. If you consider it 
all reviewed then nothing in the increments should change that.


But looking at the JDK code I have some follow up suggestions for 
copySwapMemory:
- document all parameters with @param and describe constraints on 
values/relationships
- specify @throws for all IllegalArgumentException and 
NullPointerException conditions

- add a descriptive error message when throwing IllegalArgumentException
- not sure NullPointerException is correct, rather than 
IllegalArgumentException. for null base with zero offset cases


Thanks,
David


hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/hotspot/webrev/

jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05/jdk/webrev/

incremental webrevs:

hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/hotspot/webrev/

jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.05.incr/jdk/webrev/


Changes:

* Added asserts in copy.cpp/conjoint_swap
* Correctness: Moved offset sign checks to only be performed if
corresponding base object is null, and added corresponding tests

I'm about to make additional changes in this same area, so unless this
last change is horribly broken I'm planning on pushing this and doing
any additional cleanup in the upcoming change(s).

Cheers,
Mikael


On 2016-02-07 16:14, David Holmes wrote:

On 6/02/2016 8:21 AM, Mikael Vidstedt wrote:


I fully agree that moving the arguments checking up to Java makes more
sense, and I've prepared new webrevs which do exactly that, including
changes to address the other feedback from David, John and others:


Shouldn't the lowest-level do_conjoint_swap routines at least check
preconditions with asserts to catch the cases where the calling Java
code has failed to do the right thing? The other Copy methods seem to
do this.

David
-


hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/hotspot/webrev/


jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/jdk/webrev/

Incremental webrevs for your convenience:

hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/hotspot/webrev/


jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/jdk/webrev/




I have done some benchmarking of this code and for large copies (16MB+)
this outperforms the old Bits.c implementation by *30-100%* depending on
platform and exact element sizes! For smaller copies the additional
checks which are now performed hurt performance on client VMs (80-90% of
old impl), but with the server VMs I see performance on par with, or in
most cases 5-10% better than the old implementation. There's a
potentially statistically significant regression of ~3-4% for
elemSize=2, but for now I'm going to declare success. There's certainly
room for further improvements here, but this should at least do for
addressing the original problem.


I filed https://bugs.openjdk.java.net/browse/JDK-8149159 for moving the
checks for Unsafe.copyMemory to Java, and will work on that next. I also
filed https://bugs.openjdk.java.net/browse/JDK-8149162 to cover the
potential renaming of the Bits methods to have more informative names.
Finally, I filed https://bugs.openjdk.java.net/browse/JDK-8149163 to
look at improving the behavior of Unsafe.addressSize(), after having
spent too much time trying to understand why the performance of the new
U.copySwapMemory Java checks wasn't quite living up to my expectations
(spoiler alert: Unsafe.addressSize() is not intrinsified, so will always
result in a call into the VM/unsafe.cpp).


Finally, I - too - would like to see the copy-swap logic moved into
Java, and as I mentioned I played around with that first before I
decided to do the native implementation to address the immediate
problem. Looking forward to what you find Paul!

Cheers,
Mikael

On 2016-02-05 05:00, Paul Sandoz wrote:

Hi,

Nice use of C++ templates :-)

Overall looks good.

I too would prefer if we could move the argument checking out, perhaps
even to the point of requiring callers do that rather than providing
another method, for example for Buffer i think the arguments are known
to be valid? I think in either case it is important to improve the
documentation on the method stating the constraints on arguments,
atomicity guarantees etc.

I have a hunch that for the particular case of copying-with-swap for
buffers i could get this to work work efficiently using Unsafe (three
separate methods for each unit type of 2, 4 and 8 bytes), since IIUC
the range is bounded to be less than Integer.MAX_VALUE so an int loop
rather than a long loop can be used and therefore safe points checks
will not be placed within the loop.

However, i think what you have done is more generally applicable and
could be made intrinsic. It would be a nice at some future point if it
could 

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-07 Thread David Holmes

On 6/02/2016 8:21 AM, Mikael Vidstedt wrote:


I fully agree that moving the arguments checking up to Java makes more
sense, and I've prepared new webrevs which do exactly that, including
changes to address the other feedback from David, John and others:


Shouldn't the lowest-level do_conjoint_swap routines at least check 
preconditions with asserts to catch the cases where the calling Java 
code has failed to do the right thing? The other Copy methods seem to do 
this.


David
-


hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/hotspot/webrev/

jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/jdk/webrev/

Incremental webrevs for your convenience:

hotspot:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/hotspot/webrev/

jdk:
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/jdk/webrev/



I have done some benchmarking of this code and for large copies (16MB+)
this outperforms the old Bits.c implementation by *30-100%* depending on
platform and exact element sizes! For smaller copies the additional
checks which are now performed hurt performance on client VMs (80-90% of
old impl), but with the server VMs I see performance on par with, or in
most cases 5-10% better than the old implementation. There's a
potentially statistically significant regression of ~3-4% for
elemSize=2, but for now I'm going to declare success. There's certainly
room for further improvements here, but this should at least do for
addressing the original problem.


I filed https://bugs.openjdk.java.net/browse/JDK-8149159 for moving the
checks for Unsafe.copyMemory to Java, and will work on that next. I also
filed https://bugs.openjdk.java.net/browse/JDK-8149162 to cover the
potential renaming of the Bits methods to have more informative names.
Finally, I filed https://bugs.openjdk.java.net/browse/JDK-8149163 to
look at improving the behavior of Unsafe.addressSize(), after having
spent too much time trying to understand why the performance of the new
U.copySwapMemory Java checks wasn't quite living up to my expectations
(spoiler alert: Unsafe.addressSize() is not intrinsified, so will always
result in a call into the VM/unsafe.cpp).


Finally, I - too - would like to see the copy-swap logic moved into
Java, and as I mentioned I played around with that first before I
decided to do the native implementation to address the immediate
problem. Looking forward to what you find Paul!

Cheers,
Mikael

On 2016-02-05 05:00, Paul Sandoz wrote:

Hi,

Nice use of C++ templates :-)

Overall looks good.

I too would prefer if we could move the argument checking out, perhaps
even to the point of requiring callers do that rather than providing
another method, for example for Buffer i think the arguments are known
to be valid? I think in either case it is important to improve the
documentation on the method stating the constraints on arguments,
atomicity guarantees etc.

I have a hunch that for the particular case of copying-with-swap for
buffers i could get this to work work efficiently using Unsafe (three
separate methods for each unit type of 2, 4 and 8 bytes), since IIUC
the range is bounded to be less than Integer.MAX_VALUE so an int loop
rather than a long loop can be used and therefore safe points checks
will not be placed within the loop.

However, i think what you have done is more generally applicable and
could be made intrinsic. It would be a nice at some future point if it
could be made a pure Java implementation and intrinsified where
appropriate.

—

John, regarding array mismatch there were issues with the efficiency
of the unrolled loops with Unsafe access. (Since the loops were int
bases there were no issues with safe point checks.) Roland recently
fixed that so now code is generated that is competitive with direct
array accesses. We drop into the stub intrinsic and leverage 128bits
or 256bits where supported. Interestingly it seems the unrolled loop
using Unsafe is now slightly faster than the stub using 128bit
registers. I don’t know if that is due to unluckly alignment, and/or
the stub needs to do some manual unrolling. In terms of code-cache
efficiency the intrinsic is better.

Paul.






On 4 Feb 2016, at 06:27, John Rose  wrote:

On Feb 2, 2016, at 11:25 AM, Mikael Vidstedt
 wrote:

Please review this change which introduces a Copy::conjoint_swap and
an Unsafe.copySwapMemory method to call it from Java, along with the
necessary changes to have java.nio.Bits call it instead of the
Bits.c code.

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/hotspot/webrev/

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/jdk/webrev/


This is very good.

I have some nit-picks:

These days, when we introduce a new intrinsic (@HSIntrCand),
we write the argument checking code separately in a non-intrinsic
bytecode method.  In this case, we don't (yet) have an intrinsic
binding for U.copy*, but we 

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-05 Thread Mikael Vidstedt


I fully agree that moving the arguments checking up to Java makes more 
sense, and I've prepared new webrevs which do exactly that, including 
changes to address the other feedback from David, John and others:


hotspot: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/hotspot/webrev/
jdk: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04/jdk/webrev/


Incremental webrevs for your convenience:

hotspot: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/hotspot/webrev/
jdk: 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.04.incr/jdk/webrev/



I have done some benchmarking of this code and for large copies (16MB+) 
this outperforms the old Bits.c implementation by *30-100%* depending on 
platform and exact element sizes! For smaller copies the additional 
checks which are now performed hurt performance on client VMs (80-90% of 
old impl), but with the server VMs I see performance on par with, or in 
most cases 5-10% better than the old implementation. There's a 
potentially statistically significant regression of ~3-4% for 
elemSize=2, but for now I'm going to declare success. There's certainly 
room for further improvements here, but this should at least do for 
addressing the original problem.



I filed https://bugs.openjdk.java.net/browse/JDK-8149159 for moving the 
checks for Unsafe.copyMemory to Java, and will work on that next. I also 
filed https://bugs.openjdk.java.net/browse/JDK-8149162 to cover the 
potential renaming of the Bits methods to have more informative names. 
Finally, I filed https://bugs.openjdk.java.net/browse/JDK-8149163 to 
look at improving the behavior of Unsafe.addressSize(), after having 
spent too much time trying to understand why the performance of the new 
U.copySwapMemory Java checks wasn't quite living up to my expectations 
(spoiler alert: Unsafe.addressSize() is not intrinsified, so will always 
result in a call into the VM/unsafe.cpp).



Finally, I - too - would like to see the copy-swap logic moved into 
Java, and as I mentioned I played around with that first before I 
decided to do the native implementation to address the immediate 
problem. Looking forward to what you find Paul!


Cheers,
Mikael

On 2016-02-05 05:00, Paul Sandoz wrote:

Hi,

Nice use of C++ templates :-)

Overall looks good.

I too would prefer if we could move the argument checking out, perhaps even to 
the point of requiring callers do that rather than providing another method, 
for example for Buffer i think the arguments are known to be valid? I think in 
either case it is important to improve the documentation on the method stating 
the constraints on arguments, atomicity guarantees etc.

I have a hunch that for the particular case of copying-with-swap for buffers i 
could get this to work work efficiently using Unsafe (three separate methods 
for each unit type of 2, 4 and 8 bytes), since IIUC the range is bounded to be 
less than Integer.MAX_VALUE so an int loop rather than a long loop can be used 
and therefore safe points checks will not be placed within the loop.

However, i think what you have done is more generally applicable and could be 
made intrinsic. It would be a nice at some future point if it could be made a 
pure Java implementation and intrinsified where appropriate.

—

John, regarding array mismatch there were issues with the efficiency of the 
unrolled loops with Unsafe access. (Since the loops were int bases there were 
no issues with safe point checks.) Roland recently fixed that so now code is 
generated that is competitive with direct array accesses. We drop into the stub 
intrinsic and leverage 128bits or 256bits where supported. Interestingly it 
seems the unrolled loop using Unsafe is now slightly faster than the stub using 
128bit registers. I don’t know if that is due to unluckly alignment, and/or the 
stub needs to do some manual unrolling. In terms of code-cache efficiency the 
intrinsic is better.

Paul.






On 4 Feb 2016, at 06:27, John Rose  wrote:

On Feb 2, 2016, at 11:25 AM, Mikael Vidstedt  wrote:

Please review this change which introduces a Copy::conjoint_swap and an 
Unsafe.copySwapMemory method to call it from Java, along with the necessary 
changes to have java.nio.Bits call it instead of the Bits.c code.

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/hotspot/webrev/
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/jdk/webrev/

This is very good.

I have some nit-picks:

These days, when we introduce a new intrinsic (@HSIntrCand),
we write the argument checking code separately in a non-intrinsic
bytecode method.  In this case, we don't (yet) have an intrinsic
binding for U.copy*, but we might in the future.  (C intrinsifies
memcpy, which is a precedent.)  In any case, I would prefer
if we could structure the argument checking code in a similar
way, with Unsafe.java containing both copySwapMemory
and a private 

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-05 Thread Paul Sandoz
Hi,

Nice use of C++ templates :-)

Overall looks good.

I too would prefer if we could move the argument checking out, perhaps even to 
the point of requiring callers do that rather than providing another method, 
for example for Buffer i think the arguments are known to be valid? I think in 
either case it is important to improve the documentation on the method stating 
the constraints on arguments, atomicity guarantees etc.

I have a hunch that for the particular case of copying-with-swap for buffers i 
could get this to work work efficiently using Unsafe (three separate methods 
for each unit type of 2, 4 and 8 bytes), since IIUC the range is bounded to be 
less than Integer.MAX_VALUE so an int loop rather than a long loop can be used 
and therefore safe points checks will not be placed within the loop.

However, i think what you have done is more generally applicable and could be 
made intrinsic. It would be a nice at some future point if it could be made a 
pure Java implementation and intrinsified where appropriate.

—

John, regarding array mismatch there were issues with the efficiency of the 
unrolled loops with Unsafe access. (Since the loops were int bases there were 
no issues with safe point checks.) Roland recently fixed that so now code is 
generated that is competitive with direct array accesses. We drop into the stub 
intrinsic and leverage 128bits or 256bits where supported. Interestingly it 
seems the unrolled loop using Unsafe is now slightly faster than the stub using 
128bit registers. I don’t know if that is due to unluckly alignment, and/or the 
stub needs to do some manual unrolling. In terms of code-cache efficiency the 
intrinsic is better.

Paul.





> On 4 Feb 2016, at 06:27, John Rose  wrote:
> 
> On Feb 2, 2016, at 11:25 AM, Mikael Vidstedt  
> wrote:
>> Please review this change which introduces a Copy::conjoint_swap and an 
>> Unsafe.copySwapMemory method to call it from Java, along with the necessary 
>> changes to have java.nio.Bits call it instead of the Bits.c code.
>> 
>> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/hotspot/webrev/
>> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/jdk/webrev/
> 
> This is very good.
> 
> I have some nit-picks:
> 
> These days, when we introduce a new intrinsic (@HSIntrCand),
> we write the argument checking code separately in a non-intrinsic
> bytecode method.  In this case, we don't (yet) have an intrinsic
> binding for U.copy*, but we might in the future.  (C intrinsifies
> memcpy, which is a precedent.)  In any case, I would prefer
> if we could structure the argument checking code in a similar
> way, with Unsafe.java containing both copySwapMemory
> and a private copySwapMemory0.  Then we can JIT-optimize
> the safety checks.
> 
> You might as well extend the same treatment to the pre-existing
> copyMemory call.  The most important check (and the only one
> in U.copyMemory) is to ensure that the size_t operand has not
> wrapped around from a Java negative value to a crazy-large
> size_t value.  That's the low-hanging fruit.  Checking the pointers
> (for null or oob) is more problematic, of course.  Checking consistency
> around elemSize is cheap and easy, so I agree that the U.copySM
> should do that work also.  Basically, Unsafe can do very basic
> checks if there is a tricky user model to enforce, but it mustn't
> "sign up" to guard the user against all errors.
> 
> Rule of thumb:  Unsafe calls don't throw NPEs, they just SEGV.
> And the rare bit that *does* throw (IAE usually) should be placed
> into Unsafe.java, not unsafe.cpp.  (The best-practice rule for putting
> argument checking code outside of the intrinsic is a newer one,
> so Unsafe code might not always do this.)
> 
> The comment "Generalizing it would be reasonable, but requires
> card marking" is bogus, since we never byte-swap managed pointers.
> 
> The test logic will flow a little smoother if your GenericPointer guy,
> the onHeap version, stores the appropriate array base offset in his offset 
> field.
> You won't have to mention p.isOnHeap nearly so much, and the code will
> set a slightly better example.
> 
> The VM_ENTRY_BASE_FROM_LEAF macro is really cool.
> 
> The C++ template code is cool also.  It reminds me of the kind
> of work Gosling's "Ace" processor could do, but now it's mainstreamed
> for all to use in C++.  We're going to get some of that goodness
> in Project Valhalla with specialization logic.
> 
> I find it amazing that the right way to code this in C is to
> use memcpy for unaligned accesses and byte peek/poke
> into registers for byte-swapping operators.  I'm glad we
> can write this code *once* for the JVM and JDK.
> 
> Possible future work:  If we can get a better handle on
> writing vectorizable loops from Java, including Unsafe-based
> ones, we can move some of the C code back up to Java.
> Perhaps U.copy* calls for very short lengths deserved to
> be 

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-04 Thread Andrew Haley
On 02/02/2016 07:25 PM, Mikael Vidstedt wrote:
> Please review this change which introduces a Copy::conjoint_swap and an 
> Unsafe.copySwapMemory method to call it from Java, along with the 
> necessary changes to have java.nio.Bits call it instead of the Bits.c code.
> 
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/hotspot/webrev/
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/jdk/webrev/

One other little thing: why are the byte-swapping methods in class nio.Bits
not called copySwapSomething?  e.g.:

 826 /**
 827  * Copy and byte swap 16 bit elements from off-heap memory to a heap 
array
 828  *
 829  * @param srcAddr
 830  *source address
 831  * @param dst
 832  *destination array, must be a 16-bit primitive array type
 833  * @param dstPos
 834  *byte offset within the destination array of the first 
element to write
 835  * @param length
 836  *number of bytes to copy
 837  */
 838 static void copyToCharArray(long srcAddr, Object dst, long dstPos, 
long length) {
 839 unsafe.copySwapMemory(null, srcAddr, dst, 
unsafe.arrayBaseOffset(dst.getClass()) + dstPos, length, 2);
 840 }

Andrew.




Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-04 Thread Mikael Vidstedt



On 2016-02-04 04:22, Andrew Haley wrote:

On 02/02/2016 07:25 PM, Mikael Vidstedt wrote:

Please review this change which introduces a Copy::conjoint_swap and an
Unsafe.copySwapMemory method to call it from Java, along with the
necessary changes to have java.nio.Bits call it instead of the Bits.c code.

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/hotspot/webrev/
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/jdk/webrev/

One other little thing: why are the byte-swapping methods in class nio.Bits
not called copySwapSomething?  e.g.:


That sure would be a better name, wouldn't it? I'm not going to be 
changing the Bits method names as part of this change, but it does seem 
like a very reasonable follow-up enhancement.


Cheers,
Mikael



Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-03 Thread Andrew Haley
On 02/02/16 19:25, Mikael Vidstedt wrote:
> Please review this change which introduces a Copy::conjoint_swap and an 
> Unsafe.copySwapMemory method to call it from Java, along with the 
> necessary changes to have java.nio.Bits call it instead of the Bits.c code.

There doesn't seem to be any way to use a byte-swap instruction
in the swapping code.  This will make it unnecessarily slow.

Andrew.



Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-03 Thread David Holmes

Hi Mikael,

Can't really comment on the bit-twiddling details.

A couple of minor style nits:

- don't put "return" on a line by itself, include the first part of the 
return expression

- spaces after commas in template definitions/instantiation

The JVM_ENTRY_FROM_LEAF etc was a little mind twisting but seems okay.

Otherwise hotspot and JDK code appear okay.

Thanks,
David

On 3/02/2016 5:25 AM, Mikael Vidstedt wrote:


Please review this change which introduces a Copy::conjoint_swap and an
Unsafe.copySwapMemory method to call it from Java, along with the
necessary changes to have java.nio.Bits call it instead of the Bits.c code.

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/hotspot/webrev/

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/jdk/webrev/

On the jdk/ side I don't think there should be a lot of surprises.
Bits.c is gone and that required a mapfile-vers to be changed
accordingly. I also added a relatively extensive
jdk/internal/misc/Unsafe/CopySwap.java test which exercises all the
various copySwap configurations and verifies that the resulting data is
correct. There are also a handful of negative tests in there.

On the hotspot/ side:

* the copy logic in copy.cpp is leveraging templates to help the C++
compiler produce tight copy loops for the various configurations
{element type, copy direction, src aligned, dst aligned}.
* Unsafe_CopySwapMemory is a leaf to not stall safe points more than
necessary. Only if needed (THROW, copy involves heap objects) will it
enter VM using a new JVM_ENTRY_FROM_LEAF macro.
* JVM_ENTRY_FROM_LEAF calls a new VM_ENTRY_BASE_FROM_LEAF helper macro,
which mimics what VM_ENTRY_BASE does, but also does a
debug_only(ResetNoHandleMark __rnhm;) - this is because
JVM_LEAF/VM_LEAF_BASE does debug_only(NoHandleMark __hm;).

I'm in the process of getting the last performance numbers, but from
what I've seen so far this will outperform the earlier implementation.

Cheers,
Mikeal

On 2016-01-27 17:13, Mikael Vidstedt wrote:


Just an FYI:

I'm working on moving all of this to the Hotspot Copy class and
bridging to it via jdk.internal.misc.Unsafe, removing Bits.c
altogether. The implementation is working, and the preliminary
performance numbers beat the pants off of any of the suggested Bits.c
implementations (yay!).

I'm currently in the progress of getting some unit tests in place for
it all to make sure it covers all the corner cases and then I'll run
some real benchmarks to see if it actually lives up to the expectations.

Cheers,
Mikael

On 2016-01-26 11:13, John Rose wrote:

On Jan 26, 2016, at 11:08 AM, Andrew Haley  wrote:

On 01/26/2016 07:04 PM, John Rose wrote:

Unsafe.copyMemory bottoms out to Copy::conjoint_memory_atomic.
IMO that's a better starting point than memcpy.  Perhaps it can be
given an additional parameter (or overloading) to specify a swap size.

OK, but conjoint_memory_atomic doesn't guarantee that destination
words won't be torn if their source is misaligned: in fact it
guarantees that they will will be.

That's a good point, and argues for a new function with the
stronger guarantee.  Actually, it would be perfectly reasonable
to strengthen the guarantee on the existing function.  I don't
think anyone will care about the slight performance change,
especially since it is probably favorable.  Since it's Unsafe,
they are not supposed to care, either.

— John






Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-03 Thread Mikael Vidstedt


On 2016-02-03 01:43, Andrew Haley wrote:

On 02/02/16 19:25, Mikael Vidstedt wrote:

Please review this change which introduces a Copy::conjoint_swap and an
Unsafe.copySwapMemory method to call it from Java, along with the
necessary changes to have java.nio.Bits call it instead of the Bits.c code.

There doesn't seem to be any way to use a byte-swap instruction
in the swapping code.  This will make it unnecessarily slow.


To be clear, this isn't trying to provide the absolutely most optimal 
copy+swap implementation. It's trying to fix the Bits.c unaligned bug 
and pave the way for further improvements. Further performance 
improvements here are certainly possible, but at this point I'm happy as 
long as the performance is on par (or better) with the Bits.c 
implementation it's replacing.


That said, at least gcc seems to recognize the byte swapping pattern and 
does emit a bswap on linux-x64. I'm not sure about the other platforms 
though.


Cheers,
Mikael



Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-03 Thread Andrew Haley
On 02/03/2016 04:13 PM, Mikael Vidstedt wrote:
> 
> On 2016-02-03 01:43, Andrew Haley wrote:
>> On 02/02/16 19:25, Mikael Vidstedt wrote:
>>> Please review this change which introduces a Copy::conjoint_swap and an
>>> Unsafe.copySwapMemory method to call it from Java, along with the
>>> necessary changes to have java.nio.Bits call it instead of the Bits.c code.
>> There doesn't seem to be any way to use a byte-swap instruction
>> in the swapping code.  This will make it unnecessarily slow.
> 
> To be clear, this isn't trying to provide the absolutely most optimal 
> copy+swap implementation. It's trying to fix the Bits.c unaligned bug 
> and pave the way for further improvements. Further performance 
> improvements here are certainly possible, but at this point I'm happy as 
> long as the performance is on par (or better) with the Bits.c 
> implementation it's replacing.

Got it, sure.  It's just nice to be able to replace low-level routines
with platform ones.

> That said, at least gcc seems to recognize the byte swapping pattern and 
> does emit a bswap on linux-x64. I'm not sure about the other platforms 
> though.

Oh, very nice.  Right, I'll check that once your patch does in.

Andrew.




Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-02-03 Thread John Rose
On Feb 2, 2016, at 11:25 AM, Mikael Vidstedt  wrote:
> Please review this change which introduces a Copy::conjoint_swap and an 
> Unsafe.copySwapMemory method to call it from Java, along with the necessary 
> changes to have java.nio.Bits call it instead of the Bits.c code.
> 
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/hotspot/webrev/
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.03/jdk/webrev/

This is very good.

I have some nit-picks:

These days, when we introduce a new intrinsic (@HSIntrCand),
we write the argument checking code separately in a non-intrinsic
bytecode method.  In this case, we don't (yet) have an intrinsic
binding for U.copy*, but we might in the future.  (C intrinsifies
memcpy, which is a precedent.)  In any case, I would prefer
if we could structure the argument checking code in a similar
way, with Unsafe.java containing both copySwapMemory
and a private copySwapMemory0.  Then we can JIT-optimize
the safety checks.

You might as well extend the same treatment to the pre-existing
copyMemory call.  The most important check (and the only one
in U.copyMemory) is to ensure that the size_t operand has not
wrapped around from a Java negative value to a crazy-large
size_t value.  That's the low-hanging fruit.  Checking the pointers
(for null or oob) is more problematic, of course.  Checking consistency
around elemSize is cheap and easy, so I agree that the U.copySM
should do that work also.  Basically, Unsafe can do very basic
checks if there is a tricky user model to enforce, but it mustn't
"sign up" to guard the user against all errors.

Rule of thumb:  Unsafe calls don't throw NPEs, they just SEGV.
And the rare bit that *does* throw (IAE usually) should be placed
into Unsafe.java, not unsafe.cpp.  (The best-practice rule for putting
argument checking code outside of the intrinsic is a newer one,
so Unsafe code might not always do this.)

The comment "Generalizing it would be reasonable, but requires
card marking" is bogus, since we never byte-swap managed pointers.

The test logic will flow a little smoother if your GenericPointer guy,
the onHeap version, stores the appropriate array base offset in his offset 
field.
You won't have to mention p.isOnHeap nearly so much, and the code will
set a slightly better example.

The VM_ENTRY_BASE_FROM_LEAF macro is really cool.

The C++ template code is cool also.  It reminds me of the kind
of work Gosling's "Ace" processor could do, but now it's mainstreamed
for all to use in C++.  We're going to get some of that goodness
in Project Valhalla with specialization logic.

I find it amazing that the right way to code this in C is to
use memcpy for unaligned accesses and byte peek/poke
into registers for byte-swapping operators.  I'm glad we
can write this code *once* for the JVM and JDK.

Possible future work:  If we can get a better handle on
writing vectorizable loops from Java, including Unsafe-based
ones, we can move some of the C code back up to Java.
Perhaps U.copy* calls for very short lengths deserved to
be broken out into small loops of U.get/put* (with alignment).
I think you experimented with this, and there were problems
with the JIT putting fail-safe memory barriers between
U.get/put* calls.  Paul's work on Array.mismatch ran into
similar issues, with the right answer being to write manual
vector code in assembly.

Anyway, you can count me as a reviewer.

Thanks,

— John

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-27 Thread Mikael Vidstedt


Just an FYI:

I'm working on moving all of this to the Hotspot Copy class and bridging 
to it via jdk.internal.misc.Unsafe, removing Bits.c altogether. The 
implementation is working, and the preliminary performance numbers beat 
the pants off of any of the suggested Bits.c implementations (yay!).


I'm currently in the progress of getting some unit tests in place for it 
all to make sure it covers all the corner cases and then I'll run some 
real benchmarks to see if it actually lives up to the expectations.


Cheers,
Mikael

On 2016-01-26 11:13, John Rose wrote:

On Jan 26, 2016, at 11:08 AM, Andrew Haley  wrote:

On 01/26/2016 07:04 PM, John Rose wrote:

Unsafe.copyMemory bottoms out to Copy::conjoint_memory_atomic.
IMO that's a better starting point than memcpy.  Perhaps it can be
given an additional parameter (or overloading) to specify a swap size.

OK, but conjoint_memory_atomic doesn't guarantee that destination
words won't be torn if their source is misaligned: in fact it
guarantees that they will will be.

That's a good point, and argues for a new function with the
stronger guarantee.  Actually, it would be perfectly reasonable
to strengthen the guarantee on the existing function.  I don't
think anyone will care about the slight performance change,
especially since it is probably favorable.  Since it's Unsafe,
they are not supposed to care, either.

— John




Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-26 Thread Alan Bateman


On 26/01/2016 04:57, Mikael Vidstedt wrote:


I've finally found some time to return to this and have a new version 
of the patch which looks more promising:


http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.02/webrev/

This uses memcpy to read/write the native data and the preliminary 
benchmark numbers on linux/x64 shows the expected performance. I'll 
work on verifying that it doesn't impact other platforms negatively 
over the next days/weeks.


Btw, I also played around with implementing it in pure Java, but there 
are some issues getting C2 to correctly vectorize the loop given the 
native data accesses, so the native code seems to be needed for now.
This looks good and maybe it was a good thing that the compiler upgrade 
ran into this. We eliminated most of the native methods in this area a 
few years ago but had to leave the byte swapping methods. It would be 
good to eliminate those some day.


-Alan


Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-26 Thread Andrew Haley
On 26/01/16 04:57, Mikael Vidstedt wrote:
> 
> I've finally found some time to return to this and have a new version of 
> the patch which looks more promising:
> 
> http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.02/webrev/
> 
> This uses memcpy to read/write the native data and the preliminary 
> benchmark numbers on linux/x64 shows the expected performance. I'll work 
> on verifying that it doesn't impact other platforms negatively over the 
> next days/weeks.

I agree that memcpy is the right thing to use.  It's portable and is
inlined well on production-quality C compilers.

Andrew.



Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-26 Thread Brian Burkhalter


On 1/26/16 4:28 AM, Alan Bateman wrote:


On 26/01/2016 04:57, Mikael Vidstedt wrote:


I've finally found some time to return to this and have a new version 
of the patch which looks more promising:


http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.02/webrev/

This uses memcpy to read/write the native data and the preliminary 
benchmark numbers on linux/x64 shows the expected performance. I'll 
work on verifying that it doesn't impact other platforms negatively 
over the next days/weeks.


Btw, I also played around with implementing it in pure Java, but 
there are some issues getting C2 to correctly vectorize the loop 
given the native data accesses, so the native code seems to be needed 
for now.
This looks good and maybe it was a good thing that the compiler 
upgrade ran into this. We eliminated most of the native methods in 
this area a few years ago but had to leave the byte swapping methods. 
It would be good to eliminate those some day.
I concur that this looks OK. Don't forget to update the latest copyright 
year to 2016.


Thanks,

Brian


Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-26 Thread Andrew Haley
On 01/26/2016 06:32 PM, John Rose wrote:
> On Jan 26, 2016, at 1:04 AM, Andrew Haley  wrote:
>>
>> I agree that memcpy is the right thing to use.  It's portable and is
>> inlined well on production-quality C compilers.
> 
> But it is not strong enough to uphold the Java memory model,
> because it is allows to copy byte-wise, which can tear shorts,
> ints, or longs, creating illegal race states.
> 
> So we try to avoid memcpy when we can.

Yes, I see.  I guess the best we can do is something like the fun and
games in Unsafe.{get, put}LongUnaligned(), which always do the best
they can to align everything.

Andrew.


Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-26 Thread John Rose
On Jan 26, 2016, at 10:48 AM, Andrew Haley  wrote:
> 
> On 01/26/2016 06:32 PM, John Rose wrote:
>> On Jan 26, 2016, at 1:04 AM, Andrew Haley  wrote:
>>> 
>>> I agree that memcpy is the right thing to use.  It's portable and is
>>> inlined well on production-quality C compilers.
>> 
>> But it is not strong enough to uphold the Java memory model,
>> because it is allows to copy byte-wise, which can tear shorts,
>> ints, or longs, creating illegal race states.
>> 
>> So we try to avoid memcpy when we can.
> 
> Yes, I see.  I guess the best we can do is something like the fun and
> games in Unsafe.{get, put}LongUnaligned(), which always do the best
> they can to align everything.

Unsafe.copyMemory bottoms out to Copy::conjoint_memory_atomic.
IMO that's a better starting point than memcpy.  Perhaps it can be
given an additional parameter (or overloading) to specify a swap size.

— John

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-26 Thread Andrew Haley
On 01/26/2016 07:04 PM, John Rose wrote:
> Unsafe.copyMemory bottoms out to Copy::conjoint_memory_atomic.
> IMO that's a better starting point than memcpy.  Perhaps it can be
> given an additional parameter (or overloading) to specify a swap size.

OK, but conjoint_memory_atomic doesn't guarantee that destination
words won't be torn if their source is misaligned: in fact it
guarantees that they will will be.

Andrew.


Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-26 Thread John Rose
On Jan 26, 2016, at 11:08 AM, Andrew Haley  wrote:
> 
> On 01/26/2016 07:04 PM, John Rose wrote:
>> Unsafe.copyMemory bottoms out to Copy::conjoint_memory_atomic.
>> IMO that's a better starting point than memcpy.  Perhaps it can be
>> given an additional parameter (or overloading) to specify a swap size.
> 
> OK, but conjoint_memory_atomic doesn't guarantee that destination
> words won't be torn if their source is misaligned: in fact it
> guarantees that they will will be.

That's a good point, and argues for a new function with the
stronger guarantee.  Actually, it would be perfectly reasonable
to strengthen the guarantee on the existing function.  I don't
think anyone will care about the slight performance change,
especially since it is probably favorable.  Since it's Unsafe,
they are not supposed to care, either.

— John

Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2016-01-25 Thread Mikael Vidstedt


I've finally found some time to return to this and have a new version of 
the patch which looks more promising:


http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.02/webrev/

This uses memcpy to read/write the native data and the preliminary 
benchmark numbers on linux/x64 shows the expected performance. I'll work 
on verifying that it doesn't impact other platforms negatively over the 
next days/weeks.


Btw, I also played around with implementing it in pure Java, but there 
are some issues getting C2 to correctly vectorize the loop given the 
native data accesses, so the native code seems to be needed for now.


Cheers,
Mikael

On 2015-11-25 13:32, Mikael Vidstedt wrote:


Have you looked anything at the performance of the generated code? As 
you may have seen I was playing around with an alternative 
implementation[1] which has the benefit of being pure C++ without 
compiler specific hints. That said, when I did some initial 
benchmarking of that it did seem like the performance impact was 
significant. I didn't have time to look at more in detail why, and 
will not have time to return to that until late next week earliest. It 
would be interesting to understand what type of performance you see 
with your patch.


Cheers,
Mikael


[1] 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.01/webrev/jdk.patch


On 2015-11-25 11:42, Coleen Phillimore wrote:

Sending to core-libs mailing list.

On 11/25/15 2:19 PM, Alexander Smundak wrote:

Please take a look at
http://cr.openjdk.java.net/~asmundak/8141491/jdk/webrev.00
that fixes the problem.

It utilizes the ability of some (GCC and Clang) to declare data
alignment explicitly.
I have verified it works on x86_64 Linux by running
jdk/test/java/nio/Buffer/Basic.java test

I need a sponsor.

Sasha








Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2015-12-01 Thread Mikael Vidstedt


This is as far as I got before I got interrupted:

http://cr.openjdk.java.net/~mikael/NioBenchmark.java

I haven't had time yet to verify that the benchmark code even measures 
the right thing, much less figure out why I get the performance impact 
with my fix. I can see many reasons why that could be the case, but it 
would be good to know if it's something trivial which can be easily 
fixed. In general, it sure would be nice to make this code behave and 
perform without compiler specific annotations, especially given that 
reliance on unaligned memory accesses and the cast specifically is 
sketchy at best.


Cheers,
Mikael

On 2015-11-30 10:13, Alexander Smundak wrote:

On Wed, Nov 25, 2015 at 2:52 PM,   wrote:

Have you looked anything at the performance of the generated code?

No. I looked at the emitted code, saw 'MOVQDU' instruction being used
(it was 'MOVQDA' before that resulted in alignment error), and concluded
  that the compiler knows what it's doing :-)

It would be interesting to understand what type of performance you see with
your patch.

If you have specific benchmark in mind, I am willing to run it.

Sasha




Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2015-11-30 Thread Alexander Smundak
On Wed, Nov 25, 2015 at 2:52 PM,   wrote:
> Have you looked anything at the performance of the generated code?
No. I looked at the emitted code, saw 'MOVQDU' instruction being used
(it was 'MOVQDA' before that resulted in alignment error), and concluded
 that the compiler knows what it's doing :-)
> It would be interesting to understand what type of performance you see with
> your patch.
If you have specific benchmark in mind, I am willing to run it.

Sasha


Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2015-11-25 Thread Coleen Phillimore

Sending to core-libs mailing list.

On 11/25/15 2:19 PM, Alexander Smundak wrote:

Please take a look at
http://cr.openjdk.java.net/~asmundak/8141491/jdk/webrev.00
that fixes the problem.

It utilizes the ability of some (GCC and Clang) to declare data
alignment explicitly.
I have verified it works on x86_64 Linux by running
jdk/test/java/nio/Buffer/Basic.java test

I need a sponsor.

Sasha




Re: RFR JDK-8141491: Unaligned memory access in Bits.c

2015-11-25 Thread Mikael Vidstedt


Have you looked anything at the performance of the generated code? As 
you may have seen I was playing around with an alternative 
implementation[1] which has the benefit of being pure C++ without 
compiler specific hints. That said, when I did some initial benchmarking 
of that it did seem like the performance impact was significant. I 
didn't have time to look at more in detail why, and will not have time 
to return to that until late next week earliest. It would be interesting 
to understand what type of performance you see with your patch.


Cheers,
Mikael


[1] 
http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.01/webrev/jdk.patch


On 2015-11-25 11:42, Coleen Phillimore wrote:

Sending to core-libs mailing list.

On 11/25/15 2:19 PM, Alexander Smundak wrote:

Please take a look at
http://cr.openjdk.java.net/~asmundak/8141491/jdk/webrev.00
that fixes the problem.

It utilizes the ability of some (GCC and Clang) to declare data
alignment explicitly.
I have verified it works on x86_64 Linux by running
jdk/test/java/nio/Buffer/Basic.java test

I need a sponsor.

Sasha