Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-17 Thread Andrew Haley
On 6/17/19 8:46 AM, Nick Gasson wrote:
> Thanks Andrew. Please double check this final webrev is ok to push:
> 
> http://cr.openjdk.java.net/~ngasson/8224851/webrev.2/
> 
> It's webrev.1 plus Kim's review comments.

OK, thanks.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-17 Thread Nick Gasson

On 15/06/2019 01:20, Andrew Haley wrote:


OK, that's all good. Thanks.



Thanks Andrew. Please double check this final webrev is ok to push:

http://cr.openjdk.java.net/~ngasson/8224851/webrev.2/

It's webrev.1 plus Kim's review comments.


Nick


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-13 Thread Andrew Haley
On 6/12/19 10:35 AM, Nick Gasson wrote:
> I also replaced the call to _get_previous_fp() in os::current_frame() with
> 
>*(intptr_t **)__builtin_frame_address(0);
> 
> As it generates the same code and avoids the `register intptr_t **fp 
> __asm__ (SPELL_REG_FP);' declaration which clang doesn't support. Also 
> the following comment in _get_previous_fp seems to be wrong:
> 
>// fp is for this frame (_get_previous_fp). We want the fp for the
>// caller of os::current_frame*(), so go up two frames. However, for
>// optimized builds, _get_previous_fp() will be inlined, so only go
>// up 1 frame in that case.
>#ifdef _NMT_NOINLINE_
>  return **(intptr_t***)fp;
>#else
>  return *fp;
>#endif

All of this seems horribly fragile. I'll have a look at what this is supposed
to be used for and think about a stable and robust way to do it.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-13 Thread Nick Gasson

Hi Kim,

Thanks for the feedback. I'll make these changes.



Is there a reason to conditionalize use of __atomic_add_fetch for
clang and continue to use __sync_add_and_fetch for gcc, rather than
using __atomic_add_fetch unconditionally?


I think I did it that way because I didn't want to affect the build with 
gcc, but actually gcc compiles this to the same instructions with either 
variant so we can get rid of the #ifdef.




Is __ATOMIC_RELEASE and a following FULL_MEM_BARRIER the right usage
to get the equivalent of __sync_add_and_fetch?  Or should
__ATOMIC_SEQ_CST be used?  Or should the latter be actively avoided?
I remember some discussion and questions in that area, about how to
implement memory_order_seq_cst on ARM processors, but wasn't paying
close attention since I don't deal with ARM much.



It generates the same instruction sequence as the existing 
__sync_fetch_and_add.


The __sync_XXX builtins have a full barrier at the end [1], so GCC 
compiles this to:


.L2:
ldxrw2, [x1]
add w2, w2, w0
stlxr   w3, w2, [x1]; Store release
cbnzw3, .L2
dmb ish ; Full barrier
mov w0, w2
ret

Whereas using the __atomic builtin with __ATOMIC_SEQ_CST we get two 
half-barriers like this:


.L2:
ldaxr   w2, [x1]; Load acquire
add w2, w2, w0
stlxr   w3, w2, [x1]; Store release
cbnzw3, .L2
mov w0, w2
ret


Thanks,
Nick


[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-12 Thread Kim Barrett
> On Jun 12, 2019, at 5:35 AM, Nick Gasson  wrote:
> 
> Hi Andrew,
> 
> Sorry for the delay, I've put an updated webrev here:
> 
> http://cr.openjdk.java.net/~ngasson/8224851/webrev.1/

Since I've been commenting on some parts, I decided I might as well
look at all of it.  So here's a review.  I don't need to re-review any
of the suggested changes.

--
src/hotspot/cpu/aarch64/frame_aarch64.cpp
 772 reg_map = (RegisterMap*)os::malloc(sizeof(RegisterMap), mtNone);

Use NEW_C_HEAP_OBJ(RegisterMap, mtNone)

--
src/hotspot/cpu/aarch64/macroAssembler_aarch64_trig.cpp
 384 fcmpd(v26, 0.0);  // if NE then jx == 2. 
else it's 1 or 0

and 

 388 fcmpd(v7, 0.0);   // v7 == 0 => jx = 0. 
Else jx = 1

EOL comments no longer aligned with others nearby.

--
src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
 180   if ((p = strchr(buf, ':'))) {

Hotspot Coding Style says not to use implicit bool, instead using
explicit comparisons, e.g. this should be

 180   if ((p = strchr(buf, ':')) != NULL) {

That would have avoided the warning in the first place.

And yes, I know there are lots of violations of that guideline.

--
src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp
  43 #ifdef __clang__
  44 D res = __atomic_add_fetch(dest, add_value, __ATOMIC_RELEASE);
  45 FULL_MEM_BARRIER;
  46 return res;
  47 #else

Is there a reason to conditionalize use of __atomic_add_fetch for
clang and continue to use __sync_add_and_fetch for gcc, rather than
using __atomic_add_fetch unconditionally?

Is __ATOMIC_RELEASE and a following FULL_MEM_BARRIER the right usage
to get the equivalent of __sync_add_and_fetch?  Or should
__ATOMIC_SEQ_CST be used?  Or should the latter be actively avoided?
I remember some discussion and questions in that area, about how to
implement memory_order_seq_cst on ARM processors, but wasn't paying
close attention since I don't deal with ARM much.

Andrew?  I'll happily defer to you here.

--
src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp

I didn't review the stuff around os::current_stack_pointer and
os::current_frame.

--
src/hotspot/os_cpu/linux_aarch64/copy_linux_aarch64.s
 162 prfum   pldl1keep, [s, #-256]

I didn't review this change.

--



Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-12 Thread Nick Gasson

Hi Andrew,

Sorry for the delay, I've put an updated webrev here:

http://cr.openjdk.java.net/~ngasson/8224851/webrev.1/

Changes since the last patch:

* Replaced ~((1<<12) - 1) with -1u<<12

* Use __atomic_add_fetch in Atomic::PlatformAdd #ifdef __clang__

* Use __builtin_frame_address(0) in os::current_frame()

* Use placement new / copy assignment in pf()

I also replaced the call to _get_previous_fp() in os::current_frame() with

  *(intptr_t **)__builtin_frame_address(0);

As it generates the same code and avoids the `register intptr_t **fp 
__asm__ (SPELL_REG_FP);' declaration which clang doesn't support. Also 
the following comment in _get_previous_fp seems to be wrong:


  // fp is for this frame (_get_previous_fp). We want the fp for the
  // caller of os::current_frame*(), so go up two frames. However, for
  // optimized builds, _get_previous_fp() will be inlined, so only go
  // up 1 frame in that case.
  #ifdef _NMT_NOINLINE_
return **(intptr_t***)fp;
  #else
return *fp;
  #endif

Even on -O0 gcc won't generate a stack frame for this function so if 
_NMT_NOINLINE_ is defined you get the caller's caller's FP rather than 
the caller's FP. I just deleted the function as it's not used anywhere else.


BTW we can't use __builtin_frame_address(1) as GCC gives a warning when 
this is called with a non-zero argument (-Wframe-address).


Tested jtreg hotspot_all_no_apps and jdk_core (gcc 7.3).


Thanks,
Nick


On 03/06/2019 19:03, Andrew Haley wrote:

Hi,

On 6/3/19 11:36 AM, Nick Gasson wrote:



We need to know what it's used for to know which solution is right. I'm
guessing the primary use case is asynchronous runtime stack probes, used
by debugging tools.



Yes, we also have os::stack_shadow_pages_available() which uses it to
calculate how much space there is between the current SP and the end of
the stack. In this case I think it's ok as long as we don't return a
value that's *above* the actual stack pointer. And os::current_frame(),
which constructs a `frame' object using the FP of the caller, but the SP
will point into the frame of os::current_frame(), so it seems it's
already inaccurate.


Eww.


There's also a comment in os_linux.cpp that says "Don't use
os::current_stack_pointer(), as its result can be slightly below current
stack pointer". So I'm wondering if we can simplify this a lot and use
__builtin_frame_address(0) which will give us the FP in
os::current_stack_pointer (ought to be the caller's SP - 16). Zero does
this currently. And maybe we can replace _get_previous_fp() with
__builtin_frame_address(1)?


Let's make os::current_stack_pointer() noinline, then make it return
__builtin_frame_address(0).



Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-03 Thread Kim Barrett
> On Jun 3, 2019, at 4:32 AM, Andrew Haley  wrote:
> 
> On 6/3/19 7:58 AM, Nick Gasson wrote:
>> 
>>> If I'm reading the gcc documentation for __sync_add_and_fetch
>>> correctly, I think the following (completely untested) should work,
>>> and won't cause Andrew to (I think rightly) complain about
>>> type-punning reinterpret_casts.
>>> 
>>> Though I wonder if this is a clang bug. (See below.)
> 
> It's arguably a mistake in the GCC documentation, but adding two
> pointers is not well defined whereas adding a pointer and an integer
> is.

Yes, the gcc documentation seems to be a bit botched here.  But there
is also the semantic description, e.g. for __sync_OP_and_fetch:

{ *ptr OP= value; return *ptr; }

>> Another way round this is to use __atomic_add_fetch instead which both
>> Clang and GCC accept:
>> 
>>  template
>>  D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) 
>> const {
>>D res = __atomic_add_fetch(dest, add_value, __ATOMIC_RELEASE);
>>FULL_MEM_BARRIER;
>>return res;
>>  }
> 
> I could live with that.

The __atomic_xxx operations seem to have the same documentation
issues.  It's strange that clang would get one set right and the other
wrong.  But okay, using them seems like a good workaround. And maybe
someday the order argument will be used to select the barriers to use.



Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-03 Thread Andrew Haley
Hi,

On 6/3/19 11:36 AM, Nick Gasson wrote:
> 
>> We need to know what it's used for to know which solution is right. I'm
>> guessing the primary use case is asynchronous runtime stack probes, used
>> by debugging tools.
>>
> 
> Yes, we also have os::stack_shadow_pages_available() which uses it to 
> calculate how much space there is between the current SP and the end of 
> the stack. In this case I think it's ok as long as we don't return a 
> value that's *above* the actual stack pointer. And os::current_frame(), 
> which constructs a `frame' object using the FP of the caller, but the SP 
> will point into the frame of os::current_frame(), so it seems it's 
> already inaccurate.

Eww.

> There's also a comment in os_linux.cpp that says "Don't use 
> os::current_stack_pointer(), as its result can be slightly below current
> stack pointer". So I'm wondering if we can simplify this a lot and use 
> __builtin_frame_address(0) which will give us the FP in 
> os::current_stack_pointer (ought to be the caller's SP - 16). Zero does 
> this currently. And maybe we can replace _get_previous_fp() with 
> __builtin_frame_address(1)?

Let's make os::current_stack_pointer() noinline, then make it return
__builtin_frame_address(0).

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-03 Thread Nick Gasson

Hi Andrew,



We need to know what it's used for to know which solution is right. I'm
guessing the primary use case is asynchronous runtime stack probes, used
by debugging tools.



Yes, we also have os::stack_shadow_pages_available() which uses it to 
calculate how much space there is between the current SP and the end of 
the stack. In this case I think it's ok as long as we don't return a 
value that's *above* the actual stack pointer. And os::current_frame(), 
which constructs a `frame' object using the FP of the caller, but the SP 
will point into the frame of os::current_frame(), so it seems it's 
already inaccurate.


There's also a comment in os_linux.cpp that says "Don't use 
os::current_stack_pointer(), as its result can be slightly below current
stack pointer". So I'm wondering if we can simplify this a lot and use 
__builtin_frame_address(0) which will give us the FP in 
os::current_stack_pointer (ought to be the caller's SP - 16). Zero does 
this currently. And maybe we can replace _get_previous_fp() with 
__builtin_frame_address(1)?


Thanks,
Nick


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-03 Thread Andrew Haley
On 6/3/19 10:08 AM, Nick Gasson wrote:
> Hi Andrew,
> 
> On 03/06/2019 16:32, Andrew Haley wrote:
>>
>> I don't think there's any need to boycott the whole compiler because
>> of this bug. It would make more sense to try to get clang fixed.
>>
>> Maybe for now put the cast in #ifdef CLANG ?
>>
> 
> I think I'd rather put the __atomic_fetch_add in an #ifdef CLANG? This 
> seems a bit neater than casting the integer argument to a pointer.

I guess the solution which uses __atomic_add_fetch instead is best.

> Did you have any thoughts about os::current_stack_pointer() from my 
> earlier email?

We need to know what it's used for to know which solution is right. I'm
guessing the primary use case is asynchronous runtime stack probes, used
by debugging tools.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-03 Thread Nick Gasson

Hi Andrew,

On 03/06/2019 16:32, Andrew Haley wrote:


I don't think there's any need to boycott the whole compiler because
of this bug. It would make more sense to try to get clang fixed.

Maybe for now put the cast in #ifdef CLANG ?



I think I'd rather put the __atomic_fetch_add in an #ifdef CLANG? This 
seems a bit neater than casting the integer argument to a pointer.


Did you have any thoughts about os::current_stack_pointer() from my 
earlier email?



Thanks,
Nick


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-03 Thread Andrew Haley
On 6/3/19 7:58 AM, Nick Gasson wrote:
> 
>> If I'm reading the gcc documentation for __sync_add_and_fetch
>> correctly, I think the following (completely untested) should work,
>> and won't cause Andrew to (I think rightly) complain about
>> type-punning reinterpret_casts.
>>
>> Though I wonder if this is a clang bug. (See below.)

It's arguably a mistake in the GCC documentation, but adding two
pointers is not well defined whereas adding a pointer and an integer
is.

>> template
>> struct Atomic::PlatformAdd
>>   : Atomic::AddAndFetch >
>> {
>>   template
>>   D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) 
>> const {
>> // Cast to work around clang pointer arithmetic bug.
>> return PrimitiveConversions::cast(__sync_add_and_fetch(dest, 
>> add_value));
>>   }
>> }
> 
> This fails with:
> 
> /home/nicgas01/jdk/src/hotspot/share/metaprogramming/primitiveConversions.hpp:162:10:
>  error: invalid use of incomplete type 'struct 
> PrimitiveConversions::Cast'
>return Cast()(x);
>   ^~~~
> 
> I think because all the specialisations of PrimitiveConversions are only
> enabled if T is an integral type and here it's char*?
> 
>>
>> The key bit from the documentation is "Operations on pointer arguments
>> are performed as if the operands were of the uintptr_t type."
>>
>> The bug description says this warning only arises with clang.  It
>> seems like clang is overdoing that as-if, and treating it literally as
>> uintptr_t all the way through to the result type.
>>
> 
> I think Clang is actually complaining about the argument types
> mismatching here. According to the GCC docs the __sync_add_and_fetch
> prototype looks like:
> 
> type __sync_add_and_fetch (type *ptr, type value, ...)
> 
> I guess Clang infers type=char* from the first argument, and complains
> that the second argument is unsigned long. So Clang gives an error for
> the following but GCC accepts it:
> 
>   char *ptr;
>   unsigned long val;
>   __sync_add_and_fetch(, val);
> 
> Another way round this is to use __atomic_add_fetch instead which both
> Clang and GCC accept:
> 
>   template
>   D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) 
> const {
> D res = __atomic_add_fetch(dest, add_value, __ATOMIC_RELEASE);
> FULL_MEM_BARRIER;
> return res;
>   }

I could live with that.

>> Or don't change the code at all here, and say that clang is not
>> (currently) a supported compiler for this platform.
> 
> Yes, that's an option too :-). But we should change the configure script
> to give an error on --with-toolchain-type=clang on AArch64.

I don't think there's any need to boycott the whole compiler because
of this bug. It would make more sense to try to get clang fixed.

Maybe for now put the cast in #ifdef CLANG ?

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-06-03 Thread Nick Gasson
Hi Kim,

> If I'm reading the gcc documentation for __sync_add_and_fetch
> correctly, I think the following (completely untested) should work,
> and won't cause Andrew to (I think rightly) complain about
> type-punning reinterpret_casts.
>
> Though I wonder if this is a clang bug. (See below.)
>
> template
> struct Atomic::PlatformAdd
>   : Atomic::AddAndFetch >
> {
>   template
>   D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) 
> const {
> // Cast to work around clang pointer arithmetic bug.
> return PrimitiveConversions::cast(__sync_add_and_fetch(dest, 
> add_value));
>   }
> }

This fails with:

/home/nicgas01/jdk/src/hotspot/share/metaprogramming/primitiveConversions.hpp:162:10:
 error: invalid use of incomplete type 'struct 
PrimitiveConversions::Cast'
   return Cast()(x);
  ^~~~

I think because all the specialisations of PrimitiveConversions are only
enabled if T is an integral type and here it's char*?

>
> The key bit from the documentation is "Operations on pointer arguments
> are performed as if the operands were of the uintptr_t type."
>
> The bug description says this warning only arises with clang.  It
> seems like clang is overdoing that as-if, and treating it literally as
> uintptr_t all the way through to the result type.
>

I think Clang is actually complaining about the argument types
mismatching here. According to the GCC docs the __sync_add_and_fetch
prototype looks like:

type __sync_add_and_fetch (type *ptr, type value, ...)

I guess Clang infers type=char* from the first argument, and complains
that the second argument is unsigned long. So Clang gives an error for
the following but GCC accepts it:

  char *ptr;
  unsigned long val;
  __sync_add_and_fetch(, val);

Another way round this is to use __atomic_add_fetch instead which both
Clang and GCC accept:

  template
  D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) 
const {
D res = __atomic_add_fetch(dest, add_value, __ATOMIC_RELEASE);
FULL_MEM_BARRIER;
return res;
  }

> Or don't change the code at all here, and say that clang is not (currently) a 
> supported
> compiler for this platform.

Yes, that's an option too :-). But we should change the configure script
to give an error on --with-toolchain-type=clang on AArch64.

Thanks,
Nick


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-31 Thread Kim Barrett
> On May 31, 2019, at 4:46 PM, Kim Barrett  wrote:
> 
>> On May 31, 2019, at 12:11 AM, Nick Gasson  wrote:
>> 
>> /home/nicgas01/jdk/src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp:259:34:
>>  note: in instantiation of function template specialization 
>> 'Atomic::add' requested here
>> char* touch_addr = Atomic::add(actual_chunk_size, &_cur_addr) - 
>> actual_chunk_size;
>>^
> […]
> If I'm reading the gcc documentation for __sync_add_and_fetch
> correctly, I think the following (completely untested) should work,
> and won't cause Andrew to (I think rightly) complain about
> type-punning reinterpret_casts.
> 
> Though I wonder if this is a clang bug. (See below.)
> 
> […]
> The bug description says this warning only arises with clang.  It
> seems like clang is overdoing that as-if, and treating it literally as
> uintptr_t all the way through to the result type.

Or don’t change the code at all here, and say that clang is not (currently) a 
supported
compiler for this platform.



Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-31 Thread Kim Barrett
> On May 31, 2019, at 12:11 AM, Nick Gasson  wrote:
>>> /home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp:43:39:
>>>  error: cannot initialize a parameter of type 'char *' with an lvalue of 
>>> type 'unsigned long'
>>>return __sync_add_and_fetch(dest, add_value);
>>>  ^
>>> 
>>> If the add_and_fetch template is instantiated with I=integer and
>>> D=pointer then we need an explicit cast here.
>> 
>> Please don't. Please have a look at where this happens and fix the
>> types there, as appropriate. What do other ports do?
> 
> It's instantiated here:
> 
> /home/nicgas01/jdk/src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp:259:34:
>  note: in instantiation of function template specialization 
> 'Atomic::add' requested here
>  char* touch_addr = Atomic::add(actual_chunk_size, &_cur_addr) - 
> actual_chunk_size;
> ^
> 
> We're adding an unsigned long to a char* which seems legitimate.
> 
> The other ports except Arm32 and Zero use inline assembly so don't have
> problems with mismatched types. Zero uses __sync_fetch_and_add so would
> hit this too if it used G1.
> 
> I'm not sure how to resolve this. We could add a specialisation of
> PlatformAdd for byte_size=8 and cast to (u)intptr_t?
> 
> template<>
> struct Atomic::PlatformAdd<8>
>  : Atomic::AddAndFetch >
> {
>  template
>  D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) 
> const {
>STATIC_ASSERT(8 == sizeof(I));
>STATIC_ASSERT(8 == sizeof(D));
> 
>typedef typename Conditional::value, intptr_t, 
> uintptr_t>::type PI;
> 
>PI res = __sync_add_and_fetch(reinterpret_cast(dest),
>  static_cast(add_value));
>return reinterpret_cast(res);
>  }
> }
> 
> I think this is safe.

If I'm reading the gcc documentation for __sync_add_and_fetch
correctly, I think the following (completely untested) should work,
and won't cause Andrew to (I think rightly) complain about
type-punning reinterpret_casts.

Though I wonder if this is a clang bug. (See below.)

template
struct Atomic::PlatformAdd
  : Atomic::AddAndFetch >
{
  template
  D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) 
const {
// Cast to work around clang pointer arithmetic bug.
return PrimitiveConversions::cast(__sync_add_and_fetch(dest, add_value));
  }
}

The key bit from the documentation is "Operations on pointer arguments
are performed as if the operands were of the uintptr_t type."

The bug description says this warning only arises with clang.  It
seems like clang is overdoing that as-if, and treating it literally as
uintptr_t all the way through to the result type.

The documentation also says "That is, they are not scaled by the size
of the type to which the pointer points."  Any needed scaling of
add_value has already been done in the calling Atomic::AddImpl.




Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-30 Thread Nick Gasson

Hi Andrew,



Oh dear. Experience has shown me (and probably you, too) that this is
one of the most error-prone parts of software development. Many
very serious failures have been caused by trying to shut up
compilers. These failures have even included major security breaches.



Fair enough. But I think several of these are worth fixing.




/home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp:43:39:
 error: cannot initialize a parameter of type 'char *' with an lvalue of type 
'unsigned long'
return __sync_add_and_fetch(dest, add_value);
  ^

If the add_and_fetch template is instantiated with I=integer and
D=pointer then we need an explicit cast here.


Please don't. Please have a look at where this happens and fix the
types there, as appropriate. What do other ports do?


It's instantiated here:

/home/nicgas01/jdk/src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp:259:34: 
note: in instantiation of function template specialization 
'Atomic::add' requested here
  char* touch_addr = Atomic::add(actual_chunk_size, &_cur_addr) - 
actual_chunk_size;

 ^

We're adding an unsigned long to a char* which seems legitimate.

The other ports except Arm32 and Zero use inline assembly so don't have
problems with mismatched types. Zero uses __sync_fetch_and_add so would
hit this too if it used G1.

I'm not sure how to resolve this. We could add a specialisation of
PlatformAdd for byte_size=8 and cast to (u)intptr_t?

template<>
struct Atomic::PlatformAdd<8>
  : Atomic::AddAndFetch >
{
  template
  D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order 
order) const {

STATIC_ASSERT(8 == sizeof(I));
STATIC_ASSERT(8 == sizeof(D));

typedef typename Conditional::value, intptr_t, 
uintptr_t>::type PI;


PI res = __sync_add_and_fetch(reinterpret_cast(dest),
  static_cast(add_value));
return reinterpret_cast(res);
  }
}

I think this is safe.




Rewrote as ~((1<<12) - 1) which matches the masks used in the rest of
the function.


Let's not. All we have to do to create a mask is is use -1u<<12. I
would challenge anyone to figure out without using a pencil what the
effect of  &= ~(((1<<12)-1)<<12)  might be.


OK, I don't mind too much either way.




/home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp:103:20:
 error: variable 'esp' is uninitialized when used here [-Werror,-Wuninitialized]
  return (address) esp;
   ^~~
/home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp:101:21:
 note: initialize the variable 'esp' to silence this warning
  register void *esp __asm__ (SPELL_REG_SP);
^
 = nullptr

We could silence the -Wuninitialized warning for this function, but that
doesn't help much because Clang then compiles
os::current_stack_pointer() into:

 <_ZN2os21current_stack_pointerEv>:
   0:   d65f03c0ret

Whereas with GCC we get:

0250 <_ZN2os21current_stack_pointerEv>:
 250:   910003e0mov x0, sp
 254:   d65f03c0ret

Added some inline assembly to explicitly move sp into the result #ifdef
__clang__. Same with _get_previous_fp().


This is all looking extremely fragile. We'd need to look at what the
usages of this code are and trace through.


Yes, I agree this function seems quite fragile. What is the "current"
stack pointer supposed to be? Is it the SP at the call site? That's what
you get with current code on GCC -00 and -02, and with this patch on
clang -O2. But clang -O0 gives SP - 16 because it creates a stack frame
for os::current_stack_pointer().

On x86_64 with gcc -00 and clang -00 we get this:

 <_ZN2os21current_stack_pointerEv>:
0:   55  push   %rbp
1:   48 89 e5mov%rsp,%rbp
4:   48 89 e0mov%rsp,%rax
7:   5d  pop%rbp
8:   c3  retq

The result is RSP before the CALL instruction - 16. On -02 we get
caller's RSP - 8 instead:

 <_ZN2os21current_stack_pointerEv>:
   0:48 89 e0mov%rsp,%rax
   3:c3  retq

I'm also not sure our existing use of `__asm__ (SPELL_REG_SP)' is
correct. The GCC manual [1] says "The only supported use for this
feature is to specify registers for input and output operands when
calling Extended asm" which we are not doing here.

Excluding assertions and tests, the usages are in os::current_frame()
and as the input to os::stack_shadow_pages_available(). In the first
case the SP seems to already be inaccurate as it points into the frame
of os::current_frame() but the FP used is for the caller. In the second
case I think there is only a problem if the value returned by
os::current_stack_pointer() is *above* the actual SP, as then 

Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-30 Thread Nick Gasson
Hi Kim,

Kim Barrett writes:
>
> I think the solution here is to use placement new, but explicitly qualified 
> to actually get there, e.g.
>
> ::new (reg_map) RegisterMap(map);
>
> Using unqualified new will do the lookup in RegisterMap, find some 
> declarations in StackObj (but not the one we want) and fail.
> This is a pretty common pattern.
>

This works, thanks.


Nick


Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-29 Thread Kim Barrett
Not a review, just briefly jumping in because I noticed one thing to correct.

> On May 29, 2019, at 9:12 AM, Andrew Haley  wrote:
> 
> On 5/29/19 11:26 AM, Nick Gasson wrote:
> 
> Oh dear. Experience has shown me (and probably you, too) that this is
> one of the most error-prone parts of software development. Many
> very serious failures have been caused by trying to shut up
> compilers. These failures have even included major security breaches.
> 
> With that, let's press on with a heavy heart…

+1 on all that.

>> This first one is with GCC 8.3:
>> 
>> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp: In function 
>> 'void pf(long unsigned int, long unsigned int, long unsigned int, long 
>> unsigned int, long unsigned int)':
>> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp:775:35: error: 
>> 'void* memcpy(void*, const void*, size_t)' writing to an object of type 
>> 'class RegisterMap' with no trivial copy-assignment; use copy-assignment or 
>> copy-initialization instead [-Werror=class-memaccess]
>>   memcpy(reg_map, , sizeof map);
>>   ^
>> 
>> RegisterMap is non-POD because its base class AllocatedObj has virtual
>> methods, so we need to use copy-assignment rather than memcpy. But
>> because we're allocating reg_map with os::malloc we ought to use
>> placement new to properly construct it first. But we can't do that
>> because RegisterMap inherits from StackObj which disallows new. So I
>> just put in some casts to void * to silence the warning.
> 
> We can't do that. It seems to me that this must be Undefined
> Behaviour, and we must never attempt to cover up UB with casts.

I think the solution here is to use placement new, but explicitly qualified to 
actually get there, e.g.

::new (reg_map) RegisterMap(map);

Using unqualified new will do the lookup in RegisterMap, find some declarations 
in StackObj (but not the one we want) and fail.
This is a pretty common pattern.

>> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp:2684:17:
>>  error: shifting a negative signed value is undefined 
>> [-Werror,-Wshift-negative-value]
>>offset &= -1<<12;
>>  ~~^
> 
> Oh FFS. :-) This is perfectly legal in GCC and probably clang as well.

Yeah, I sometimes think we should disable -Wshift-negative-value.




RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-29 Thread Andrew Haley
On 5/29/19 11:26 AM, Nick Gasson wrote:

> Please review this set of fixes to building hotspot with GCC 8.3 or 
> Clang 7.0 on AArch64.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8224851
> Webrev: http://cr.openjdk.java.net/~ngasson/8224851/webrev.0/
> 
> Ran jtreg with no new failures.

Oh dear. Experience has shown me (and probably you, too) that this is
one of the most error-prone parts of software development. Many
very serious failures have been caused by trying to shut up
compilers. These failures have even included major security breaches.

With that, let's press on with a heavy heart...

> 
> This first one is with GCC 8.3:
> 
> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp: In function 
> 'void pf(long unsigned int, long unsigned int, long unsigned int, long 
> unsigned int, long unsigned int)':
> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp:775:35: error: 
> 'void* memcpy(void*, const void*, size_t)' writing to an object of type 
> 'class RegisterMap' with no trivial copy-assignment; use copy-assignment or 
> copy-initialization instead [-Werror=class-memaccess]
>memcpy(reg_map, , sizeof map);
>^
> 
> RegisterMap is non-POD because its base class AllocatedObj has virtual
> methods, so we need to use copy-assignment rather than memcpy. But
> because we're allocating reg_map with os::malloc we ought to use
> placement new to properly construct it first. But we can't do that
> because RegisterMap inherits from StackObj which disallows new. So I
> just put in some casts to void * to silence the warning.

We can't do that. It seems to me that this must be Undefined
Behaviour, and we must never attempt to cover up UB with casts.

> I can't see where `pf' and `internal_pf' are used - perhaps they can be
> called manually from the debugger?

They're debugger commands. That's why they have external C linkage.

> If no one uses them maybe they should be deleted?

Perhaps so. We have the debugger command pfl() which does all the
frame printing I find I ever need.

> The remaining warnings / errors are with Clang 7.0.
> 
> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/assembler_aarch64.hpp:279:22: 
> error: & has lower precedence than ==; == will be evaluated first 
> [-Werror,-Wparentheses]
> assert_cond(bits & mask == mask);
>  ^~
> 
> To preserve the behaviour we should write `bits & (mask == mask)' but I
> don't think this was what was intended so I changed it to
> `(bits & mask) == mask'.

That's right. It's supposed to check that the field being read from an
instruction has already been set.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/interp_masm_aarch64.hpp:44:25: 
> error: redeclaration of using declaration
>   using MacroAssembler::call_VM_leaf_base;
> ^
> 
> This using declaration appears twice in a row.

OK.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/aarch64.ad:13866:80: error: 
> invalid suffix 'D' on floating constant
> __ fcmps(as_FloatRegister(opnd_array(1)->reg(ra_,this,idx1)/* src1 */), 
> 0.0D);
>   
>  ^
> 
> The "d" or "D" suffix on floating point literals is not in C++98. I
> believe it comes from an extension to C to support decimal floating
> point? Anyway it's not necessary here.

OK.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/c1_LIRGenerator_aarch64.cpp:429:66:
>  error: implicit conversion of NULL constant to 'bool' 
> [-Werror,-Wnull-conversion]
>   arithmetic_op_fpu(x->op(), reg, left.result(), right.result(), NULL);
>   ~  ^~~~
>  false
>
> The NULL here is for the `bool is_strictfp' argument to
> LIRGenerator::arithmetic_op_fpu. Changing it to `false' would preserve
> the existing behaviour but it seems like it should be `x->is_strictfp()'
> to match other platforms.

OK, use x->is_strictfp(). I don't think that it can possibly make any
difference on AArch64.

> As a consequence of this we need to handle the lir_{div,mul}_scriptfp
> opcodes in LIR_Assembler::arith_op, but these just fall through to the
> non-strictfp variants so there should be no behaviour change.

Yep.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp:1074:24:
>  error: '&&' within '||' [-Werror,-Wlogical-op-parentheses]
>   if (is_unordered && op->cond() == lir_cond_equal
>   ~^~~
> 
> Added parenthesis to make this explicit. No behaviour change.

OK. I don't quite get what this is for, but I guess it shuts the
compiler up and is harmless.

> /home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/copy_linux_aarch64.s:162:32:
>  error: index must be a multiple of 8 in range [0, 32760].
> prfmpldl1keep, [s, #-256]
>^
> 
> The immediate offset in `prfm' 

Re: RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-29 Thread Andrew Haley
On 5/29/19 11:26 AM, Nick Gasson wrote:

> Please review this set of fixes to building hotspot with GCC 8.3 or 
> Clang 7.0 on AArch64.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8224851
> Webrev: http://cr.openjdk.java.net/~ngasson/8224851/webrev.0/
> 
> Ran jtreg with no new failures.

Oh dear. Experience has shown me (and probably you, too) that this is
one of the most error-prone parts of software development. Many
very serious failures have been recorded when trying to shut up
compilers. These failures have even included major security breaches.

With that, let's press on with a heavy heart...

> 
> This first one is with GCC 8.3:
> 
> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp: In function 
> 'void pf(long unsigned int, long unsigned int, long unsigned int, long 
> unsigned int, long unsigned int)':
> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp:775:35: error: 
> 'void* memcpy(void*, const void*, size_t)' writing to an object of type 
> 'class RegisterMap' with no trivial copy-assignment; use copy-assignment or 
> copy-initialization instead [-Werror=class-memaccess]
>memcpy(reg_map, , sizeof map);
>^
> 
> RegisterMap is non-POD because its base class AllocatedObj has virtual
> methods, so we need to use copy-assignment rather than memcpy. But
> because we're allocating reg_map with os::malloc we ought to use
> placement new to properly construct it first. But we can't do that
> because RegisterMap inherits from StackObj which disallows new. So I
> just put in some casts to void * to silence the warning.

We can't do that. It seems to me that this must be Undefined
Behaviour, and we must never attempt to cover up UB with casts.

> I can't see where `pf' and `internal_pf' are used - perhaps they can be
> called manually from the debugger?

They're debugger commands. That's why they have external C linkage.

> If no one uses them maybe they should be deleted?

Perhaps so. We have the debugger command pfl() which does all the
frame printing I find I ever need.

> The remaining warnings / errors are with Clang 7.0.
> 
> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/assembler_aarch64.hpp:279:22: 
> error: & has lower precedence than ==; == will be evaluated first 
> [-Werror,-Wparentheses]
> assert_cond(bits & mask == mask);
>  ^~
> 
> To preserve the behaviour we should write `bits & (mask == mask)' but I
> don't think this was what was intended so I changed it to
> `(bits & mask) == mask'.

That's right. It's supposed to check that the field being read from an
instruction has already been set.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/interp_masm_aarch64.hpp:44:25: 
> error: redeclaration of using declaration
>   using MacroAssembler::call_VM_leaf_base;
> ^
> 
> This using declaration appears twice in a row.

OK.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/aarch64.ad:13866:80: error: 
> invalid suffix 'D' on floating constant
> __ fcmps(as_FloatRegister(opnd_array(1)->reg(ra_,this,idx1)/* src1 */), 
> 0.0D);
>   
>  ^
> 
> The "d" or "D" suffix on floating point literals is not in C++98. I
> believe it comes from an extension to C to support decimal floating
> point? Anyway it's not necessary here.

OK.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/c1_LIRGenerator_aarch64.cpp:429:66:
>  error: implicit conversion of NULL constant to 'bool' 
> [-Werror,-Wnull-conversion]
>   arithmetic_op_fpu(x->op(), reg, left.result(), right.result(), NULL);
>   ~  ^~~~
>  false
>
> The NULL here is for the `bool is_strictfp' argument to
> LIRGenerator::arithmetic_op_fpu. Changing it to `false' would preserve
> the existing behaviour but it seems like it should be `x->is_strictfp()'
> to match other platforms.

OK, use x->is_strictfp(). I don't think that it can possibly make any
difference on AArch64.

> As a consequence of this we need to handle the lir_{div,mul}_scriptfp
> opcodes in LIR_Assembler::arith_op, but these just fall through to the
> non-strictfp variants so there should be no behaviour change.

Yep.

> /home/nicgas01/jdk/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp:1074:24:
>  error: '&&' within '||' [-Werror,-Wlogical-op-parentheses]
>   if (is_unordered && op->cond() == lir_cond_equal
>   ~^~~
> 
> Added parenthesis to make this explicit. No behaviour change.

OK. I don't quite get what this is for, but I guess it shuts the
compiler up.

> /home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/copy_linux_aarch64.s:162:32:
>  error: index must be a multiple of 8 in range [0, 32760].
> prfmpldl1keep, [s, #-256]
>^
> 
> The immediate offset in `prfm' gets 

RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

2019-05-29 Thread Nick Gasson
Hi,

Please review this set of fixes to building hotspot with GCC 8.3 or 
Clang 7.0 on AArch64.

Bug: https://bugs.openjdk.java.net/browse/JDK-8224851
Webrev: http://cr.openjdk.java.net/~ngasson/8224851/webrev.0/

Ran jtreg with no new failures.

This first one is with GCC 8.3:

/home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp: In function 'void 
pf(long unsigned int, long unsigned int, long unsigned int, long unsigned int, 
long unsigned int)':
/home/nicgas01/jdk/src/hotspot/cpu/aarch64/frame_aarch64.cpp:775:35: error: 
'void* memcpy(void*, const void*, size_t)' writing to an object of type 'class 
RegisterMap' with no trivial copy-assignment; use copy-assignment or 
copy-initialization instead [-Werror=class-memaccess]
   memcpy(reg_map, , sizeof map);
   ^

RegisterMap is non-POD because its base class AllocatedObj has virtual
methods, so we need to use copy-assignment rather than memcpy. But
because we're allocating reg_map with os::malloc we ought to use
placement new to properly construct it first. But we can't do that
because RegisterMap inherits from StackObj which disallows new. So I
just put in some casts to void * to silence the warning.

I can't see where `pf' and `internal_pf' are used - perhaps they can be
called manually from the debugger? If no one uses them maybe they should
be deleted?

The remaining warnings / errors are with Clang 7.0.

/home/nicgas01/jdk/src/hotspot/cpu/aarch64/assembler_aarch64.hpp:279:22: error: 
& has lower precedence than ==; == will be evaluated first 
[-Werror,-Wparentheses]
assert_cond(bits & mask == mask);
 ^~

To preserve the behaviour we should write `bits & (mask == mask)' but I
don't think this was what was intended so I changed it to
`(bits & mask) == mask'.

/home/nicgas01/jdk/src/hotspot/cpu/aarch64/interp_masm_aarch64.hpp:44:25: 
error: redeclaration of using declaration
  using MacroAssembler::call_VM_leaf_base;
^

This using declaration appears twice in a row.

/home/nicgas01/jdk/src/hotspot/cpu/aarch64/aarch64.ad:13866:80: error: invalid 
suffix 'D' on floating constant
__ fcmps(as_FloatRegister(opnd_array(1)->reg(ra_,this,idx1)/* src1 */), 
0.0D);
   ^

The "d" or "D" suffix on floating point literals is not in C++98. I
believe it comes from an extension to C to support decimal floating
point? Anyway it's not necessary here.

/home/nicgas01/jdk/src/hotspot/cpu/aarch64/c1_LIRGenerator_aarch64.cpp:429:66: 
error: implicit conversion of NULL constant to 'bool' 
[-Werror,-Wnull-conversion]
  arithmetic_op_fpu(x->op(), reg, left.result(), right.result(), NULL);
  ~  ^~~~
 false

The NULL here is for the `bool is_strictfp' argument to
LIRGenerator::arithmetic_op_fpu. Changing it to `false' would preserve
the existing behaviour but it seems like it should be `x->is_strictfp()'
to match other platforms.

As a consequence of this we need to handle the lir_{div,mul}_scriptfp
opcodes in LIR_Assembler::arith_op, but these just fall through to the
non-strictfp variants so there should be no behaviour change.

/home/nicgas01/jdk/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp:1074:24: 
error: '&&' within '||' [-Werror,-Wlogical-op-parentheses]
  if (is_unordered && op->cond() == lir_cond_equal
  ~^~~

Added parenthesis to make this explicit. No behaviour change.

/home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/copy_linux_aarch64.s:162:32:
 error: index must be a multiple of 8 in range [0, 32760].
prfmpldl1keep, [s, #-256]
   ^

The immediate offset in `prfm' gets zero-extended to 64-bits and then
left shifted three places so can only be unsigned. GNU as actually
assembles [s, #-256] the same as [s]:

   0:   f980prfmpldl1keep, [x0]

Seems like GNU as ought to report an error here instead. I've replaced
this with `prfum' which has a sign-extended offset.

/home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp:43:39:
 error: cannot initialize a parameter of type 'char *' with an lvalue of type 
'unsigned long'
return __sync_add_and_fetch(dest, add_value);
  ^

If the add_and_fetch template is instantiated with I=integer and
D=pointer then we need an explicit cast here.

/home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp:679:8: 
error: conflicting types for '_Copy_conjoint_jshorts_atomic'
  void _Copy_conjoint_jshorts_atomic(jshort* from, jshort* to, size_t count) {
   ^

This family of functions is declared with const `from' pointer but in
the definition it is non-const. Made it const everywhere.