Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-09 Thread Stefano Stabellini
On Wed, 9 Jan 2019, Jan Beulich wrote:
> >>> On 08.01.19 at 19:08,  wrote:
> > On Tue, 8 Jan 2019, Stefano Stabellini wrote:
> >> So, this is what I am going to do: I'll send a series update according
> >> to your suggestion, with SYMBOL returning the native pointer type. As I
> >> wrote earlier, although weaker, it is still an improvement from my point
> >> of view.
> > 
> > There is a problem with this though I didn't foresee :-(
> > 
> > The native type of _start is not char* -- it is char[]. So I cannot
> > actually return the native type from SYMBOL because I cannot cast to
> > (char[]). I didn't notice it until I actually tried it.
> > 
> > See the implementation of RELOC_HIDE:
> > 
> >   #define RELOC_HIDE(ptr, off)\
> > ({ unsigned long __ptr;   \
> >   __asm__ ("" : "=r"(__ptr) : "0"(ptr));  \
> >   (typeof(ptr)) (__ptr + (off)); })
> > 
> > It casts to the type at the end, the error is:
> > 
> >   error: cast specifies array type
> >(typeof(ptr)) (__ptr + (off)); })
> > 
> > We have a few options:
> > 
> > 1) use unsigned long as in this version of the series (the Linux kernel
> > also uses this technique)
> > Sorry if I insist, it is still the best I think :-)
> > 
> > 2) casts the parameters of SYMBOL to the corresponding pointer type
> > For instance:
> >   SYMBOL((char *)_start)
> >   SYMBOL((struct alt_instr *)__alt_instructions_end)
> > This works, but it is ugly, I would say it makes the code worse than
> > option 1)
> > 
> > 2) always return void* from SYMBOL
> > I don't think it is a good idea to use void* as a workaround here
> > 
> > 3) pass the desired return type to SYMBOL
> > For instance:
> >   SYMBOL(_start, char *)
> >   SYMBOL(__alt_instructions_end, struct alt_instr *)
> > Then SYMBOL would automatically cast the return type to char * and
> > struct alt_instr * according to the second parameter.
> > 
> > Do you have any other suggestions?
> 
> 4) 
> 
> #define RELOC_HIDE(ptr, off)\
> ({ unsigned long ptr_;   \
>   __asm__ ("" : "=r"(ptr_) : "0" (ptr));  \
>   (typeof(*(ptr)) *) (ptr_ + (off)); })
> 
> or, if not suitable for RELOC_HIDE() itself, clone the macro into one
> that fits SYMBOL()'s needs.

OK. I still don't think that this is a good idea. Nonetheless, I have
just sent a patch series that uses this trick in the implementation of
SYMBOL to return the native type.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-09 Thread Jan Beulich
>>> On 08.01.19 at 19:08,  wrote:
> On Tue, 8 Jan 2019, Stefano Stabellini wrote:
>> So, this is what I am going to do: I'll send a series update according
>> to your suggestion, with SYMBOL returning the native pointer type. As I
>> wrote earlier, although weaker, it is still an improvement from my point
>> of view.
> 
> There is a problem with this though I didn't foresee :-(
> 
> The native type of _start is not char* -- it is char[]. So I cannot
> actually return the native type from SYMBOL because I cannot cast to
> (char[]). I didn't notice it until I actually tried it.
> 
> See the implementation of RELOC_HIDE:
> 
>   #define RELOC_HIDE(ptr, off)\
> ({ unsigned long __ptr;   \
>   __asm__ ("" : "=r"(__ptr) : "0"(ptr));  \
>   (typeof(ptr)) (__ptr + (off)); })
> 
> It casts to the type at the end, the error is:
> 
>   error: cast specifies array type
>(typeof(ptr)) (__ptr + (off)); })
> 
> We have a few options:
> 
> 1) use unsigned long as in this version of the series (the Linux kernel
> also uses this technique)
> Sorry if I insist, it is still the best I think :-)
> 
> 2) casts the parameters of SYMBOL to the corresponding pointer type
> For instance:
>   SYMBOL((char *)_start)
>   SYMBOL((struct alt_instr *)__alt_instructions_end)
> This works, but it is ugly, I would say it makes the code worse than
> option 1)
> 
> 2) always return void* from SYMBOL
> I don't think it is a good idea to use void* as a workaround here
> 
> 3) pass the desired return type to SYMBOL
> For instance:
>   SYMBOL(_start, char *)
>   SYMBOL(__alt_instructions_end, struct alt_instr *)
> Then SYMBOL would automatically cast the return type to char * and
> struct alt_instr * according to the second parameter.
> 
> Do you have any other suggestions?

4) 

#define RELOC_HIDE(ptr, off)\
({ unsigned long ptr_;   \
  __asm__ ("" : "=r"(ptr_) : "0" (ptr));  \
  (typeof(*(ptr)) *) (ptr_ + (off)); })

or, if not suitable for RELOC_HIDE() itself, clone the macro into one
that fits SYMBOL()'s needs.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-08 Thread Julien Grall
Hi,

Sorry for the formatting.

On Tue, 8 Jan 2019, 13:09 Stefano Stabellini, 
wrote:

> On Tue, 8 Jan 2019, Stefano Stabellini wrote:
> > On Tue, 8 Jan 2019, Jan Beulich wrote:
> > > >>> On 07.01.19 at 19:29,  wrote:
> > > > On Mon, 7 Jan 2019, Jan Beulich wrote:
> > > >> >>> On 04.01.19 at 18:05,  wrote:
> > > >> > I realize that you are not convinced by these arguments, but
> let's find
> > > >> > a way forward. My preference would be to have SYMBOL returning
> unsigned
> > > >> > long and do unsigned long comparisons when pointers pointing to
> > > >> > different objects are involved.
> > > >>
> > > >> I continue to fail to see how suitable hiding of the connection to
> the
> > > >> original symbol from the compiler makes code less standard compliant
> > > >> when comparing pointers: The compiler simply can't know whether
> > > >> the underlying object ills (in the extreme case) an array spanning
> the
> > > >> entire address space.
> > > >
> > > > That is because the requirement I am trying to address is MISRA-C
> > > > compliance, which in turns requires C language compliance for C
> language
> > > > (I think it allows mixing C with assembly code). I don't particularly
> > > > care whether the compiler can or cannot find the connection to the
> > > > original symbol.
> > > >
> > > > The important thing for me is to avoid comparisons (and subtractions)
> > > > between pointers pointing to different objects. If we compare
> unsigned
> > > > longs, it is easier to prove that the comparison is not between
> pointers
> > > > pointing to different objects, even if somehow the numeric values
> > > > indirectly come from pointers. If we compare pointers, even if they
> went
> > > > through some sort of assembly conversions, we are still comparing
> > > > pointers pointing to different objects. The compiler might not be
> able
> > > > to figure it out, but a MISRA-C compliance scanning tool, or a human,
> > > > might.
> > >
> > > This is absurd: We are similarly still comparing pointers to different
> > > objects when comparing their values casted to unsigned long. The
> > > cast is as much of a hiding technique as any other one. If you want
> > > to be C language compliant without any compromises, you'll have to
> > > do away with all *_end symbols.
> >
> > Basically, this is a matter of interpretation of the spec: it seems to
> > me that coming back from asm-land with pointers and comparing pointers
> > would be a worse offense than a (almost) harmless unsigned long
> > comparison of values returned from asm-land.
> >
> > But I am not particularly knowledgeable about MISRA-C compliance and
> > their interpretation of the rules.
> >
> > So, this is what I am going to do: I'll send a series update according
> > to your suggestion, with SYMBOL returning the native pointer type. As I
> > wrote earlier, although weaker, it is still an improvement from my point
> > of view.
>
> There is a problem with this though I didn't foresee :-(
>
> The native type of _start is not char* -- it is char[]. So I cannot
> actually return the native type from SYMBOL because I cannot cast to
> (char[]). I didn't notice it until I actually tried it.
>
> See the implementation of RELOC_HIDE:
>
>   #define RELOC_HIDE(ptr, off)\
> ({ unsigned long __ptr;   \
>   __asm__ ("" : "=r"(__ptr) : "0"(ptr));  \
>   (typeof(ptr)) (__ptr + (off)); })
>
> It casts to the type at the end, the error is:
>
>   error: cast specifies array type
>(typeof(ptr)) (__ptr + (off)); })
>
> We have a few options:
>
> 1) use unsigned long as in this version of the series (the Linux kernel
> also uses this technique)
> Sorry if I insist, it is still the best I think :-)
>
> 2) casts the parameters of SYMBOL to the corresponding pointer type
> For instance:
>   SYMBOL((char *)_start)
>   SYMBOL((struct alt_instr *)__alt_instructions_end)
> This works, but it is ugly, I would say it makes the code worse than
> option 1)
>
> 2) always return void* from SYMBOL
> I don't think it is a good idea to use void* as a workaround here
>
> 3) pass the desired return type to SYMBOL
> For instance:
>   SYMBOL(_start, char *)
>   SYMBOL(__alt_instructions_end, struct alt_instr *)
> Then SYMBOL would automatically cast the return type to char * and
> struct alt_instr * according to the second parameter.
>
> Do you have any other suggestions?
>

Reading [1], I think casting back to the initial type is pointless and not
going to help the static analyzer or compiler. After all, you still
compare/substract 2 pointers...

So, I think the only solution is 1).

Cheers,

[1]
https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1


> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-08 Thread Stefano Stabellini
On Tue, 8 Jan 2019, Stefano Stabellini wrote:
> On Tue, 8 Jan 2019, Jan Beulich wrote:
> > >>> On 07.01.19 at 19:29,  wrote:
> > > On Mon, 7 Jan 2019, Jan Beulich wrote:
> > >> >>> On 04.01.19 at 18:05,  wrote:
> > >> > I realize that you are not convinced by these arguments, but let's find
> > >> > a way forward. My preference would be to have SYMBOL returning unsigned
> > >> > long and do unsigned long comparisons when pointers pointing to
> > >> > different objects are involved.
> > >> 
> > >> I continue to fail to see how suitable hiding of the connection to the
> > >> original symbol from the compiler makes code less standard compliant
> > >> when comparing pointers: The compiler simply can't know whether
> > >> the underlying object is (in the extreme case) an array spanning the
> > >> entire address space.
> > > 
> > > That is because the requirement I am trying to address is MISRA-C
> > > compliance, which in turns requires C language compliance for C language
> > > (I think it allows mixing C with assembly code). I don't particularly
> > > care whether the compiler can or cannot find the connection to the
> > > original symbol. 
> > > 
> > > The important thing for me is to avoid comparisons (and subtractions)
> > > between pointers pointing to different objects. If we compare unsigned
> > > longs, it is easier to prove that the comparison is not between pointers
> > > pointing to different objects, even if somehow the numeric values
> > > indirectly come from pointers. If we compare pointers, even if they went
> > > through some sort of assembly conversions, we are still comparing
> > > pointers pointing to different objects. The compiler might not be able
> > > to figure it out, but a MISRA-C compliance scanning tool, or a human,
> > > might.
> > 
> > This is absurd: We are similarly still comparing pointers to different
> > objects when comparing their values casted to unsigned long. The
> > cast is as much of a hiding technique as any other one. If you want
> > to be C language compliant without any compromises, you'll have to
> > do away with all *_end symbols.
> 
> Basically, this is a matter of interpretation of the spec: it seems to
> me that coming back from asm-land with pointers and comparing pointers
> would be a worse offense than a (almost) harmless unsigned long
> comparison of values returned from asm-land.
> 
> But I am not particularly knowledgeable about MISRA-C compliance and
> their interpretation of the rules.
> 
> So, this is what I am going to do: I'll send a series update according
> to your suggestion, with SYMBOL returning the native pointer type. As I
> wrote earlier, although weaker, it is still an improvement from my point
> of view.

There is a problem with this though I didn't foresee :-(

The native type of _start is not char* -- it is char[]. So I cannot
actually return the native type from SYMBOL because I cannot cast to
(char[]). I didn't notice it until I actually tried it.

See the implementation of RELOC_HIDE:

  #define RELOC_HIDE(ptr, off)\
({ unsigned long __ptr;   \
  __asm__ ("" : "=r"(__ptr) : "0"(ptr));  \
  (typeof(ptr)) (__ptr + (off)); })

It casts to the type at the end, the error is:

  error: cast specifies array type
   (typeof(ptr)) (__ptr + (off)); })

We have a few options:

1) use unsigned long as in this version of the series (the Linux kernel
also uses this technique)
Sorry if I insist, it is still the best I think :-)

2) casts the parameters of SYMBOL to the corresponding pointer type
For instance:
  SYMBOL((char *)_start)
  SYMBOL((struct alt_instr *)__alt_instructions_end)
This works, but it is ugly, I would say it makes the code worse than
option 1)

2) always return void* from SYMBOL
I don't think it is a good idea to use void* as a workaround here

3) pass the desired return type to SYMBOL
For instance:
  SYMBOL(_start, char *)
  SYMBOL(__alt_instructions_end, struct alt_instr *)
Then SYMBOL would automatically cast the return type to char * and
struct alt_instr * according to the second parameter.

Do you have any other suggestions?

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-08 Thread Stefano Stabellini
On Tue, 8 Jan 2019, Jan Beulich wrote:
> >>> On 07.01.19 at 19:29,  wrote:
> > On Mon, 7 Jan 2019, Jan Beulich wrote:
> >> >>> On 04.01.19 at 18:05,  wrote:
> >> > I realize that you are not convinced by these arguments, but let's find
> >> > a way forward. My preference would be to have SYMBOL returning unsigned
> >> > long and do unsigned long comparisons when pointers pointing to
> >> > different objects are involved.
> >> 
> >> I continue to fail to see how suitable hiding of the connection to the
> >> original symbol from the compiler makes code less standard compliant
> >> when comparing pointers: The compiler simply can't know whether
> >> the underlying object is (in the extreme case) an array spanning the
> >> entire address space.
> > 
> > That is because the requirement I am trying to address is MISRA-C
> > compliance, which in turns requires C language compliance for C language
> > (I think it allows mixing C with assembly code). I don't particularly
> > care whether the compiler can or cannot find the connection to the
> > original symbol. 
> > 
> > The important thing for me is to avoid comparisons (and subtractions)
> > between pointers pointing to different objects. If we compare unsigned
> > longs, it is easier to prove that the comparison is not between pointers
> > pointing to different objects, even if somehow the numeric values
> > indirectly come from pointers. If we compare pointers, even if they went
> > through some sort of assembly conversions, we are still comparing
> > pointers pointing to different objects. The compiler might not be able
> > to figure it out, but a MISRA-C compliance scanning tool, or a human,
> > might.
> 
> This is absurd: We are similarly still comparing pointers to different
> objects when comparing their values casted to unsigned long. The
> cast is as much of a hiding technique as any other one. If you want
> to be C language compliant without any compromises, you'll have to
> do away with all *_end symbols.

Basically, this is a matter of interpretation of the spec: it seems to
me that coming back from asm-land with pointers and comparing pointers
would be a worse offense than a (almost) harmless unsigned long
comparison of values returned from asm-land.

But I am not particularly knowledgeable about MISRA-C compliance and
their interpretation of the rules.

So, this is what I am going to do: I'll send a series update according
to your suggestion, with SYMBOL returning the native pointer type. As I
wrote earlier, although weaker, it is still an improvement from my point
of view.

In the future, we can revisit this decision if necessary, and at the
very least this series will help with easily spotting all the
troublesome sites, clearly marked by the SYMBOL macro. If we do have to
revisit it, your suggestion below is also something to keep in mind.


> You may recall that I did propose
> a patch doing so (for an entirely different reason), using the tool
> chain's .sizeof.() operator (and then for consistency also its
> .startof.() counterpart).

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-08 Thread Jan Beulich
>>> On 07.01.19 at 19:29,  wrote:
> On Mon, 7 Jan 2019, Jan Beulich wrote:
>> >>> On 04.01.19 at 18:05,  wrote:
>> > I realize that you are not convinced by these arguments, but let's find
>> > a way forward. My preference would be to have SYMBOL returning unsigned
>> > long and do unsigned long comparisons when pointers pointing to
>> > different objects are involved.
>> 
>> I continue to fail to see how suitable hiding of the connection to the
>> original symbol from the compiler makes code less standard compliant
>> when comparing pointers: The compiler simply can't know whether
>> the underlying object is (in the extreme case) an array spanning the
>> entire address space.
> 
> That is because the requirement I am trying to address is MISRA-C
> compliance, which in turns requires C language compliance for C language
> (I think it allows mixing C with assembly code). I don't particularly
> care whether the compiler can or cannot find the connection to the
> original symbol. 
> 
> The important thing for me is to avoid comparisons (and subtractions)
> between pointers pointing to different objects. If we compare unsigned
> longs, it is easier to prove that the comparison is not between pointers
> pointing to different objects, even if somehow the numeric values
> indirectly come from pointers. If we compare pointers, even if they went
> through some sort of assembly conversions, we are still comparing
> pointers pointing to different objects. The compiler might not be able
> to figure it out, but a MISRA-C compliance scanning tool, or a human,
> might.

This is absurd: We are similarly still comparing pointers to different
objects when comparing their values casted to unsigned long. The
cast is as much of a hiding technique as any other one. If you want
to be C language compliant without any compromises, you'll have to
do away with all *_end symbols. You may recall that I did propose
a patch doing so (for an entirely different reason), using the tool
chain's .sizeof.() operator (and then for consistency also its
.startof.() counterpart).

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-07 Thread Stefano Stabellini
On Mon, 7 Jan 2019, Jan Beulich wrote:
> >>> On 04.01.19 at 18:05,  wrote:
> > I realize that you are not convinced by these arguments, but let's find
> > a way forward. My preference would be to have SYMBOL returning unsigned
> > long and do unsigned long comparisons when pointers pointing to
> > different objects are involved.
> 
> I continue to fail to see how suitable hiding of the connection to the
> original symbol from the compiler makes code less standard compliant
> when comparing pointers: The compiler simply can't know whether
> the underlying object is (in the extreme case) an array spanning the
> entire address space.

That is because the requirement I am trying to address is MISRA-C
compliance, which in turns requires C language compliance for C language
(I think it allows mixing C with assembly code). I don't particularly
care whether the compiler can or cannot find the connection to the
original symbol. 

The important thing for me is to avoid comparisons (and subtractions)
between pointers pointing to different objects. If we compare unsigned
longs, it is easier to prove that the comparison is not between pointers
pointing to different objects, even if somehow the numeric values
indirectly come from pointers. If we compare pointers, even if they went
through some sort of assembly conversions, we are still comparing
pointers pointing to different objects. The compiler might not be able
to figure it out, but a MISRA-C compliance scanning tool, or a human,
might.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-06 Thread Jan Beulich
>>> On 04.01.19 at 18:05,  wrote:
> I realize that you are not convinced by these arguments, but let's find
> a way forward. My preference would be to have SYMBOL returning unsigned
> long and do unsigned long comparisons when pointers pointing to
> different objects are involved.

I continue to fail to see how suitable hiding of the connection to the
original symbol from the compiler makes code less standard compliant
when comparing pointers: The compiler simply can't know whether
the underlying object is (in the extreme case) an array spanning the
entire address space.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-04 Thread Stefano Stabellini
On Fri, 4 Jan 2019, Jan Beulich wrote:
> >>> On 02.01.19 at 19:20,  wrote:
> > On Wed, 14 Nov 2018, Jan Beulich wrote:
> >> >>> On 13.11.18 at 23:02,  wrote:
> >> > On Tue, 13 Nov 2018, Jan Beulich wrote:
> >> >> >>> On 13.11.18 at 14:17,  wrote:
> >> >> > On 13/11/2018 12:56, Jan Beulich wrote:
> >> >> > On 13.11.18 at 00:06,  wrote:
> >> >> >>> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
> >> >> >>>   if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
> >> >> >>>   return -ENOMEM;
> >> >> >>>   
> >> >> >>> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
> >> >> >>> -__per_cpu_offset[cpu] = p - __per_cpu_start;
> >> >> >>> +memset(p, 0, SYMBOL(__per_cpu_data_end) - 
> >> >> >>> SYMBOL(__per_cpu_start));
> >> >> >>> +__per_cpu_offset[cpu] = (unsigned long)p - 
> >> >> >>> SYMBOL(__per_cpu_start);
> >> >> >> 
> >> >> >> Can't you make SYMBOL() retain the original type, such that casts
> >> >> >> like this one aren't needed? As soon as the compiler doesn't know
> >> >> >> anymore that particular globals (or statics) are used, it can't infer
> >> >> >> anymore that two pointers can't possibly point into the same array.
> >> >> > 
> >> >> > If SYMBOL() keeps the original type, then you will still substract 2 
> >> >> > pointers. If the compiler can't infer the cannot possibly point into 
> >> >> > the 
> >> >> > same array, it also cannot infer they point to the same. So that 
> >> >> > would 
> >> >> > be undefined, right?
> >> >> 
> >> >> Undefined behavior results if you _actually_ subtract pointers pointing
> >> >> into different objects. Subtracting of pointers is not generally 
> >> >> undefined.
> >> >> The compiler can use undefined-ness only if it can prove that both
> >> >> pointers do point into different objects.
> >> > 
> >> > Let's remember that we are not trying to work-around the compiler, we
> >> > are trying to make our code C standard compliant :-)  The compiler might
> >> > not be able to infer anymore that two pointers can't possibly point into
> >> > the same array, but we would still be not-compliant. It doesn't solve
> >> > our problem, especially in regards to certifications.
> >> 
> >> But then this entire patch is pointless: SYMBOL() is exclusively about
> >> deluding the compiler. To make the code standard compliant, you'd
> >> have to e.g. do away with all combined (start and end) uses (in C
> >> files) of symbols bounding sections. I at least cannot think of a
> >> standard compliant way of expressing these. Oddly enough I had
> >> once tried to deal with this issue (for other reasons), but the patch
> >> wasn't liked:
> >> https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html 
> >> All the remaining end symbols then could obviously go away in favor
> >> of using the size expressions, but as you see further C limitations
> >> make it necessary to use asm() for the ones which get converted.
> >> 
> >> Talking of asm()-s: C standard compliance, in a strict sense, would
> >> require dropping all of them as well. I'm afraid that when writing
> >> special purpose code like OS kernels or hypervisors are, if you
> >> want to avoid to resort extensively to assembly code, you'll have
> >> to accept to bend some of the language rules, just making sure
> >> that the compiler won't have means to mis-interpret the constructs
> >> used.
> >>
> >> > I is safer to use unsigned long as return type for SYMBOL and avoid
> >> > pointers comparisons completely. The code impact is very limited and
> >> > then we don't have to prove same or different "objectness" at all.
> >> 
> >> Well, that's one perspective to take. The other is that hidden or
> >> explicit casts are always a risk (and hence when reviewing code
> >> I'm quite picky about any ones introduced anew or even just
> >> retained without reason). Making constructs needing to cast
> >> things at least finally cast back to the original type often at least
> >> lowers this risk.
> > 
> > The new casts added actually cancel themselves out with the ones been
> > removed (some casts to unsigned long have been removed). I went through
> > the patch, these are the stats:
> > 
> > arch/arm: +4
> > arch/x86:  0
> > common:   -4
> > 
> > Overall the impact is basically null. Given the plus side of not having to
> > prove same or different "objectness", I think it is the best compromise
> > in this case. My preference is to use unsigned long as return type, as
> > done in this version of the patch.
> 
> But if I'm not misremembering there could be an overall win if you
> casted the result back to the original type, as suggested on the
> other sub-thread.

I think you are right. I am not saying that SYMBOL returning unsigned
long reduces the number of casts compared to SYMBOL returning original
type.

I am only saying that compared to before this series, the number of
casts required to introduce SYMBOL returning unsigned long is small.
Given that it is preferable to mak

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-04 Thread Jan Beulich
>>> On 02.01.19 at 19:20,  wrote:
> On Wed, 14 Nov 2018, Jan Beulich wrote:
>> >>> On 13.11.18 at 23:02,  wrote:
>> > On Tue, 13 Nov 2018, Jan Beulich wrote:
>> >> >>> On 13.11.18 at 14:17,  wrote:
>> >> > On 13/11/2018 12:56, Jan Beulich wrote:
>> >> > On 13.11.18 at 00:06,  wrote:
>> >> >>> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
>> >> >>>   if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
>> >> >>>   return -ENOMEM;
>> >> >>>   
>> >> >>> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
>> >> >>> -__per_cpu_offset[cpu] = p - __per_cpu_start;
>> >> >>> +memset(p, 0, SYMBOL(__per_cpu_data_end) - 
>> >> >>> SYMBOL(__per_cpu_start));
>> >> >>> +__per_cpu_offset[cpu] = (unsigned long)p - 
>> >> >>> SYMBOL(__per_cpu_start);
>> >> >> 
>> >> >> Can't you make SYMBOL() retain the original type, such that casts
>> >> >> like this one aren't needed? As soon as the compiler doesn't know
>> >> >> anymore that particular globals (or statics) are used, it can't infer
>> >> >> anymore that two pointers can't possibly point into the same array.
>> >> > 
>> >> > If SYMBOL() keeps the original type, then you will still substract 2 
>> >> > pointers. If the compiler can't infer the cannot possibly point into 
>> >> > the 
>> >> > same array, it also cannot infer they point to the same. So that would 
>> >> > be undefined, right?
>> >> 
>> >> Undefined behavior results if you _actually_ subtract pointers pointing
>> >> into different objects. Subtracting of pointers is not generally 
>> >> undefined.
>> >> The compiler can use undefined-ness only if it can prove that both
>> >> pointers do point into different objects.
>> > 
>> > Let's remember that we are not trying to work-around the compiler, we
>> > are trying to make our code C standard compliant :-)  The compiler might
>> > not be able to infer anymore that two pointers can't possibly point into
>> > the same array, but we would still be not-compliant. It doesn't solve
>> > our problem, especially in regards to certifications.
>> 
>> But then this entire patch is pointless: SYMBOL() is exclusively about
>> deluding the compiler. To make the code standard compliant, you'd
>> have to e.g. do away with all combined (start and end) uses (in C
>> files) of symbols bounding sections. I at least cannot think of a
>> standard compliant way of expressing these. Oddly enough I had
>> once tried to deal with this issue (for other reasons), but the patch
>> wasn't liked:
>> https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html 
>> All the remaining end symbols then could obviously go away in favor
>> of using the size expressions, but as you see further C limitations
>> make it necessary to use asm() for the ones which get converted.
>> 
>> Talking of asm()-s: C standard compliance, in a strict sense, would
>> require dropping all of them as well. I'm afraid that when writing
>> special purpose code like OS kernels or hypervisors are, if you
>> want to avoid to resort extensively to assembly code, you'll have
>> to accept to bend some of the language rules, just making sure
>> that the compiler won't have means to mis-interpret the constructs
>> used.
>>
>> > I is safer to use unsigned long as return type for SYMBOL and avoid
>> > pointers comparisons completely. The code impact is very limited and
>> > then we don't have to prove same or different "objectness" at all.
>> 
>> Well, that's one perspective to take. The other is that hidden or
>> explicit casts are always a risk (and hence when reviewing code
>> I'm quite picky about any ones introduced anew or even just
>> retained without reason). Making constructs needing to cast
>> things at least finally cast back to the original type often at least
>> lowers this risk.
> 
> The new casts added actually cancel themselves out with the ones been
> removed (some casts to unsigned long have been removed). I went through
> the patch, these are the stats:
> 
> arch/arm: +4
> arch/x86:  0
> common:   -4
> 
> Overall the impact is basically null. Given the plus side of not having to
> prove same or different "objectness", I think it is the best compromise
> in this case. My preference is to use unsigned long as return type, as
> done in this version of the patch.

But if I'm not misremembering there could be an overall win if you
casted the result back to the original type, as suggested on the
other sub-thread.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-03 Thread Stefano Stabellini
On Wed, 2 Jan 2019, Stefano Stabellini wrote:
> On Wed, 2 Jan 2019, Stefano Stabellini wrote:
> > On Tue, 13 Nov 2018, Jan Beulich wrote:
> > > >>> On 13.11.18 at 00:06,  wrote:
> > > > --- a/xen/arch/x86/alternative.c
> > > > +++ b/xen/arch/x86/alternative.c
> > > > @@ -194,7 +194,7 @@ void init_or_livepatch apply_alternatives(struct 
> > > > alt_instr *start,
> > > >   * So be careful if you want to change the scan order to any other
> > > >   * order.
> > > >   */
> > > > -for ( a = base = start; a < end; a++ )
> > > > +for ( a = base = start; SYMBOL(a) < SYMBOL(end); a++ )
> > > 
> > > At this point all is fine: end is allowed to point to the end of start[].
> > > If anything you want to change the invocations (where the
> > > questionable symbols are used). I'm also not convinced you need
> > > to touch both sides of the comparison or subtraction expressions.
> > > 
> > > In order for people to not start wondering what the purpose of
> > > SYMBOL() is at any of its use sites, you really want to use it on
> > > the problematic symbols themselves, not somewhere on a derived
> > > variable or parameter.
> > 
> > I wasn't sure about what to do about derived variables and decided to
> > err on the safe side. I am happy to remove those changes, because I
> > agree that it would be far clearer if SYMBOL() is only used on the
> > problematic symbols.
> 
> I replied too quickly. I agree that changing only the problematic
> symbols, and not the derived variables, is easier to understand from the
> code point of view, and should still be safe and compliant.
> 
> However, I would also prefer to keep the comparisons as unsigned long
> comparisons, like it is done in this patch. I think it is safer and more
> compliant to do unsigned long comparisons to avoid completely the issue
> of having to understand when pointers point to same or different
> objects. It is certainly more in the spirit of the spec. 
> 
> So, in the case you mentioned above (and a bunch of similar cases), to
> do unsigned long comparisons only and apply SYMBOL() only to the
> problematic symbols, we need one more cast to unsigned long at the
> derived variable side:
> 
>   for ( a = base = start; (unsigned long)a < SYMBOL(end); a++ )
> 
> 
> FYI I realized I was using SYMBOL() conversions more than strictly
> necessary at the derived variables side, but I though it would be nicer
> to have:
> 
>   SYMBOL(derived) < SYMBOL(problem_variable)
> 
> rather than
> 
>   (unsigned long)derived < SYMBOL(problem_variable)
> 
> everywhere. I am happy either way as both should be compliant.

With the hope of getting it merge before the code freeze and to save
time, given that I also had other comments to address and a rebase to
do, I'll send a new series update using the latter of these two options.

See:

https://marc.info/?l=xen-devel&m=154654323311302

We can further discuss the issue there -- I am happy to change things
again as necessary.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-02 Thread Stefano Stabellini
On Wed, 2 Jan 2019, Stefano Stabellini wrote:
> On Tue, 13 Nov 2018, Jan Beulich wrote:
> > >>> On 13.11.18 at 00:06,  wrote:
> > > --- a/xen/arch/x86/alternative.c
> > > +++ b/xen/arch/x86/alternative.c
> > > @@ -194,7 +194,7 @@ void init_or_livepatch apply_alternatives(struct 
> > > alt_instr *start,
> > >   * So be careful if you want to change the scan order to any other
> > >   * order.
> > >   */
> > > -for ( a = base = start; a < end; a++ )
> > > +for ( a = base = start; SYMBOL(a) < SYMBOL(end); a++ )
> > 
> > At this point all is fine: end is allowed to point to the end of start[].
> > If anything you want to change the invocations (where the
> > questionable symbols are used). I'm also not convinced you need
> > to touch both sides of the comparison or subtraction expressions.
> > 
> > In order for people to not start wondering what the purpose of
> > SYMBOL() is at any of its use sites, you really want to use it on
> > the problematic symbols themselves, not somewhere on a derived
> > variable or parameter.
> 
> I wasn't sure about what to do about derived variables and decided to
> err on the safe side. I am happy to remove those changes, because I
> agree that it would be far clearer if SYMBOL() is only used on the
> problematic symbols.

I replied too quickly. I agree that changing only the problematic
symbols, and not the derived variables, is easier to understand from the
code point of view, and should still be safe and compliant.

However, I would also prefer to keep the comparisons as unsigned long
comparisons, like it is done in this patch. I think it is safer and more
compliant to do unsigned long comparisons to avoid completely the issue
of having to understand when pointers point to same or different
objects. It is certainly more in the spirit of the spec. 

So, in the case you mentioned above (and a bunch of similar cases), to
do unsigned long comparisons only and apply SYMBOL() only to the
problematic symbols, we need one more cast to unsigned long at the
derived variable side:

  for ( a = base = start; (unsigned long)a < SYMBOL(end); a++ )


FYI I realized I was using SYMBOL() conversions more than strictly
necessary at the derived variables side, but I though it would be nicer
to have:

  SYMBOL(derived) < SYMBOL(problem_variable)

rather than

  (unsigned long)derived < SYMBOL(problem_variable)

everywhere. I am happy either way as both should be compliant.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-02 Thread Stefano Stabellini
On Tue, 13 Nov 2018, Jan Beulich wrote:
> >>> On 13.11.18 at 00:06,  wrote:
> > --- a/xen/arch/x86/alternative.c
> > +++ b/xen/arch/x86/alternative.c
> > @@ -194,7 +194,7 @@ void init_or_livepatch apply_alternatives(struct 
> > alt_instr *start,
> >   * So be careful if you want to change the scan order to any other
> >   * order.
> >   */
> > -for ( a = base = start; a < end; a++ )
> > +for ( a = base = start; SYMBOL(a) < SYMBOL(end); a++ )
> 
> At this point all is fine: end is allowed to point to the end of start[].
> If anything you want to change the invocations (where the
> questionable symbols are used). I'm also not convinced you need
> to touch both sides of the comparison or subtraction expressions.
> 
> In order for people to not start wondering what the purpose of
> SYMBOL() is at any of its use sites, you really want to use it on
> the problematic symbols themselves, not somewhere on a derived
> variable or parameter.

I wasn't sure about what to do about derived variables and decided to
err on the safe side. I am happy to remove those changes, because I
agree that it would be far clearer if SYMBOL() is only used on the
problematic symbols.


> I also think review would be helped if you at least split this patch
> into an Arm, and x86, and a common code patch.

I'll do


> > @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
> >  if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
> >  return -ENOMEM;
> >  
> > -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
> > -__per_cpu_offset[cpu] = p - __per_cpu_start;
> > +memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
> > +__per_cpu_offset[cpu] = (unsigned long)p - SYMBOL(__per_cpu_start);
> 
> Can't you make SYMBOL() retain the original type, such that casts
> like this one aren't needed? As soon as the compiler doesn't know
> anymore that particular globals (or statics) are used, it can't infer
> anymore that two pointers can't possibly point into the same array.

I'll reply to your point to later email in the thread.


> > @@ -1037,7 +1037,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> >   * Is the region size greater than zero and does it begin
> >   * at or above the end of current Xen image placement?
> >   */
> > -if ( (end > s) && (end - reloc_size + XEN_IMG_OFFSET >= 
> > __pa(_end)) )
> > +if ( (end > s) && (end - reloc_size + XEN_IMG_OFFSET >=
> > + __pa(_end)) )
> 
> Only re-flowing? If no change is meant to be done to this use of _end,
> please omit the change.

Sorry about the spurious change, I'll remove it.


> > --- a/xen/include/asm-arm/mm.h
> > +++ b/xen/include/asm-arm/mm.h
> > @@ -151,7 +151,7 @@ extern vaddr_t xenheap_virt_start;
> >  #endif
> >  
> >  #define is_xen_fixed_mfn(mfn)   \
> > -((pfn_to_paddr(mfn) >= virt_to_maddr(&_start)) &&   \
> > +((pfn_to_paddr(mfn) >= virt_to_maddr(&_start)) && \
> >   (pfn_to_paddr(mfn) <= virt_to_maddr(&_end)))
> 
> Unnecessary or incomplete change again?

Same again

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2019-01-02 Thread Stefano Stabellini
On Wed, 14 Nov 2018, Jan Beulich wrote:
> >>> On 13.11.18 at 23:02,  wrote:
> > On Tue, 13 Nov 2018, Jan Beulich wrote:
> >> >>> On 13.11.18 at 14:17,  wrote:
> >> > On 13/11/2018 12:56, Jan Beulich wrote:
> >> > On 13.11.18 at 00:06,  wrote:
> >> >>> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
> >> >>>   if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
> >> >>>   return -ENOMEM;
> >> >>>   
> >> >>> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
> >> >>> -__per_cpu_offset[cpu] = p - __per_cpu_start;
> >> >>> +memset(p, 0, SYMBOL(__per_cpu_data_end) - 
> >> >>> SYMBOL(__per_cpu_start));
> >> >>> +__per_cpu_offset[cpu] = (unsigned long)p - 
> >> >>> SYMBOL(__per_cpu_start);
> >> >> 
> >> >> Can't you make SYMBOL() retain the original type, such that casts
> >> >> like this one aren't needed? As soon as the compiler doesn't know
> >> >> anymore that particular globals (or statics) are used, it can't infer
> >> >> anymore that two pointers can't possibly point into the same array.
> >> > 
> >> > If SYMBOL() keeps the original type, then you will still substract 2 
> >> > pointers. If the compiler can't infer the cannot possibly point into the 
> >> > same array, it also cannot infer they point to the same. So that would 
> >> > be undefined, right?
> >> 
> >> Undefined behavior results if you _actually_ subtract pointers pointing
> >> into different objects. Subtracting of pointers is not generally undefined.
> >> The compiler can use undefined-ness only if it can prove that both
> >> pointers do point into different objects.
> > 
> > Let's remember that we are not trying to work-around the compiler, we
> > are trying to make our code C standard compliant :-)  The compiler might
> > not be able to infer anymore that two pointers can't possibly point into
> > the same array, but we would still be not-compliant. It doesn't solve
> > our problem, especially in regards to certifications.
> 
> But then this entire patch is pointless: SYMBOL() is exclusively about
> deluding the compiler. To make the code standard compliant, you'd
> have to e.g. do away with all combined (start and end) uses (in C
> files) of symbols bounding sections. I at least cannot think of a
> standard compliant way of expressing these. Oddly enough I had
> once tried to deal with this issue (for other reasons), but the patch
> wasn't liked:
> https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html
> All the remaining end symbols then could obviously go away in favor
> of using the size expressions, but as you see further C limitations
> make it necessary to use asm() for the ones which get converted.
> 
> Talking of asm()-s: C standard compliance, in a strict sense, would
> require dropping all of them as well. I'm afraid that when writing
> special purpose code like OS kernels or hypervisors are, if you
> want to avoid to resort extensively to assembly code, you'll have
> to accept to bend some of the language rules, just making sure
> that the compiler won't have means to mis-interpret the constructs
> used.
>
> > I is safer to use unsigned long as return type for SYMBOL and avoid
> > pointers comparisons completely. The code impact is very limited and
> > then we don't have to prove same or different "objectness" at all.
> 
> Well, that's one perspective to take. The other is that hidden or
> explicit casts are always a risk (and hence when reviewing code
> I'm quite picky about any ones introduced anew or even just
> retained without reason). Making constructs needing to cast
> things at least finally cast back to the original type often at least
> lowers this risk.

The new casts added actually cancel themselves out with the ones been
removed (some casts to unsigned long have been removed). I went through
the patch, these are the stats:

arch/arm: +4
arch/x86:  0
common:   -4

Overall the impact is basically null. Given the plus side of not having to
prove same or different "objectness", I think it is the best compromise
in this case. My preference is to use unsigned long as return type, as
done in this version of the patch.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2018-11-13 Thread Jan Beulich
>>> On 13.11.18 at 23:02,  wrote:
> On Tue, 13 Nov 2018, Jan Beulich wrote:
>> >>> On 13.11.18 at 14:17,  wrote:
>> > On 13/11/2018 12:56, Jan Beulich wrote:
>> > On 13.11.18 at 00:06,  wrote:
>> >>> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
>> >>>   if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
>> >>>   return -ENOMEM;
>> >>>   
>> >>> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
>> >>> -__per_cpu_offset[cpu] = p - __per_cpu_start;
>> >>> +memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
>> >>> +__per_cpu_offset[cpu] = (unsigned long)p - SYMBOL(__per_cpu_start);
>> >> 
>> >> Can't you make SYMBOL() retain the original type, such that casts
>> >> like this one aren't needed? As soon as the compiler doesn't know
>> >> anymore that particular globals (or statics) are used, it can't infer
>> >> anymore that two pointers can't possibly point into the same array.
>> > 
>> > If SYMBOL() keeps the original type, then you will still substract 2 
>> > pointers. If the compiler can't infer the cannot possibly point into the 
>> > same array, it also cannot infer they point to the same. So that would 
>> > be undefined, right?
>> 
>> Undefined behavior results if you _actually_ subtract pointers pointing
>> into different objects. Subtracting of pointers is not generally undefined.
>> The compiler can use undefined-ness only if it can prove that both
>> pointers do point into different objects.
> 
> Let's remember that we are not trying to work-around the compiler, we
> are trying to make our code C standard compliant :-)  The compiler might
> not be able to infer anymore that two pointers can't possibly point into
> the same array, but we would still be not-compliant. It doesn't solve
> our problem, especially in regards to certifications.

But then this entire patch is pointless: SYMBOL() is exclusively about
deluding the compiler. To make the code standard compliant, you'd
have to e.g. do away with all combined (start and end) uses (in C
files) of symbols bounding sections. I at least cannot think of a
standard compliant way of expressing these. Oddly enough I had
once tried to deal with this issue (for other reasons), but the patch
wasn't liked:
https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html
All the remaining end symbols then could obviously go away in favor
of using the size expressions, but as you see further C limitations
make it necessary to use asm() for the ones which get converted.

Talking of asm()-s: C standard compliance, in a strict sense, would
require dropping all of them as well. I'm afraid that when writing
special purpose code like OS kernels or hypervisors are, if you
want to avoid to resort extensively to assembly code, you'll have
to accept to bend some of the language rules, just making sure
that the compiler won't have means to mis-interpret the constructs
used.

> I is safer to use unsigned long as return type for SYMBOL and avoid
> pointers comparisons completely. The code impact is very limited and
> then we don't have to prove same or different "objectness" at all.

Well, that's one perspective to take. The other is that hidden or
explicit casts are always a risk (and hence when reviewing code
I'm quite picky about any ones introduced anew or even just
retained without reason). Making constructs needing to cast
things at least finally cast back to the original type often at least
lowers this risk.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2018-11-13 Thread Stefano Stabellini
On Tue, 13 Nov 2018, Jan Beulich wrote:
> >>> On 13.11.18 at 14:17,  wrote:
> > On 13/11/2018 12:56, Jan Beulich wrote:
> > On 13.11.18 at 00:06,  wrote:
> >>> --- a/xen/arch/x86/alternative.c
> >>> +++ b/xen/arch/x86/alternative.c
> >>> @@ -194,7 +194,7 @@ void init_or_livepatch apply_alternatives(struct 
> >>> alt_instr *start,
> >>>* So be careful if you want to change the scan order to any other
> >>>* order.
> >>>*/
> >>> -for ( a = base = start; a < end; a++ )
> >>> +for ( a = base = start; SYMBOL(a) < SYMBOL(end); a++ )
> >> 
> >> At this point all is fine: end is allowed to point to the end of start[].
> > 
> > But it can also point to somewhere else. The compiler cannot know it, right?
> 
> Correct. My point is that at this point the compiler cannot use its
> knowledge of what is (il)legal to "optimize" generated code.
> 
> >>> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
> >>>   if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
> >>>   return -ENOMEM;
> >>>   
> >>> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
> >>> -__per_cpu_offset[cpu] = p - __per_cpu_start;
> >>> +memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
> >>> +__per_cpu_offset[cpu] = (unsigned long)p - SYMBOL(__per_cpu_start);
> >> 
> >> Can't you make SYMBOL() retain the original type, such that casts
> >> like this one aren't needed? As soon as the compiler doesn't know
> >> anymore that particular globals (or statics) are used, it can't infer
> >> anymore that two pointers can't possibly point into the same array.
> > 
> > If SYMBOL() keeps the original type, then you will still substract 2 
> > pointers. If the compiler can't infer the cannot possibly point into the 
> > same array, it also cannot infer they point to the same. So that would 
> > be undefined, right?
> 
> Undefined behavior results if you _actually_ subtract pointers pointing
> into different objects. Subtracting of pointers is not generally undefined.
> The compiler can use undefined-ness only if it can prove that both
> pointers do point into different objects.

Let's remember that we are not trying to work-around the compiler, we
are trying to make our code C standard compliant :-)  The compiler might
not be able to infer anymore that two pointers can't possibly point into
the same array, but we would still be not-compliant. It doesn't solve
our problem, especially in regards to certifications.

I is safer to use unsigned long as return type for SYMBOL and avoid
pointers comparisons completely. The code impact is very limited and
then we don't have to prove same or different "objectness" at all.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2018-11-13 Thread Jan Beulich
>>> On 13.11.18 at 14:17,  wrote:
> On 13/11/2018 12:56, Jan Beulich wrote:
> On 13.11.18 at 00:06,  wrote:
>>> --- a/xen/arch/x86/alternative.c
>>> +++ b/xen/arch/x86/alternative.c
>>> @@ -194,7 +194,7 @@ void init_or_livepatch apply_alternatives(struct 
>>> alt_instr *start,
>>>* So be careful if you want to change the scan order to any other
>>>* order.
>>>*/
>>> -for ( a = base = start; a < end; a++ )
>>> +for ( a = base = start; SYMBOL(a) < SYMBOL(end); a++ )
>> 
>> At this point all is fine: end is allowed to point to the end of start[].
> 
> But it can also point to somewhere else. The compiler cannot know it, right?

Correct. My point is that at this point the compiler cannot use its
knowledge of what is (il)legal to "optimize" generated code.

>>> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
>>>   if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
>>>   return -ENOMEM;
>>>   
>>> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
>>> -__per_cpu_offset[cpu] = p - __per_cpu_start;
>>> +memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
>>> +__per_cpu_offset[cpu] = (unsigned long)p - SYMBOL(__per_cpu_start);
>> 
>> Can't you make SYMBOL() retain the original type, such that casts
>> like this one aren't needed? As soon as the compiler doesn't know
>> anymore that particular globals (or statics) are used, it can't infer
>> anymore that two pointers can't possibly point into the same array.
> 
> If SYMBOL() keeps the original type, then you will still substract 2 
> pointers. If the compiler can't infer the cannot possibly point into the 
> same array, it also cannot infer they point to the same. So that would 
> be undefined, right?

Undefined behavior results if you _actually_ subtract pointers pointing
into different objects. Subtracting of pointers is not generally undefined.
The compiler can use undefined-ness only if it can prove that both
pointers do point into different objects.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2018-11-13 Thread Julien Grall

Hi,
On 13/11/2018 12:56, Jan Beulich wrote:
 On 13.11.18 at 00:06,  wrote:
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -194,7 +194,7 @@ void init_or_livepatch apply_alternatives(struct 
>> alt_instr *start,
>>* So be careful if you want to change the scan order to any other
>>* order.
>>*/
>> -for ( a = base = start; a < end; a++ )
>> +for ( a = base = start; SYMBOL(a) < SYMBOL(end); a++ )
> 
> At this point all is fine: end is allowed to point to the end of start[].

But it can also point to somewhere else. The compiler cannot know it, right?

>> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
>>   if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
>>   return -ENOMEM;
>>   
>> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
>> -__per_cpu_offset[cpu] = p - __per_cpu_start;
>> +memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
>> +__per_cpu_offset[cpu] = (unsigned long)p - SYMBOL(__per_cpu_start);
> 
> Can't you make SYMBOL() retain the original type, such that casts
> like this one aren't needed? As soon as the compiler doesn't know
> anymore that particular globals (or statics) are used, it can't infer
> anymore that two pointers can't possibly point into the same array.

If SYMBOL() keeps the original type, then you will still substract 2 
pointers. If the compiler can't infer the cannot possibly point into the 
same array, it also cannot infer they point to the same. So that would 
be undefined, right?

Cheers,

-- 
Julien Grall
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2018-11-13 Thread Jan Beulich
>>> On 13.11.18 at 00:06,  wrote:
> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -194,7 +194,7 @@ void init_or_livepatch apply_alternatives(struct 
> alt_instr *start,
>   * So be careful if you want to change the scan order to any other
>   * order.
>   */
> -for ( a = base = start; a < end; a++ )
> +for ( a = base = start; SYMBOL(a) < SYMBOL(end); a++ )

At this point all is fine: end is allowed to point to the end of start[].
If anything you want to change the invocations (where the
questionable symbols are used). I'm also not convinced you need
to touch both sides of the comparison or subtraction expressions.

In order for people to not start wondering what the purpose of
SYMBOL() is at any of its use sites, you really want to use it on
the problematic symbols themselves, not somewhere on a derived
variable or parameter.

I also think review would be helped if you at least split this patch
into an Arm, and x86, and a common code patch.

> @@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
>  if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
>  return -ENOMEM;
>  
> -memset(p, 0, __per_cpu_data_end - __per_cpu_start);
> -__per_cpu_offset[cpu] = p - __per_cpu_start;
> +memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
> +__per_cpu_offset[cpu] = (unsigned long)p - SYMBOL(__per_cpu_start);

Can't you make SYMBOL() retain the original type, such that casts
like this one aren't needed? As soon as the compiler doesn't know
anymore that particular globals (or statics) are used, it can't infer
anymore that two pointers can't possibly point into the same array.

> @@ -1037,7 +1037,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>   * Is the region size greater than zero and does it begin
>   * at or above the end of current Xen image placement?
>   */
> -if ( (end > s) && (end - reloc_size + XEN_IMG_OFFSET >= __pa(_end)) )
> +if ( (end > s) && (end - reloc_size + XEN_IMG_OFFSET >=
> + __pa(_end)) )

Only re-flowing? If no change is meant to be done to this use of _end,
please omit the change.

> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -151,7 +151,7 @@ extern vaddr_t xenheap_virt_start;
>  #endif
>  
>  #define is_xen_fixed_mfn(mfn)   \
> -((pfn_to_paddr(mfn) >= virt_to_maddr(&_start)) &&   \
> +((pfn_to_paddr(mfn) >= virt_to_maddr(&_start)) && \
>   (pfn_to_paddr(mfn) <= virt_to_maddr(&_end)))

Unnecessary or incomplete change again?

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 2/2] xen: use SYMBOL when required

2018-11-12 Thread Stefano Stabellini
Use SYMBOL in cases of comparisons and subtractions of:

_start, _end, __init_begin, __init_end,  __2M_text_end,
__2M_rodata_start, __2M_rodata_end, __2M_init_start,__2M_init_end,
__2M_rwdata_start, __2M_rwdata_end, _stext, _etext, _srodata, _erodata,
__end_vpci_array, __start_vpci_array, _sinittext, _einittext,
_stextentry, _etextentry, __start_bug_frames, __stop_bug_frames_0,
__stop_bug_frames_1, __stop_bug_frames_2,__stop_bug_frames_3,
__note_gnu_build_id_start, __note_gnu_build_id_end, __start___ex_table,
__stop___ex_table, __start___pre_ex_table, __stop___pre_ex_table,
__lock_profile_start, __lock_profile_end, __param_start,
__param_end, __setup_start, __setup_end, __initcall_start,
__initcall_end, __presmp_initcall_end, __trampoline_rel_start,
__trampoline_rel_stop, __trampoline_seg_start, __trampoline_seg_stop
__alt_instructions, __alt_instructions_end, __ctors_start, __ctors_end,
__end_schedulers_array, __start_schedulers_array, __bss_start,
__bss_end, __per_cpu_start, __per_cpu_data_end, _splatform, _eplatform,
_sdevice, _edevice, _asdevice, _aedevice, __proc_info_start,
__proc_info_end, _sdtb


as by the C standard [1].

M3CM: Rule-18.2: Subtraction between pointers shall only be applied to
pointers that address elements of the same array

[1] 
https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+compare+two+pointers+that+do+not+refer+to+the+same+array

QAVerify: 2761
Signed-off-by: Stefano Stabellini 
CC: jbeul...@suse.com
CC: andrew.coop...@citrix.com
---
Changes in v4:
- only use SYMBOL where necessary, not "everywhere": comparisons and
  subtractions
- improve commit message
- remove some unnecessary casts
- fix some still unsafe casts
- extend checks to all symbols in xen/arch/x86/xen.lds.S and
  xen/arch/arm/xen.lds.S

Changes in v3:
- improve commit message
- no hard tabs
- rename __symbol to SYMBOL
- fix __end_vpci_array and __start_vpci_array
- avoid all comparisons between pointers: including (void *) casted
  returns from SYMBOL()
- remove useless casts to (unsigned long)

Changes in v2:
- cast return of SYMBOL to char* when required
- define __pa as unsigned long in is_kernel* functions
---
 xen/arch/arm/alternative.c|  7 ---
 xen/arch/arm/arm32/livepatch.c|  3 ++-
 xen/arch/arm/arm64/livepatch.c|  3 ++-
 xen/arch/arm/device.c |  6 +++---
 xen/arch/arm/livepatch.c  |  5 +++--
 xen/arch/arm/mm.c | 19 ++-
 xen/arch/arm/percpu.c |  8 
 xen/arch/arm/platform.c   |  6 --
 xen/arch/arm/setup.c  |  8 +---
 xen/arch/x86/alternative.c|  2 +-
 xen/arch/x86/efi/efi-boot.h   |  4 ++--
 xen/arch/x86/percpu.c |  8 
 xen/arch/x86/setup.c  | 11 +++
 xen/arch/x86/smpboot.c|  2 +-
 xen/common/kernel.c   |  8 ++--
 xen/common/lib.c  |  2 +-
 xen/common/schedule.c |  2 +-
 xen/common/spinlock.c |  4 +++-
 xen/common/version.c  |  6 +++---
 xen/common/virtual_region.c   |  2 +-
 xen/drivers/vpci/vpci.c   |  2 +-
 xen/include/asm-arm/grant_table.h |  3 ++-
 xen/include/asm-arm/mm.h  |  2 +-
 xen/include/xen/kernel.h  | 24 
 24 files changed, 83 insertions(+), 64 deletions(-)

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index 52ed7ed..b740cfe 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -131,7 +131,7 @@ static int __apply_alternatives(const struct alt_region 
*region,
 printk(XENLOG_INFO "alternatives: Patching with alt table %p -> %p\n",
region->begin, region->end);
 
-for ( alt = region->begin; alt < region->end; alt++ )
+for ( alt = region->begin; SYMBOL(alt) < SYMBOL(region->end); alt++ )
 {
 int nr_inst;
 
@@ -188,7 +188,7 @@ static int __apply_alternatives_multi_stop(void *unused)
 int ret;
 struct alt_region region;
 mfn_t xen_mfn = virt_to_mfn(_start);
-paddr_t xen_size = _end - _start;
+paddr_t xen_size = SYMBOL(_end) - SYMBOL(_start);
 unsigned int xen_order = get_order_from_bytes(xen_size);
 void *xenmap;
 
@@ -206,7 +206,8 @@ static int __apply_alternatives_multi_stop(void *unused)
 region.begin = __alt_instructions;
 region.end = __alt_instructions_end;
 
-ret = __apply_alternatives(®ion, xenmap - (void *)_start);
+ret = __apply_alternatives(®ion,
+   (unsigned long)xenmap - SYMBOL(_start));
 /* The patching is not expected to fail during boot. */
 BUG_ON(ret != 0);
 
diff --git a/xen/arch/arm/arm32/livepatch.c b/xen/arch/arm/arm32/livepatch.c
index 41378a5..83f443f 100644
--- a/xen/arch/arm/arm32/livepatch.c
+++ b/xen/arch/arm/arm32/livepatch.c
@@ -56,7 +56,8 @@ void arch_livepatch_apply(struct livepatch_func *func)
 else
 insn = 0xe1a