Re: [PATCH v3 17/33] nds32: VDSO support

2017-12-11 Thread Vincent Chen
2017-12-08 20:14 GMT+08:00 Mark Rutland :
> On Fri, Dec 08, 2017 at 07:54:42PM +0800, Greentime Hu wrote:
>> 2017-12-08 18:21 GMT+08:00 Mark Rutland :
>> > On Fri, Dec 08, 2017 at 05:12:00PM +0800, Greentime Hu wrote:
>> >> +static int grab_timer_node_info(void)
>> >> +{
>> >> + struct device_node *timer_node;
>> >> +
>> >> + timer_node = of_find_node_by_name(NULL, "timer");
>> >
>> > Please use a compatible string, rather than matching the timer by name.
>> >
>> > It's plausible that you have multiple nodes called "timer" in the DT,
>> > under different parent nodes, and this might not be the device you
>> > think it is. I see your dt in patch 24 has two timer nodes.
>> >
>> > It would be best if your clocksource driver exposed some struct that you
>> > looked at here, so that you're guaranteed to use the same device.
>>
>> We'd like to use "timer" here because there are 2 different timer IPs
>> and we are sure that they won't be in the same SoC.
>> We think this implementation in VDSO should be platform independent to
>> get cycle-count register.
>> Our customer or other SoC provider who can use "timer" and define
>> cycle-count-offset or cycle-count-down then we can get the correct
>> cycle-count.
>
> This is not the right way to do things.
>
> So from a DT perspective, NAK.
>
> You should not add properties to arbitrary DT bindings to handle a Linux
> implementation detail.
>
> Please remove this DT code, and have the drivers for those timer blocks
> export this information to your vdso code somehow.
>

Hi, Mark:
Based on your suggestion, we define a new struct timer_info so that the
timer driver can record the values of cycle-count-offset and
cycle-count-down in its timer_init function. This code in the timer
driver is valid only when CONFIG_NDS32 is defined.
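
A minimal user-space sketch of the approach described above, under stated
assumptions: the thread names only struct timer_info and the two values it
records, so the field layout, the vdso_data subset, and the helper names
timer_init_record() and vdso_copy_timer_info() are hypothetical stand-ins,
not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the thread names only the struct and its two values. */
struct timer_info {
    uint32_t cycle_count_offset;    /* offset of the counter register */
    bool cycle_count_down;          /* true if the counter counts down */
};

/* Hypothetical subset of the vDSO data fields that appear in the patch. */
struct vdso_data {
    uint32_t cycle_count_offset;
    bool cycle_count_down;
};

/* Single instance owned by the timer driver and read by the vDSO setup. */
static struct timer_info timer_info;

/* Stand-in for the driver's timer_init(): record the hardware details once. */
static void timer_init_record(uint32_t offset, bool counts_down)
{
    timer_info.cycle_count_offset = offset;
    timer_info.cycle_count_down = counts_down;
}

/* Stand-in for the vDSO setup path: copy from the driver, not from the DT. */
static void vdso_copy_timer_info(struct vdso_data *vd)
{
    assert(vd != NULL);
    vd->cycle_count_offset = timer_info.cycle_count_offset;
    vd->cycle_count_down = timer_info.cycle_count_down;
}
```

With this shape the device tree is parsed in one place (the driver), and the
vDSO code is guaranteed to describe the same device the clocksource uses.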

>> We sent atcpit100 patch last time along with our arch, however we'd
>> like to send it to its sub system this time and my colleague is still
>> working on it.
>> He may send the timer patch next week.
>
> I think that it would make sense for that patch to be part of the arch
> port, especially given that (AFAICT) there is no driver for the other
> timer IP that you mention.
>
> [...]
>
>> >> +int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>> >> +{
>> >
>> >> + /*Map timer to user space */
>> >> + vdso_base += PAGE_SIZE;
>> >> + prot = __pgprot(_PAGE_V | _PAGE_M_UR_KR | _PAGE_D |
>> >> + _PAGE_G | _PAGE_C_DEV);
>> >> + ret = io_remap_pfn_range(vma, vdso_base, timer_res.start >> PAGE_SHIFT,
>> >> +  PAGE_SIZE, prot);
>> >> + if (ret)
>> >> + goto up_fail;
>> >
>> > Maybe this is fine, but it looks a bit suspicious.
>> >
>> > Is it safe to map IO memory to a userspace process like this?
>> >
>> > In general that isn't safe, since userspace could access other registers
>> > (if those exist), perform accesses that change the state of hardware, or
>> > make unsupported access types (e.g. unaligned, atomic) that result in
>> > errors the kernel can't handle.
>> >
>> > Does none of that apply here?
>>
>> We only provide read permission to this page so hardware state won't
>> be changed. It will trigger exception if we try to write.
>> We will check about the alignment/atomic issue of this region.
>

For the alignment issue, we intentionally performed an unaligned read of
this region and got a "Segmentation fault", as expected.


Thanks,
Vincent

> Ok, thanks.
>
> This is another reason to only do this for devices/drivers that we have
> drivers for, since we can't know that this is safe in general.
>
> Thanks,
> Mark.


Re: [PATCH v3 17/33] nds32: VDSO support

2017-12-08 Thread Greentime Hu
Hi, Marc:

2017-12-08 20:29 GMT+08:00 Marc Zyngier :
> On 08/12/17 11:54, Greentime Hu wrote:
>> Hi, Mark:
>>
>> 2017-12-08 18:21 GMT+08:00 Mark Rutland :
>>> On Fri, Dec 08, 2017 at 05:12:00PM +0800, Greentime Hu wrote:
>>>> From: Greentime Hu
>>>>
>>>> This patch adds VDSO support. The VDSO code is currently used for
>>>> sys_rt_sigreturn() and optimised gettimeofday() (using the SoC timer
>>>> counter).
>>>
>>> [...]
>>>
>>>> +static int grab_timer_node_info(void)
>>>> +{
>>>> + struct device_node *timer_node;
>>>> +
>>>> + timer_node = of_find_node_by_name(NULL, "timer");
>>>
>>> Please use a compatible string, rather than matching the timer by name.
>>>
>>> It's plausible that you have multiple nodes called "timer" in the DT,
>>> under different parent nodes, and this might not be the device you
>>> think it is. I see your dt in patch 24 has two timer nodes.
>>>
>>> It would be best if your clocksource driver exposed some struct that you
>>> looked at here, so that you're guaranteed to use the same device.
>>
>> We'd like to use "timer" here because there are 2 different timer IPs
>> and we are sure that they won't be in the same SoC.
>> We think this implementation in VDSO should be platform independent to
>> get cycle-count register.
>> Our customer or other SoC provider who can use "timer" and define
>> cycle-count-offset or cycle-count-down then we can get the correct
>> cycle-count.
>>
>> We sent atcpit100 patch last time along with our arch, however we'd
>> like to send it to its sub system this time and my colleague is still
>> working on it.
>> He may send the timer patch next week.
>>
>>
>>>> + of_property_read_u32(timer_node, "cycle-count-offset",
>>>> +  &vdso_data->cycle_count_offset);
>>>> + vdso_data->cycle_count_down =
>>>> + of_property_read_bool(timer_node, "cycle-count-down");
>>>
>>> ... and then you'd only need to parse these in one place, too.
>>>
>>> IIUC these are properties for the atcpit device, which has no
>>> documentation or driver in this series.
>>>
>>> So I'm rather confused as to what's going on here.
>>>
>>
>> These properties are defined in dts which can provide the cycle count
>> register offset address of that timer, so that we can get cycle-count.
>>
>>>> + return of_address_to_resource(timer_node, 0, &timer_res);
>>>> +}
>>>
>>>> +int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>>>> +{
>>>
>>>> + /*Map timer to user space */
>>>> + vdso_base += PAGE_SIZE;
>>>> + prot = __pgprot(_PAGE_V | _PAGE_M_UR_KR | _PAGE_D |
>>>> + _PAGE_G | _PAGE_C_DEV);
>>>> + ret = io_remap_pfn_range(vma, vdso_base, timer_res.start >> PAGE_SHIFT,
>>>> +  PAGE_SIZE, prot);
>>>> + if (ret)
>>>> + goto up_fail;
>>>
>>> Maybe this is fine, but it looks a bit suspicious.
>>>
>>> Is it safe to map IO memory to a userspace process like this?
>>>
>>> In general that isn't safe, since userspace could access other registers
>>> (if those exist), perform accesses that change the state of hardware, or
>>> make unsupported access types (e.g. unaligned, atomic) that result in
>>> errors the kernel can't handle.
>>>
>>> Does none of that apply here?
>>
>> We only provide read permission to this page so hardware state won't
>> be changed. It will trigger exception if we try to write.
>> We will check about the alignment/atomic issue of this region.
>
> It still feels a bit odd. A hostile userspace could potentially find out
> about what the kernel is doing. For example, if the deadline of the next
> timer is accessible by reading that page, userspace could infer a lot of
> things that we'd normally want to keep hidden. Not knowing this HW, I
> cannot answer that question, but maybe you can.
>
> Another question: MMIO accesses can be quite slow. How much do you gain
> by having a vdso compared to executing a system call?
>

I think the rest of the timer registers should be fine to read.
Anyway, we will discuss the security issue.

Based on our previous experiments, the vDSO reduces the cycle count for
executing gettimeofday() by 4,519,021 (47%):

with vDSO : without vDSO (using syscall) = 5,091,342 : 9,610,363

The cycle counts were obtained with the CPU performance monitor.
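
The arithmetic behind the figures above can be checked with a short sketch.
The helper names cycles_saved() and percent_saved() are hypothetical and
exist only for this illustration; the two input cycle counts are the ones
reported in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles saved by the faster path relative to the slower one. */
static uint64_t cycles_saved(uint64_t slow, uint64_t fast)
{
    assert(slow >= fast);
    return slow - fast;
}

/* Saving as a whole-number percentage of the slower path, rounded down. */
static unsigned int percent_saved(uint64_t slow, uint64_t fast)
{
    assert(slow > 0);
    return (unsigned int)(cycles_saved(slow, fast) * 100 / slow);
}
```

Calling these with slow = 9610363 (syscall) and fast = 5091342 (vDSO) gives
a saving of 4519021 cycles, which is 47% of the syscall cost.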

Thanks.


Re: [PATCH v3 17/33] nds32: VDSO support

2017-12-08 Thread Marc Zyngier
On 08/12/17 11:54, Greentime Hu wrote:
> Hi, Mark:
> 
> 2017-12-08 18:21 GMT+08:00 Mark Rutland :
>> On Fri, Dec 08, 2017 at 05:12:00PM +0800, Greentime Hu wrote:
>>> From: Greentime Hu 
>>>
>>> This patch adds VDSO support. The VDSO code is currently used for
>>> sys_rt_sigreturn() and optimised gettimeofday() (using the SoC timer 
>>> counter).
>>
>> [...]
>>
>>> +static int grab_timer_node_info(void)
>>> +{
>>> + struct device_node *timer_node;
>>> +
>>> + timer_node = of_find_node_by_name(NULL, "timer");
>>
>> Please use a compatible string, rather than matching the timer by name.
>>
>> It's plausible that you have multiple nodes called "timer" in the DT,
>> under different parent nodes, and this might not be the device you
>> think it is. I see your dt in patch 24 has two timer nodes.
>>
>> It would be best if your clocksource driver exposed some struct that you
>> looked at here, so that you're guaranteed to use the same device.
> 
> We'd like to use "timer" here because there are 2 different timer IPs
> and we are sure that they won't be in the same SoC.
> We think this implementation in VDSO should be platform independent to
> get cycle-count register.
> Our customer or other SoC provider who can use "timer" and define
> cycle-count-offset or cycle-count-down then we can get the correct
> cycle-count.
> 
> We sent atcpit100 patch last time along with our arch, however we'd
> like to send it to its sub system this time and my colleague is still
> working on it.
> He may send the timer patch next week.
> 
> 
>>> + of_property_read_u32(timer_node, "cycle-count-offset",
>>> +  &vdso_data->cycle_count_offset);
>>> + vdso_data->cycle_count_down =
>>> + of_property_read_bool(timer_node, "cycle-count-down");
>>
>> ... and then you'd only need to parse these in one place, too.
>>
>> IIUC these are properties for the atcpit device, which has no
>> documentation or driver in this series.
>>
>> So I'm rather confused as to what's going on here.
>>
> 
> These properties are defined in dts which can provide the cycle count
> register offset address of that timer, so that we can get cycle-count.
> 
>>> + return of_address_to_resource(timer_node, 0, &timer_res);
>>> +}
>>
>>> +int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>>> +{
>>
>>> + /*Map timer to user space */
>>> + vdso_base += PAGE_SIZE;
>>> + prot = __pgprot(_PAGE_V | _PAGE_M_UR_KR | _PAGE_D |
>>> + _PAGE_G | _PAGE_C_DEV);
>>> + ret = io_remap_pfn_range(vma, vdso_base, timer_res.start >> PAGE_SHIFT,
>>> +  PAGE_SIZE, prot);
>>> + if (ret)
>>> + goto up_fail;
>>
>> Maybe this is fine, but it looks a bit suspicious.
>>
>> Is it safe to map IO memory to a userspace process like this?
>>
>> In general that isn't safe, since userspace could access other registers
>> (if those exist), perform accesses that change the state of hardware, or
>> make unsupported access types (e.g. unaligned, atomic) that result in
>> errors the kernel can't handle.
>>
>> Does none of that apply here?
> 
> We only provide read permission to this page so hardware state won't
> be changed. It will trigger exception if we try to write.
> We will check about the alignment/atomic issue of this region.

It still feels a bit odd. A hostile userspace could potentially find out
about what the kernel is doing. For example, if the deadline of the next
timer is accessible by reading that page, userspace could infer a lot of
things that we'd normally want to keep hidden. Not knowing this HW, I
cannot answer that question, but maybe you can.

Another question: MMIO accesses can be quite slow. How much do you gain
by having a vdso compared to executing a system call?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH v3 17/33] nds32: VDSO support

2017-12-08 Thread Mark Rutland
On Fri, Dec 08, 2017 at 07:54:42PM +0800, Greentime Hu wrote:
> 2017-12-08 18:21 GMT+08:00 Mark Rutland :
> > On Fri, Dec 08, 2017 at 05:12:00PM +0800, Greentime Hu wrote:
> >> +static int grab_timer_node_info(void)
> >> +{
> >> + struct device_node *timer_node;
> >> +
> >> + timer_node = of_find_node_by_name(NULL, "timer");
> >
> > Please use a compatible string, rather than matching the timer by name.
> >
> > It's plausible that you have multiple nodes called "timer" in the DT,
> > under different parent nodes, and this might not be the device you
> > think it is. I see your dt in patch 24 has two timer nodes.
> >
> > It would be best if your clocksource driver exposed some struct that you
> > looked at here, so that you're guaranteed to use the same device.
> 
> We'd like to use "timer" here because there are 2 different timer IPs
> and we are sure that they won't be in the same SoC.
> We think this implementation in VDSO should be platform independent to
> get cycle-count register.
> Our customer or other SoC provider who can use "timer" and define
> cycle-count-offset or cycle-count-down then we can get the correct
> cycle-count.

This is not the right way to do things.

So from a DT perspective, NAK. 

You should not add properties to arbitrary DT bindings to handle a Linux
implementation detail.

Please remove this DT code, and have the drivers for those timer blocks
export this information to your vdso code somehow.

> We sent atcpit100 patch last time along with our arch, however we'd
> like to send it to its sub system this time and my colleague is still
> working on it.
> He may send the timer patch next week.

I think that it would make sense for that patch to be part of the arch
port, especially given that (AFAICT) there is no driver for the other
timer IP that you mention.

[...]

> >> +int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
> >> +{
> >
> >> + /*Map timer to user space */
> >> + vdso_base += PAGE_SIZE;
> >> + prot = __pgprot(_PAGE_V | _PAGE_M_UR_KR | _PAGE_D |
> >> + _PAGE_G | _PAGE_C_DEV);
> >> + ret = io_remap_pfn_range(vma, vdso_base, timer_res.start >> PAGE_SHIFT,
> >> +  PAGE_SIZE, prot);
> >> + if (ret)
> >> + goto up_fail;
> >
> > Maybe this is fine, but it looks a bit suspicious.
> >
> > Is it safe to map IO memory to a userspace process like this?
> >
> > In general that isn't safe, since userspace could access other registers
> > (if those exist), perform accesses that change the state of hardware, or
> > make unsupported access types (e.g. unaligned, atomic) that result in
> > errors the kernel can't handle.
> >
> > Does none of that apply here?
> 
> We only provide read permission to this page so hardware state won't
> be changed. It will trigger exception if we try to write.
> We will check about the alignment/atomic issue of this region.

Ok, thanks.

This is another reason to only do this for devices/drivers that we have
drivers for, since we can't know that this is safe in general.

Thanks,
Mark.


Re: [PATCH v3 17/33] nds32: VDSO support

2017-12-08 Thread Greentime Hu
Hi, Mark:

2017-12-08 18:21 GMT+08:00 Mark Rutland :
> On Fri, Dec 08, 2017 at 05:12:00PM +0800, Greentime Hu wrote:
>> From: Greentime Hu 
>>
>> This patch adds VDSO support. The VDSO code is currently used for
>> sys_rt_sigreturn() and optimised gettimeofday() (using the SoC timer 
>> counter).
>
> [...]
>
>> +static int grab_timer_node_info(void)
>> +{
>> + struct device_node *timer_node;
>> +
>> + timer_node = of_find_node_by_name(NULL, "timer");
>
> Please use a compatible string, rather than matching the timer by name.
>
> It's plausible that you have multiple nodes called "timer" in the DT,
> under different parent nodes, and this might not be the device you
> think it is. I see your dt in patch 24 has two timer nodes.
>
> It would be best if your clocksource driver exposed some struct that you
> looked at here, so that you're guaranteed to use the same device.

We'd like to use "timer" here because there are 2 different timer IPs
and we are sure that they won't be in the same SoC.
We think this implementation in VDSO should be platform independent to
get cycle-count register.
Our customer or other SoC provider who can use "timer" and define
cycle-count-offset or cycle-count-down then we can get the correct
cycle-count.

We sent atcpit100 patch last time along with our arch, however we'd
like to send it to its sub system this time and my colleague is still
working on it.
He may send the timer patch next week.


>> + of_property_read_u32(timer_node, "cycle-count-offset",
>> +  &vdso_data->cycle_count_offset);
>> + vdso_data->cycle_count_down =
>> + of_property_read_bool(timer_node, "cycle-count-down");
>
> ... and then you'd only need to parse these in one place, too.
>
> IIUC these are properties for the atcpit device, which has no
> documentation or driver in this series.
>
> So I'm rather confused as to what's going on here.
>

These properties are defined in dts which can provide the cycle count
register offset address of that timer, so that we can get cycle-count.
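
As an illustration of how the two properties could be consumed, here is a
user-space sketch under stated assumptions. The function name
read_cycle_count() is hypothetical, a plain array stands in for the mapped
timer page, and inverting a down-counting value with ~ is one plausible way
to obtain an increasing count, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 32-bit counter at `offset` bytes into a register block and return
 * a value that increases over time whichever way the hardware counts.
 * `regs` stands in for the timer page; real code would read mapped MMIO.
 */
static uint32_t read_cycle_count(const volatile uint32_t *regs,
                                 uint32_t offset, bool counts_down)
{
    uint32_t raw;

    assert(regs != NULL);
    assert(offset % sizeof(uint32_t) == 0);    /* aligned accesses only */

    raw = regs[offset / sizeof(uint32_t)];
    /* A down counter shrinks over time, so its bitwise inverse grows. */
    return counts_down ? ~raw : raw;
}
```

With this normalisation the time calculation that follows can treat both
kinds of timer IP the same way.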

>> + return of_address_to_resource(timer_node, 0, &timer_res);
>> +}
>
>> +int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>> +{
>
>> + /*Map timer to user space */
>> + vdso_base += PAGE_SIZE;
>> + prot = __pgprot(_PAGE_V | _PAGE_M_UR_KR | _PAGE_D |
>> + _PAGE_G | _PAGE_C_DEV);
>> + ret = io_remap_pfn_range(vma, vdso_base, timer_res.start >> PAGE_SHIFT,
>> +  PAGE_SIZE, prot);
>> + if (ret)
>> + goto up_fail;
>
> Maybe this is fine, but it looks a bit suspicious.
>
> Is it safe to map IO memory to a userspace process like this?
>
> In general that isn't safe, since userspace could access other registers
> (if those exist), perform accesses that change the state of hardware, or
> make unsupported access types (e.g. unaligned, atomic) that result in
> errors the kernel can't handle.
>
> Does none of that apply here?

We only provide read permission to this page so hardware state won't
be changed. It will trigger exception if we try to write.
We will check about the alignment/atomic issue of this region.

Thanks.


Re: [PATCH v3 17/33] nds32: VDSO support

2017-12-08 Thread Mark Rutland
On Fri, Dec 08, 2017 at 05:12:00PM +0800, Greentime Hu wrote:
> From: Greentime Hu 
> 
> This patch adds VDSO support. The VDSO code is currently used for
> sys_rt_sigreturn() and optimised gettimeofday() (using the SoC timer counter).

[...]

> +static int grab_timer_node_info(void)
> +{
> + struct device_node *timer_node;
> +
> + timer_node = of_find_node_by_name(NULL, "timer");

Please use a compatible string, rather than matching the timer by name.

It's plausible that you have multiple nodes called "timer" in the DT,
under different parent nodes, and this might not be the device you
think it is. I see your dt in patch 24 has two timer nodes.

It would be best if your clocksource driver exposed some struct that you
looked at here, so that you're guaranteed to use the same device.
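
To make the ambiguity concrete, here is a toy user-space model, not kernel
code. struct toy_node and both lookup helpers are hypothetical stand-ins
for the device tree and for lookups by name and by compatible string; any
compatible strings passed in are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a device tree node; not the kernel's device_node. */
struct toy_node {
    const char *name;
    const char *compatible;
};

/* Return the first node whose name matches, as a by-name lookup would. */
static const struct toy_node *find_by_name(const struct toy_node *nodes,
                                           size_t count, const char *name)
{
    assert(nodes != NULL && name != NULL);
    for (size_t i = 0; i < count; i++)
        if (strcmp(nodes[i].name, name) == 0)
            return &nodes[i];
    return NULL;
}

/* Return the first node whose compatible string matches. */
static const struct toy_node *find_by_compatible(const struct toy_node *nodes,
                                                 size_t count,
                                                 const char *compat)
{
    assert(nodes != NULL && compat != NULL);
    for (size_t i = 0; i < count; i++)
        if (strcmp(nodes[i].compatible, compat) == 0)
            return &nodes[i];
    return NULL;
}
```

Given two nodes that are both named "timer" but carry different compatible
strings, find_by_name() returns whichever comes first, while
find_by_compatible() returns the intended device or NULL if it is absent.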

> + of_property_read_u32(timer_node, "cycle-count-offset",
> +  &vdso_data->cycle_count_offset);
> + vdso_data->cycle_count_down =
> + of_property_read_bool(timer_node, "cycle-count-down");

... and then you'd only need to parse these in one place, too.

IIUC these are properties for the atcpit device, which has no
documentation or driver in this series.

So I'm rather confused as to what's going on here.

> + return of_address_to_resource(timer_node, 0, &timer_res);
> +}

> +int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
> +{

> + /*Map timer to user space */
> + vdso_base += PAGE_SIZE;
> + prot = __pgprot(_PAGE_V | _PAGE_M_UR_KR | _PAGE_D |
> + _PAGE_G | _PAGE_C_DEV);
> + ret = io_remap_pfn_range(vma, vdso_base, timer_res.start >> PAGE_SHIFT,
> +  PAGE_SIZE, prot);
> + if (ret)
> + goto up_fail;

Maybe this is fine, but it looks a bit suspicious.

Is it safe to map IO memory to a userspace process like this?

In general that isn't safe, since userspace could access other registers
(if those exist), perform accesses that change the state of hardware, or
make unsupported access types (e.g. unaligned, atomic) that result in
errors the kernel can't handle.

Does none of that apply here?

Thanks,
Mark.