Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Pratyush Anand



On Wednesday 14 December 2016 07:14 PM, Mark Rutland wrote:

>> Even for the non-kdump ie `kexec -l` case we do not have a
>> functionality to bypass sha verification in kexec-tools. --lite
>> option with the kexec-tools was discouraged and not accepted.
>
> Ok. Do you have a pointer to the thread regarding that, for context?



https://lists.ozlabs.org/pipermail/petitboot/2015-October/000141.html
https://lists.ozlabs.org/pipermail/petitboot/2015-October/000136.html

~Pratyush


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Mark Rutland
Hi,

On Wed, Dec 14, 2016 at 05:51:05PM +0530, Pratyush Anand wrote:
> 
> On Wednesday 14 December 2016 05:07 PM, Mark Rutland wrote:
> >I see in an earlier message that the need for sha256 was being discussed
> >in another thread. Do either of you happen to have a pointer to that.
> 
> patch 0/2 of this series.

AFAICT, that just says that the existing sha256 check is slow, not *why*
a sha256 check of some description is necessary. I'm still at a loss as
to why it is considered necessary, rather than being a debugging aid or
sanity check.

> >To me, it seems like it doesn't come with much benefit for the kdump
> >case given that's best-effort anyway, and as above the verification code
> >could have been be corrupted. In the non-kdump case it's not strictly
> >necessary and seems like a debugging aid rather than a necessary piece
> >of functionality -- if that's the case, a 20 second delay isn't the end
> >of the world...
> 
> Even for the non-kdump ie `kexec -l` case we do not have a
> functionality to bypass sha verification in kexec-tools. --lite
> option with the kexec-tools was discouraged and not accepted.

Ok. Do you have a pointer to the thread regarding that, for context?

> So,it is 20s for both `kexec -l` and `kexec -p`.

Well, unless we can have a --{no-,}sha-check, and make the default NO
for arm64.

> Also other arch like x86_64 takes negligible time in sha verification.

That's certainly an argument for not changing the other architectures,
but given it's slow for arm64, we could have a different default...

Thanks,
Mark.



Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Pratyush Anand



On Wednesday 14 December 2016 05:07 PM, Mark Rutland wrote:

> On Wed, Dec 14, 2016 at 11:16:17AM +, James Morse wrote:
>> Hi Pratyush,
>>
>> On 14/12/16 10:12, Pratyush Anand wrote:
>>> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>>> and only if
>>>>
>>>> Ok..So you mean that I create a new section which will have page table
>>>> entries mapping physicalmemory represented by remaining section, and
>>>> then purgatory can just enable mmu with page table from that section,
>>>> right? Seems doable. can do that.
>>>
>>> I see a problem here. If we create  page table as a new segment then, how can we
>>> verify in purgatory that sha for page table is correct? We need page table
>>> before sha verification start,and we can not rely the page table created by
>>> first kernel until it's sha is verified. So a chicken-egg problem.
>>
>> There is more than one of those! What happens if your sha256 calculation code is
>> corrupted? You have to run it before you know. The same goes for all the
>> purgatory code.
>>
>> This is why I think its better to do this in the kernel before we exit to
>> purgatory, but obviously that doesn't work for kdump.
>
> I see in an earlier message that the need for sha256 was being discussed
> in another thread. Do either of you happen to have a pointer to that.

patch 0/2 of this series.

> To me, it seems like it doesn't come with much benefit for the kdump
> case given that's best-effort anyway, and as above the verification code
> could have been be corrupted. In the non-kdump case it's not strictly
> necessary and seems like a debugging aid rather than a necessary piece
> of functionality -- if that's the case, a 20 second delay isn't the end
> of the world...



Even for the non-kdump case, i.e. `kexec -l`, we do not have any
functionality to bypass sha verification in kexec-tools. The --lite
option for kexec-tools was discouraged and not accepted. So, it is 20s
for both `kexec -l` and `kexec -p`.

Also, other architectures such as x86_64 take negligible time for sha
verification.

~Pratyush



Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread James Morse
Hi Mark,

On 14/12/16 11:37, Mark Rutland wrote:
> On Wed, Dec 14, 2016 at 11:16:17AM +, James Morse wrote:
>> On 14/12/16 10:12, Pratyush Anand wrote:
>>> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>>> and only if
>>>>
>>>> Ok..So you mean that I create a new section which will have page table
>>>> entries mapping physicalmemory represented by remaining section, and
>>>> then purgatory can just enable mmu with page table from that section,
>>>> right? Seems doable. can do that.
>>>
>>> I see a problem here. If we create  page table as a new segment then, how can we
>>> verify in purgatory that sha for page table is correct? We need page table
>>> before sha verification start,and we can not rely the page table created by
>>> first kernel until it's sha is verified. So a chicken-egg problem.
>>
>> There is more than one of those! What happens if your sha256 calculation code is
>> corrupted? You have to run it before you know. The same goes for all the
>> purgatory code.
>>
>> This is why I think its better to do this in the kernel before we exit to
>> purgatory, but obviously that doesn't work for kdump.
> 
> I see in an earlier message that the need for sha256 was being discussed
> in another thread. Do either of you happen to have a pointer to that.

https://www.spinics.net/lists/arm-kernel/msg544472.html


Thanks,

James



Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Pratyush Anand

Hi James,

Thanks for your input !!

On Wednesday 14 December 2016 04:46 PM, James Morse wrote:

> Hi Pratyush,
>
> On 14/12/16 10:12, Pratyush Anand wrote:
>> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>> and only if
>>>
>>> Ok..So you mean that I create a new section which will have page table
>>> entries mapping physicalmemory represented by remaining section, and
>>> then purgatory can just enable mmu with page table from that section,
>>> right? Seems doable. can do that.
>>
>> I see a problem here. If we create  page table as a new segment then, how can we
>> verify in purgatory that sha for page table is correct? We need page table
>> before sha verification start,and we can not rely the page table created by
>> first kernel until it's sha is verified. So a chicken-egg problem.
>
> There is more than one of those! What happens if your sha256 calculation code is
> corrupted? You have to run it before you know. The same goes for all the
> purgatory code.

OK, seems reasonable... will do it in kexec code.

> This is why I think its better to do this in the kernel before we exit to
> purgatory, but obviously that doesn't work for kdump.
>
>> I think, creating page table will just take fraction of second and should be
>> good even in purgatory, What do you say?
>
> If it's for kdump its best-effort. I think its easier/simpler to generate and
> debug them at 'kexec -l' time, but if you're worried about the increased area
> that could be corrupted then do it in purgatory.



~Pratyush



Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Mark Rutland
On Wed, Dec 14, 2016 at 11:16:17AM +, James Morse wrote:
> Hi Pratyush,
> 
> On 14/12/16 10:12, Pratyush Anand wrote:
> > On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
> >>> I would go as far as to generate the page tables at 'kexec -l' time,
> >>> and only if
> >>
> >> Ok..So you mean that I create a new section which will have page table
> >> entries mapping physicalmemory represented by remaining section, and
> >> then purgatory can just enable mmu with page table from that section,
> >> right? Seems doable. can do that.
> > 
> > I see a problem here. If we create  page table as a new segment then, how can we
> > verify in purgatory that sha for page table is correct? We need page table
> > before sha verification start,and we can not rely the page table created by
> > first kernel until it's sha is verified. So a chicken-egg problem.
> 
> There is more than one of those! What happens if your sha256 calculation code is
> corrupted? You have to run it before you know. The same goes for all the
> purgatory code.
> 
> This is why I think its better to do this in the kernel before we exit to
> purgatory, but obviously that doesn't work for kdump.

I see in an earlier message that the need for sha256 was being discussed
in another thread. Do either of you happen to have a pointer to that?

To me, it seems like it doesn't come with much benefit for the kdump
case given that's best-effort anyway, and as above the verification code
could have been corrupted. In the non-kdump case it's not strictly
necessary and seems like a debugging aid rather than a necessary piece
of functionality -- if that's the case, a 20 second delay isn't the end
of the world...

Thanks,
Mark.



Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Mark Rutland
On Wed, Dec 14, 2016 at 11:16:07AM +, James Morse wrote:
> Hi Pratyush,
> On 14/12/16 09:38, Pratyush Anand wrote:
> > On Saturday 26 November 2016 12:00 AM, James Morse wrote:
> >> On 22/11/16 04:32, Pratyush Anand wrote:
> >>> +/*
> >>> + *disable_dcache: Disable D-cache and flush RAM locations
> >>> + *ram_start - Start address of RAM
> >>> + *ram_end - End address of RAM
> >>> + */
> >>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
> >>> +{
> >>> +switch(get_current_el()) {
> >>> +case 2:
> >>> +reset_sctlr_el2();
> >>> +break;
> >>> +case 1:
> >>> +reset_sctlr_el1();
> >>
> >> You have C code running between disabling the MMU and cleaning the cache. The
> >> compiler is allowed to move data on and off the stack in here, but after
> >> disabling the MMU it will see whatever was on the stack before we turned the MMU
> >> on. Any data written at the beginning of this function is left in the caches.
> >>
> >> I'm afraid this sort of stuff needs to be done in assembly!
> > 
> > All these routines are self coded in assembly even though they are called
> > from C, so should be safe I think. Anyway, I can keep all of them in
> > assembly as well.
> 
> You can't tell the compiler that the stack data is inaccessible until the dcache
> clean call completes. Some future version may do really crazy things in here.
> You can decompile what your compiler version produces to check it doesn't
> load/store to the stack, but that doesn't mean my compiler version does the
> same. This is the kind of thing that is extremely difficult to debug, its best
> not to take the risk.

FWIW, I completely agree.

We've been bitten in the past; see commit 5e051531447259e5 ("arm64:
convert part of soft_restart() to assembly") for an example.

Thanks,
Mark.



Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread James Morse
Hi Pratyush,

On 14/12/16 09:38, Pratyush Anand wrote:
> On Saturday 26 November 2016 12:00 AM, James Morse wrote:
>> On 22/11/16 04:32, Pratyush Anand wrote:
>>> This patch adds support to enable/disable d-cache, which can be used for
>>> faster purgatory sha256 verification.
>>
>> (I'm not clear why we want the sha256, but that is being discussed elsewhere on
>>  the thread)
>>
>>
>>> We are supporting only 4K and 64K page sizes. This code will not work if a
>>> hardware is not supporting at least one of these page sizes.  Therefore,
>>> D-cache is disabled by default and enabled only when "enable-dcache" is
>>> passed to the kexec().
>>
>> I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
>> a lot simpler to only support one page size, which should be 4K as that is what
>> UEFI requires. (If there are CPUs that only support one size, I bet its 4K!)
> 
> Ok.. So, I will implement a new version after considering that 4K will always be
> supported. If 4K is not supported by hw(which is very unlikely) then there would
> be no d-cache enabling feature.

Sounds good to me. I think it's important to keep the purgatory code as small
and as simple as possible, as it's very hard to debug. If we do get bug reports
they are likely to be 'it did nothing', with no further details. If it only
fails on some platform we don't have access to, it's basically impossible to
debug.


>> I would go as far as to generate the page tables at 'kexec -l' time, and only if
> 
> Ok..So you mean that I create a new section which will have page table entries
> mapping physicalmemory represented by remaining section, and then purgatory can
> just enable mmu with page table from that section, right? Seems doable. can do
> that.
> 
>> '/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
>> support 4K pages). This would keep the purgatory code as simple as possible.
> 
> What about reading ID_AA64MMFR0_EL1 instead of /sys/firmware/efi? That can also
> tell us that whether 4K is supported or not?

If you're doing it at EL1/EL2 in the purgatory code, sure. But if you generate
the page tables at 'kexec -l' time you can't read this register from EL0 so you
need another way to guess if 4K pages are supported (or just assume they are and
test that register once you're in purgatory).
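
For the in-purgatory test of that register, a minimal C sketch of the decode
step might look like the following. It assumes the caller has already read
ID_AA64MMFR0_EL1 (for example with an mrs instruction at EL1/EL2). The
function and macro names are hypothetical and are not part of the patch. The
field position (TGran4, bits [31:28]) and the 0xf "not implemented" encoding
come from the architectural definition of the register; treating 0x0..0x7 as
"supported" is an assumption based on reading the field as a signed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; TGran4 occupies bits [31:28] of ID_AA64MMFR0_EL1. */
#define MMFR0_TGRAN4_SHIFT	28
#define MMFR0_TGRAN4_MASK	0xfULL

/*
 * Decode a previously read ID_AA64MMFR0_EL1 value.
 * 0x0 means the 4K granule is implemented, 0xf means it is not.
 * Assumption: the field is signed, so 0x0..0x7 all count as supported.
 */
bool mmfr0_supports_4k(uint64_t mmfr0)
{
	unsigned int tgran4 = (mmfr0 >> MMFR0_TGRAN4_SHIFT) & MMFR0_TGRAN4_MASK;

	assert(tgran4 <= 0xf);
	return tgran4 <= 0x7;
}
```

Keeping the decode separate from the register read means the same helper can
be unit-tested on the host, where the register itself is not readable from EL0.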

I was looking for some way to print a message at 'kexec -l' time that the sha256
would be slow as 4K wasn't supported. (a message printed at any other time won't
get seen).


>>> +/*
>>> + *disable_dcache: Disable D-cache and flush RAM locations
>>> + *ram_start - Start address of RAM
>>> + *ram_end - End address of RAM
>>> + */
>>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
>>> +{
>>> +switch(get_current_el()) {
>>> +case 2:
>>> +reset_sctlr_el2();
>>> +break;
>>> +case 1:
>>> +reset_sctlr_el1();
>>
>> You have C code running between disabling the MMU and cleaning the cache. The
>> compiler is allowed to move data on and off the stack in here, but after
>> disabling the MMU it will see whatever was on the stack before we turned the MMU
>> on. Any data written at the beginning of this function is left in the caches.
>>
>> I'm afraid this sort of stuff needs to be done in assembly!
> 
> All these routines are self coded in assembly even though they are called
> from C, so should be safe I think. Anyway, I can keep all of them in
> assembly as well.

You can't tell the compiler that the stack data is inaccessible until the dcache
clean call completes. Some future version may do really crazy things in here.
You can decompile what your compiler version produces to check it doesn't
load/store to the stack, but that doesn't mean my compiler version does the
same. This is the kind of thing that is extremely difficult to debug; it's best
not to take the risk.


Thanks,

James




Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread James Morse
Hi Pratyush,

On 14/12/16 10:12, Pratyush Anand wrote:
> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>> and only if
>>
>> Ok..So you mean that I create a new section which will have page table
>> entries mapping physicalmemory represented by remaining section, and
>> then purgatory can just enable mmu with page table from that section,
>> right? Seems doable. can do that.
> 
> I see a problem here. If we create  page table as a new segment then, how can we
> verify in purgatory that sha for page table is correct? We need page table
> before sha verification start,and we can not rely the page table created by
> first kernel until it's sha is verified. So a chicken-egg problem.

There is more than one of those! What happens if your sha256 calculation code is
corrupted? You have to run it before you know. The same goes for all the
purgatory code.

This is why I think it's better to do this in the kernel before we exit to
purgatory, but obviously that doesn't work for kdump.


> I think, creating page table will just take fraction of second and should be
> good even in purgatory, What do you say?

If it's for kdump, it's best-effort. I think it's easier/simpler to generate and
debug them at 'kexec -l' time, but if you're worried about the increased area
that could be corrupted then do it in purgatory.


Thanks,

James




Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Pratyush Anand



On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:




>> I would go as far as to generate the page tables at 'kexec -l' time,
>> and only if
>
> Ok..So you mean that I create a new section which will have page table
> entries mapping physicalmemory represented by remaining section, and
> then purgatory can just enable mmu with page table from that section,
> right? Seems doable. can do that.

I see a problem here. If we create the page table as a new segment, then
how can we verify in purgatory that the sha for the page table is correct?
We need the page table before sha verification starts, and we cannot rely
on the page table created by the first kernel until its sha is verified.
So it is a chicken-and-egg problem.

I think creating the page table will take just a fraction of a second and
should be fine even in purgatory. What do you say?


~Pratyush



Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-12-14 Thread Pratyush Anand

Hi James,

Thanks a lot for your review. It's helpful.

On Saturday 26 November 2016 12:00 AM, James Morse wrote:

> Hi Pratyush,
>
> (CC: Mark, mismatched memory attributes in paragraph 3?)
>
> On 22/11/16 04:32, Pratyush Anand wrote:
>> This patch adds support to enable/disable d-cache, which can be used for
>> faster purgatory sha256 verification.
>
> (I'm not clear why we want the sha256, but that is being discussed elsewhere on
>  the thread)
>
>> We are supporting only 4K and 64K page sizes. This code will not work if a
>> hardware is not supporting at least one of these page sizes.  Therefore,
>> D-cache is disabled by default and enabled only when "enable-dcache" is
>> passed to the kexec().
>
> I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
> a lot simpler to only support one page size, which should be 4K as that is what
> UEFI requires. (If there are CPUs that only support one size, I bet its 4K!)


Ok. So, I will implement a new version on the assumption that 4K will
always be supported. If 4K is not supported by the hw (which is very
unlikely), then there would be no d-cache enabling feature.




> I would go as far as to generate the page tables at 'kexec -l' time, and only if


Ok..So you mean that I create a new section which will have page table 
entries mapping physicalmemory represented by remaining section, and 
then purgatory can just enable mmu with page table from that section, 
right? Seems doable. can do that.



> '/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
> support 4K pages). This would keep the purgatory code as simple as possible.


What about reading ID_AA64MMFR0_EL1 instead of /sys/firmware/efi? That
can also tell us whether 4K is supported or not.




> I don't think the performance difference between 4K and 64K page sizes will be
> measurable, is purgatory really performance sensitive code?


I agree, implementing only 4K will make it very simple.





>> Since this is an identity mapped system, so VA_BITS will be same as max PA
>> bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only one
>> level of page table will be there with block descriptor entries.
>> Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will have
>> only table entries pointing to a level 1 lookup. Level 1 will have only
>> block entries which will map 1GB block. For 64K mapping, TTBR points to
>> level 1 lookups, which will have only table entries pointing to a level 2
>> lookup. Level 2 will have only block entries which will map 512MB block. If
>
> This is more complexity to pick a VA size. Why not always use the maximum 48bit
> VA? The cost is negligible compared to having simpler (easier to review!)
> purgatory code.
>
> By always using 1GB blocks you may be creating aliases with mismatched attributes:
> * If kdump only reserves 128MB, your 1GB mapping will alias whatever else was
>   in the same 1GB of address space. This could be a reserved region with some
>   other memory attributes.
> * With kdump, we may have failed to park the other CPUs if they are executing
>   with interrupts masked and haven't yet handled the smp_send_stop() IPI.
> * One of these other CPUs could be reading/writing in this area as it doesn't
>   belong to the kdump reserved area, just happens to be in the same 1GB.
>
> I need to dig through the ARM-ARM to find out what happens next, but I'm pretty
> sure this is well into the "don't do that" territory.
>
> It would be much better to force the memory areas to be a multiple of 2MB and
> 2MB aligned, which will allow you to use 2M section mappings for memory, (but
> not the uart). This way we only map regions we had reserved and know are memory.



OK. So, 48 bit VA, 4K page size, 3 level page table with entries in 3rd 
level representing 2M block size.
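
A minimal C sketch of the index arithmetic behind that layout, assuming the
usual 4K-granule geometry (12-bit page offset, 9 bits of index per level) so
that the walk uses levels 0, 1 and 2 and a level 2 block entry covers 2MB.
The macro and function names are hypothetical and are not taken from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry for a 4K granule with a 48-bit VA. */
#define PAGE_SHIFT	12	/* 4K pages */
#define BITS_PER_LEVEL	9	/* 512 entries per table */
#define L2_BLOCK_SHIFT	(PAGE_SHIFT + BITS_PER_LEVEL)	/* 21, i.e. 2MB */

/* Bit position where the table index for 'level' (0, 1 or 2) starts. */
unsigned int level_shift(unsigned int level)
{
	assert(level <= 2);
	return PAGE_SHIFT + BITS_PER_LEVEL * (3 - level);
}

/* Index into the table at 'level' for the identity-mapped address 'addr'. */
unsigned int table_index(uint64_t addr, unsigned int level)
{
	return (addr >> level_shift(level)) & ((1u << BITS_PER_LEVEL) - 1);
}
```

For example, an address of 0x80200000 would use entry 0 at level 0, entry 2
at level 1 and entry 1 at level 2, with the level 2 entry being the 2MB block
descriptor.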







>> UART base address and RAM addresses are not at least 1GB and 512MB apart
>> for 4K and 64K respectively, then mapping result could be unpredictable. In
>> that case we need to support one more level of granularity, but until
>> someone needs that keep it like this only.
>>
>> We can not allocate dynamic memory in purgatory. Therefore we keep page
>> table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points to
>> first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
>> points to table at next level (having block entries).  If index for RAM
>> area and UART area in first table is not same, then we will need another
>> next level table which will be located at (page_table + 2 * MAX_PAGE_SIZE).
>
>> diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
>> new file mode 100644
>> index ..bef97ef4
>> --- /dev/null
>> +++ b/purgatory/arch/arm64/cache-asm.S
>> @@ -0,0 +1,186 @@
>> +/*
>> + * Some of the routines have been copied from Linux Kernel, therefore
>> + * copying the license as well.
>> + *
>> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
>> + * Copyright (C) 2012 ARM Ltd.
>> + * Copyright (C) 2015 Pratyush Anand 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the 

Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-11-25 Thread James Morse
Hi Pratyush,

(CC: Mark, mismatched memory attributes in paragraph 3?)

On 22/11/16 04:32, Pratyush Anand wrote:
> This patch adds support to enable/disable d-cache, which can be used for
> faster purgatory sha256 verification.

(I'm not clear why we want the sha256, but that is being discussed elsewhere on
 the thread)


> We are supporting only 4K and 64K page sizes. This code will not work if a
> hardware is not supporting at least one of these page sizes.  Therefore,
> D-cache is disabled by default and enabled only when "enable-dcache" is
> passed to the kexec().

I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
a lot simpler to only support one page size, which should be 4K as that is what
UEFI requires. (If there are CPUs that only support one size, I bet it's 4K!)

I would go as far as to generate the page tables at 'kexec -l' time, and only if
'/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
support 4K pages). This would keep the purgatory code as simple as possible.
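
A minimal sketch of that 'kexec -l' time check, assuming it is enough to test
whether the directory exists. The helper name is hypothetical and is not an
existing kexec-tools function; the path is passed in as a parameter so the
check can be exercised on any system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Return true if 'efi_dir' (normally "/sys/firmware/efi") exists and is a
 * directory, which is taken here as a hint that the system booted via UEFI
 * and therefore supports 4K pages.
 */
bool booted_via_uefi(const char *efi_dir)
{
	struct stat st;

	assert(efi_dir != NULL);
	return stat(efi_dir, &st) == 0 && S_ISDIR(st.st_mode);
}
```

A caller would pass "/sys/firmware/efi" and, when the result is false, skip
generating the page tables (and could print a note that the sha256 check
will be slow).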

I don't think the performance difference between 4K and 64K page sizes will be
measurable, is purgatory really performance sensitive code?


> Since this is an identity mapped system, so VA_BITS will be same as max PA
> bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only one
> level of page table will be there with block descriptor entries.
> Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will have
> only table entries pointing to a level 1 lookup. Level 1 will have only
> block entries which will map 1GB block. For 64K mapping, TTBR points to
> level 1 lookups, which will have only table entries pointing to a level 2
> lookup. Level 2 will have only block entries which will map 512MB block. If

This is more complexity to pick a VA size. Why not always use the maximum 48bit
VA? The cost is negligible compared to having simpler (easier to review!)
purgatory code.

By always using 1GB blocks you may be creating aliases with mismatched attributes:
* If kdump only reserves 128MB, your 1GB mapping will alias whatever else was
  in the same 1GB of address space. This could be a reserved region with some
  other memory attributes.
* With kdump, we may have failed to park the other CPUs if they are executing
  with interrupts masked and haven't yet handled the smp_send_stop() IPI.
* One of these other CPUs could be reading/writing in this area as it doesn't
  belong to the kdump reserved area, just happens to be in the same 1GB.

I need to dig through the ARM-ARM to find out what happens next, but I'm pretty
sure this is well into the "don't do that" territory.


It would be much better to force the memory areas to be a multiple of 2MB and
2MB aligned, which will allow you to use 2M section mappings for memory, (but
not the uart). This way we only map regions we had reserved and know are memory.
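
To put a number on the aliasing concern, here is a small C sketch that
computes how much address space outside a reserved region ends up mapped for
a given block size. The helper names and the example addresses are
hypothetical; only the 128MB, 1GB and 2MB figures come from the discussion
above.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M	(UINT64_C(1) << 21)
#define SZ_1G	(UINT64_C(1) << 30)

/* Round 'addr' down to a multiple of 'block' (a power of two). */
uint64_t block_floor(uint64_t addr, uint64_t block)
{
	assert(block != 0 && (block & (block - 1)) == 0);
	return addr & ~(block - 1);
}

/* Round 'addr' up to a multiple of 'block' (a power of two). */
uint64_t block_ceil(uint64_t addr, uint64_t block)
{
	return block_floor(addr + block - 1, block);
}

/* Bytes covered by block mappings of [start, end) that lie outside it. */
uint64_t overmapped_bytes(uint64_t start, uint64_t end, uint64_t block)
{
	assert(start < end);
	return (block_ceil(end, block) - block_floor(start, block))
		- (end - start);
}
```

With a 128MB region at 0x88000000, a 1GB block mapping would also cover 896MB
of unrelated address space, while 2MB blocks over a 2MB-aligned region cover
nothing extra.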


> UART base address and RAM addresses are not at least 1GB and 512MB apart
> for 4K and 64K respectively, then mapping result could be unpredictable. In
> that case we need to support one more level of granularity, but until
> someone needs that keep it like this only.
>
> We can not allocate dynamic memory in purgatory. Therefore we keep page
> table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points to
> first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
> points to table at next level (having block entries).  If index for RAM
> area and UART area in first table is not same, then we will need another
> next level table which will be located at (page_table + 2 * MAX_PAGE_SIZE).


> diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
> new file mode 100644
> index ..bef97ef4
> --- /dev/null
> +++ b/purgatory/arch/arm64/cache-asm.S
> @@ -0,0 +1,186 @@
> +/*
> + * Some of the routines have been copied from Linux Kernel, therefore
> + * copying the license as well.
> + *
> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2015 Pratyush Anand 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see .
> + */
> +
> +#include "cache.h"
> +
> +/*
> + *   dcache_line_size - get the minimum D-cache line size from the CTR register.
> + */
> + .macro  dcache_line_size, reg, tmp
> + mrs \tmp, ctr_el0   // read CTR
> + ubfm \tmp, \tmp, #16, #19

[PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory

2016-11-21 Thread Pratyush Anand
This patch adds support to enable/disable d-cache, which can be used for
faster purgatory sha256 verification.

Only 4K and 64K page sizes are supported. This code will not work if the
hardware does not support at least one of these page sizes. Therefore,
D-cache is disabled by default and enabled only when "enable-dcache" is
passed to kexec.

Since this is an identity-mapped system, VA_BITS will be the same as the
maximum number of PA bits supported. If VA_BITS <= 42 for 64K or <= 39 for
4K, there will be only one level of page table, with block descriptor
entries. Otherwise, for 4K mapping, TTBR points to level 0 lookups, which
will have only table entries pointing to a level 1 lookup. Level 1 will have
only block entries, each mapping a 1GB block. For 64K mapping, TTBR points to
level 1 lookups, which will have only table entries pointing to a level 2
lookup. Level 2 will have only block entries, each mapping a 512MB block. If
the UART base address and the RAM addresses are not at least 1GB (for 4K) or
512MB (for 64K) apart, the mapping result could be unpredictable. In that
case we would need to support one more level of granularity, but until
someone needs that, keep it like this.

We cannot allocate dynamic memory in purgatory. Therefore the page table
allocation size is fixed at (3 * MAX_PAGE_SIZE). (page_table) points to the
first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
points to the table at the next level (having block entries). If the index
for the RAM area and the UART area in the first table is not the same, then
we will need another next-level table, which will be located at
(page_table + 2 * MAX_PAGE_SIZE).

Signed-off-by: Pratyush Anand 
---
 purgatory/arch/arm64/Makefile|   2 +
 purgatory/arch/arm64/cache-asm.S | 186 ++
 purgatory/arch/arm64/cache.c | 330 +++
 purgatory/arch/arm64/cache.h |  79 ++
 4 files changed, 597 insertions(+)
 create mode 100644 purgatory/arch/arm64/cache-asm.S
 create mode 100644 purgatory/arch/arm64/cache.c
 create mode 100644 purgatory/arch/arm64/cache.h

diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abeab17b2..0f80f8165d90 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -11,6 +11,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
 
 arm64_PURGATORY_SRCS += \
purgatory/arch/arm64/entry.S \
+   purgatory/arch/arm64/cache-asm.S \
+   purgatory/arch/arm64/cache.c \
purgatory/arch/arm64/purgatory-arm64.c
 
 dist += \
diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
new file mode 100644
index ..bef97ef4
--- /dev/null
+++ b/purgatory/arch/arm64/cache-asm.S
@@ -0,0 +1,186 @@
+/*
+ * Some of the routines have been copied from Linux Kernel, therefore
+ * copying the license as well.
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Pratyush Anand 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+#include "cache.h"
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+   .macro  dcache_line_size, reg, tmp
+   mrs     \tmp, ctr_el0           // read CTR
+   ubfm    \tmp, \tmp, #16, #19    // cache line size encoding
+   mov     \reg, #4                // bytes per word
+   lsl     \reg, \reg, \tmp        // actual cache line size
+   .endm
+
+/*
+ * inval_cache_range(start, end)
+ * - x0 - start - start address of region
+ * - x1 - end   - end address of region
+ */
+.globl inval_cache_range
+inval_cache_range:
+   dcache_line_size x2, x3
+   sub     x3, x2, #1
+   tst     x1, x3                  // end cache line aligned?
+   bic     x1, x1, x3
+   b.eq    1f
+   dc      civac, x1               // clean & invalidate D / U line
+1: tst     x0, x3                  // start cache line aligned?
+   bic     x0, x0, x3
+   b.eq    2f
+   dc      civac, x0               // clean & invalidate D / U line
+   b       3f
+2: dc      ivac, x0                // invalidate D / U line
+3: add     x0, x0, x2
+   cmp     x0, x1
+   b.lo    2b
+   dsb     sy
+   ret
+/*
+ * flush_dcache_range(start, end)
+ * - x0 - start- start address of region
+ * - x1 - end