Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-27 Thread David Miller
From: James Clarke 
Date: Thu, 27 Oct 2016 17:02:32 +0100

> I was just testing it on the IIIi when I got this. Anyway, it seems to work
> fine. It hasn’t (yet) had one of the stupidly high allocations, but it did
> flush a block of 3658 pages just fine (assuming the flush actually worked).
> Similarly for the T1.

Thanks for testing.  I'll post the final patch I committed.

> The cut-off seems pretty arbitrary, and the only way to determine it properly
> would be benchmarking (or finding out what the relevant delays are). Given
> x86 uses 33, 32 or 64 seem perfectly fine, but going into the hundreds
> doesn’t sound stupid either... For such small numbers it’s probably hardly
> going to matter.

It's not too hard to write a kernel module which just does dummy TLB flushes in
a loop and counts the cycles using the %tick register.  And I plan to hack on
something like that soon'ish.

Another part of the equation is that the full flush blows away, at a minimum,
all kernel TLB entries.  And that has a certain cost too.
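
As a rough illustration of how such measurements could feed into the cut-off,
here is a minimal cost-model sketch. It is not kernel code: the function name
and all three inputs are hypothetical, and the cycle counts would have to come
from real measurements such as the %tick-based module described above. The
refill term stands for the cost of re-faulting the kernel TLB entries that a
full flush throws away.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical cost model (an assumption, not taken from the patch).
 * Returns the smallest range size, in pages, at which flushing page by
 * page costs at least as much as one full flush plus the later refills.
 */
uint64_t break_even_pages(uint64_t per_page_cycles,
                          uint64_t full_flush_cycles,
                          uint64_t refill_cycles)
{
    assert(per_page_cycles > 0);

    uint64_t full_cost = full_flush_cycles + refill_cycles;

    /* Integer division rounded up. */
    return (full_cost + per_page_cycles - 1) / per_page_cycles;
}
```

With made-up inputs of 100 cycles per page, 2000 cycles for the full flush and
1200 cycles of refill cost, the break-even point comes out at 32 pages. Real
numbers may well differ between sun4u and sun4v.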


Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-27 Thread David Miller
From: James Clarke 
Date: Thu, 27 Oct 2016 09:25:36 +0100

> I’ve run it on the T5 and it seems to work without lockups:
> 
> [5948090.988821] vln_init: *vmap_lazy_nr is 32754
> [5948090.989943] vln_init: lazy_max_pages() is 32768
> [5948091.157381] TSB[insmod:261876]: DEBUG flush_tsb_kernel_range start=10006000 end=f000 PAGE_SIZE=2000
> [5948091.157530] TSB[insmod:261876]: DEBUG flush_tsb_kernel_range start=0001 end=00058c00 PAGE_SIZE=2000
> [5948091.158240] vln_init: vmap_lazy_nr is caeb1c
> [5948091.158252] vln_init: *vmap_lazy_nr is 0
> [5948091.159311] vln_init: lazy_max_pages() is 32768
> ... continues on as normal ...
> 
> (again, that’s my debugging module to see how close the system is to a flush)
> 
> I can't (yet) vouch for the IIIi, but when it comes back up I’ll give it a
> go[1]. I'll also put it on the T1 at some point today, but it *should* also
> work since it's using the same sun4v/hypervisor implementation as the T5.

I'm about to test it on my IIIi and will commit this if it seems to work
properly.

I guess you have no opinion about the cut-off chosen? :-)

Anyways, we can fine tune it later.


Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-27 Thread James Clarke
> On 27 Oct 2016, at 16:51, David Miller  wrote:
> 
> From: James Clarke 
> Date: Thu, 27 Oct 2016 09:25:36 +0100
> 
>> I’ve run it on the T5 and it seems to work without lockups:
>> 
>> [5948090.988821] vln_init: *vmap_lazy_nr is 32754
>> [5948090.989943] vln_init: lazy_max_pages() is 32768
>> [5948091.157381] TSB[insmod:261876]: DEBUG flush_tsb_kernel_range start=10006000 end=f000 PAGE_SIZE=2000
>> [5948091.157530] TSB[insmod:261876]: DEBUG flush_tsb_kernel_range start=0001 end=00058c00 PAGE_SIZE=2000
>> [5948091.158240] vln_init: vmap_lazy_nr is caeb1c
>> [5948091.158252] vln_init: *vmap_lazy_nr is 0
>> [5948091.159311] vln_init: lazy_max_pages() is 32768
>> ... continues on as normal ...
>> 
>> (again, that’s my debugging module to see how close the system is to a flush)
>> 
>> I can't (yet) vouch for the IIIi, but when it comes back up I’ll give it a
>> go[1]. I'll also put it on the T1 at some point today, but it *should* also
>> work since it's using the same sun4v/hypervisor implementation as the T5.
> 
> I'm about to test it on my IIIi and will commit this if it seems to work
> properly.
> 
> I guess you have no opinion about the cut-off chosen? :-)
> 
> Anyways, we can fine tune it later.

I was just testing it on the IIIi when I got this. Anyway, it seems to work
fine. It hasn’t (yet) had one of the stupidly high allocations, but it did
flush a block of 3658 pages just fine (assuming the flush actually worked).
Similarly for the T1.

The cut-off seems pretty arbitrary, and the only way to determine it properly
would be benchmarking (or finding out what the relevant delays are). Given
x86 uses 33, 32 or 64 seem perfectly fine, but going into the hundreds
doesn’t sound stupid either... For such small numbers it’s probably hardly
going to matter.

Tested-by: James Clarke 

James



Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-27 Thread James Clarke
> On 27 Oct 2016, at 02:27, James Clarke  wrote:
> 
>> On 26 Oct 2016, at 22:02, David Miller  wrote:
>> 
>> From: James Clarke 
>> Date: Wed, 26 Oct 2016 21:05:36 +0100
>> 
>>> Thanks for this, it's now compiling. I'll let you know if it works
>>> within the next 24 hours.
>> 
>> Thanks.
>> 
>>> Before I forget, what do you think about the following patch? I know
>>> Debian used to use the 64-bit kernel for a 32-bit sparc userland, and so
>>> "Architecture: sparc" was correct, but obviously sparc64 also exists. It
>>> seems more sane to make sparc64 default to "Architecture: sparc64", with
>>> sparc users needing to override this with KBUILD_DEBARCH if they want
>>> to, rather than providing a setup that's broken out of the box for
>>> sparc64 users.
>>> 
>>> From: James Clarke 
>>> Date: Wed, 26 Oct 2016 20:17:10 +0100
>>> Subject: [PATCH] builddeb: Add support for sparc64
>>> 
>>> Signed-off-by: James Clarke 
>> 
>> I don't know.
>> 
>> I still personally use a 32-bit userland on my sparc64 systems because
>> that is what performs the best and is what I will be using for as long
>> as I possibly can.
>> 
>> I've actually never used this target, is this for building the kernel or
>> userland components?
> 
> Yes, make deb-pkg builds kernel, firmware, headers and linux-libc packages.
> By the way, the first build I made of 4.9 (using Debian’s 4.8 config as old
> config) wouldn’t boot, since:
> 
> * sunvdc module needs _mcount
> * sunvnet module needs _mcount and count_bits
> * crc32c_sparc64 module needs _mcount and VISenter
> [* raid6_pq module needs memcpy, though that’s just for a data partition]
> 
> The workaround is not to use CONFIG_MODVERSIONS, but this wasn’t at all clear
> at first. This is because of d3867f0483, which moved these to being exported
> in their .S.
> 
> Anyway, the new kernel is running now and being stress-tested.

Hi David,
I’ve run it on the T5 and it seems to work without lockups:

[5948090.988821] vln_init: *vmap_lazy_nr is 32754
[5948090.989943] vln_init: lazy_max_pages() is 32768
[5948091.157381] TSB[insmod:261876]: DEBUG flush_tsb_kernel_range start=10006000 end=f000 PAGE_SIZE=2000
[5948091.157530] TSB[insmod:261876]: DEBUG flush_tsb_kernel_range start=0001 end=00058c00 PAGE_SIZE=2000
[5948091.158240] vln_init: vmap_lazy_nr is caeb1c
[5948091.158252] vln_init: *vmap_lazy_nr is 0
[5948091.159311] vln_init: lazy_max_pages() is 32768
... continues on as normal ...

(again, that’s my debugging module to see how close the system is to a flush)

I can't (yet) vouch for the IIIi, but when it comes back up I’ll give it a
go[1]. I'll also put it on the T1 at some point today, but it *should* also
work since it's using the same sun4v/hypervisor implementation as the T5.

Thanks,
James

[1] Not sure how long that will take...
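
For readers wondering where the 32768 in the log above comes from, here is a
small user-space sketch. It assumes that lazy_max_pages() in mm/vmalloc.c was
computed as fls(num_online_cpus()) * (32 MB / PAGE_SIZE) in this kernel
version; the thread itself does not state that. The function names below are
made up for the sketch, and the CPU count of 128 used in the note after the
code is likewise an assumption about the T5.

```c
#include <assert.h>

/* 1-based position of the highest set bit, 0 for x == 0 (like fls()). */
unsigned int fls_sketch(unsigned int x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/*
 * Assumed formula for the lazy vmap purge threshold, in pages:
 * 32 MB worth of pages, scaled by the log2-ish CPU count.
 */
unsigned long lazy_max_pages_sketch(unsigned int online_cpus,
                                    unsigned long page_size)
{
    assert(page_size > 0);
    return fls_sketch(online_cpus) * (32UL * 1024 * 1024 / page_size);
}
```

With 128 online CPUs and the 8 KB page size shown in the log (PAGE_SIZE=2000
hex), this gives 8 * 4096 = 32768 pages, matching the logged value. The
*vmap_lazy_nr reading of 32754 was therefore just short of the threshold when
the module was loaded.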



Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread James Clarke
> On 26 Oct 2016, at 22:02, David Miller  wrote:
> 
> From: James Clarke 
> Date: Wed, 26 Oct 2016 21:05:36 +0100
> 
>> Thanks for this, it's now compiling. I'll let you know if it works
>> within the next 24 hours.
> 
> Thanks.
> 
>> Before I forget, what do you think about the following patch? I know
>> Debian used to use the 64-bit kernel for a 32-bit sparc userland, and so
>> "Architecture: sparc" was correct, but obviously sparc64 also exists. It
>> seems more sane to make sparc64 default to "Architecture: sparc64", with
>> sparc users needing to override this with KBUILD_DEBARCH if they want
>> to, rather than providing a setup that's broken out of the box for
>> sparc64 users.
>> 
>> From: James Clarke 
>> Date: Wed, 26 Oct 2016 20:17:10 +0100
>> Subject: [PATCH] builddeb: Add support for sparc64
>> 
>> Signed-off-by: James Clarke 
> 
> I don't know.
> 
> I still personally use a 32-bit userland on my sparc64 systems because
> that is what performs the best and is what I will be using for as long
> as I possibly can.
> 
> I've actually never used this target, is this for building the kernel or
> userland components?

Yes, make deb-pkg builds kernel, firmware, headers and linux-libc packages.
By the way, the first build I made of 4.9 (using Debian’s 4.8 config as old
config) wouldn’t boot, since:

* sunvdc module needs _mcount
* sunvnet module needs _mcount and count_bits
* crc32c_sparc64 module needs _mcount and VISenter
[* raid6_pq module needs memcpy, though that’s just for a data partition]

The workaround is not to use CONFIG_MODVERSIONS, but this wasn’t at all clear
at first. This is because of d3867f0483, which moved these to being exported in
their .S.

Anyway, the new kernel is running now and being stress-tested.

James



Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread David Miller
From: James Clarke 
Date: Wed, 26 Oct 2016 21:05:36 +0100

> Thanks for this, it's now compiling. I'll let you know if it works
> within the next 24 hours.

Thanks.

> Before I forget, what do you think about the following patch? I know
> Debian used to use the 64-bit kernel for a 32-bit sparc userland, and so
> "Architecture: sparc" was correct, but obviously sparc64 also exists. It
> seems more sane to make sparc64 default to "Architecture: sparc64", with
> sparc users needing to override this with KBUILD_DEBARCH if they want
> to, rather than providing a setup that's broken out of the box for
> sparc64 users.
> 
> From: James Clarke 
> Date: Wed, 26 Oct 2016 20:17:10 +0100
> Subject: [PATCH] builddeb: Add support for sparc64
> 
> Signed-off-by: James Clarke 

I don't know.

I still personally use a 32-bit userland on my sparc64 systems because
that is what performs the best and is what I will be using for as long
as I possibly can.

I've actually never used this target, is this for building the kernel or
userland components?



Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread James Clarke
On Wed, Oct 26, 2016 at 03:04:59PM -0400, David Miller wrote:
> From: James Clarke 
> Date: Wed, 26 Oct 2016 18:21:06 +0100
>
> >> On 26 Oct 2016, at 18:09, David Miller  wrote:
> >>
> >> From: James Clarke 
> >> Date: Wed, 26 Oct 2016 17:58:16 +0100
> >>
> >>>> On 26 Oct 2016, at 16:54, David Miller  wrote:
> >>>>
> >>>> From: James Clarke 
> >>>> Date: Wed, 26 Oct 2016 09:28:05 +0100
> >>>>
> >>>>> Any progress on TLB flushing?
> >>>>
> >>>> I'll keep plugging away at it today.
> >>>
> >>> Great; let me know if you need a guinea pig, as it’s pretty easy for me to
> >>> reproduce.
> >>
> >> Will do, what kind of cpus do you have?
> >
> > * UltraSparc T5 (Niagara5)
> > * UltraSparc T1 (Niagara)
> > * UltraSPARC IIIi
> >
> > The IIIi seems to be down at the moment though.
>
> James, here is what I have so far.  I only gave it a brief testing on
> sun4v, so no guarantees for the sun4u cases.  This is against the
> current sparc GIT tree.
>
> The cut-off is 32 pages, we can discuss whether that's a good value
> to use or not.  FWIW, x86_64 has similar code for this situation and
> uses a cut-off of 33.  Perhaps 64 is a better value, who knows.
>
> It might even make sense to use a different cut-off for the hypervisor
> case since the hypervisor trap we have to use to do the TLB operation
> adds even more expense to each iteration of the range loop.
>
> The policy implemented for huge range flushes below is:
>
> 1) Spitfire - Flush all non-locked entries, by hand using diagnostic
>    TLB accesses.
>
> 2) Cheetah - Flush all non-locked entries using "flush all" operation.
>
> 3) sun4v/hypervisor - Flush entire kernel context, which does not
>    remove locked or "permanent" entries.
>
> Anyways, let me know how it goes.

Thanks for this, it's now compiling. I'll let you know if it works
within the next 24 hours.

Before I forget, what do you think about the following patch? I know
Debian used to use the 64-bit kernel for a 32-bit sparc userland, and so
"Architecture: sparc" was correct, but obviously sparc64 also exists. It
seems more sane to make sparc64 default to "Architecture: sparc64", with
sparc users needing to override this with KBUILD_DEBARCH if they want
to, rather than providing a setup that's broken out of the box for
sparc64 users.

From: James Clarke 
Date: Wed, 26 Oct 2016 20:17:10 +0100
Subject: [PATCH] builddeb: Add support for sparc64

Signed-off-by: James Clarke 
---
 scripts/package/builddeb | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index 8ea9fd2..63b3112 100755
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -41,6 +41,8 @@ set_debarch() {
 		debarch="$UTS_MACHINE" ;;
 	x86_64)
 		debarch=amd64 ;;
+	sparc64)
+		debarch=sparc64 ;;
 	sparc*)
 		debarch=sparc ;;
 	s390*)
--
2.9.3



Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread David Miller
From: James Clarke 
Date: Wed, 26 Oct 2016 18:21:06 +0100

>> On 26 Oct 2016, at 18:09, David Miller  wrote:
>> 
>> From: James Clarke 
>> Date: Wed, 26 Oct 2016 17:58:16 +0100
>> 
>>>> On 26 Oct 2016, at 16:54, David Miller  wrote:
>>>> 
>>>> From: James Clarke 
>>>> Date: Wed, 26 Oct 2016 09:28:05 +0100
>>>> 
>>>>> Any progress on TLB flushing?
>>>> 
>>>> I'll keep plugging away at it today.
>>> 
>>> Great; let me know if you need a guinea pig, as it’s pretty easy for me to
>>> reproduce.
>> 
>> Will do, what kind of cpus do you have?
> 
> * UltraSparc T5 (Niagara5)
> * UltraSparc T1 (Niagara)
> * UltraSPARC IIIi
> 
> The IIIi seems to be down at the moment though.

James, here is what I have so far.  I only gave it a brief testing on
sun4v, so no guarantees for the sun4u cases.  This is against the
current sparc GIT tree.

The cut-off is 32 pages, we can discuss whether that's a good value
to use or not.  FWIW, x86_64 has similar code for this situation and
uses a cut-off of 33.  Perhaps 64 is a better value, who knows.

It might even make sense to use a different cut-off for the hypervisor
case since the hypervisor trap we have to use to do the TLB operation
adds even more expense to each iteration of the range loop.

The policy implemented for huge range flushes below is:

1) Spitfire - Flush all non-locked entries, by hand using diagnostic
   TLB accesses.

2) Cheetah - Flush all non-locked entries using "flush all" operation.

3) sun4v/hypervisor - Flush entire kernel context, which does not
   remove locked or "permanent" entries.

Anyways, let me know how it goes.
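
To make the cut-off concrete, the range check in the patch can be read as the
following C sketch. It assumes sparc64's 8 KB base page size (a shift of 13)
and mirrors the "srlx %o3, 18, %o4" test in the assembler; the function and
macro names are invented for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH   13  /* assumed: 8 KB pages on sparc64 */
#define CUTOFF_SHIFT_SKETCH 18  /* shift amount used by the srlx in the patch */

/* Number of pages at which the patch stops flushing page by page. */
uint64_t cutoff_pages(void)
{
    return 1ULL << (CUTOFF_SHIFT_SKETCH - PAGE_SHIFT_SKETCH);
}

/*
 * True when the range is large enough that the patch takes the
 * "flush everything non-locked" path instead of the per-page loop.
 */
bool range_needs_full_flush(uint64_t start, uint64_t end)
{
    assert(start <= end);
    return ((end - start) >> CUTOFF_SHIFT_SKETCH) != 0;
}
```

Since 18 - 13 = 5, any range of 2^5 = 32 pages or more takes the full-flush
path, while a 31-page range is still flushed one page at a time. Moving the
cut-off to 64 pages would mean shifting by 19 instead.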

diff --git a/arch/sparc/mm/ultra.S b/arch/sparc/mm/ultra.S
index 0fa2e62..5d2fd6c 100644
--- a/arch/sparc/mm/ultra.S
+++ b/arch/sparc/mm/ultra.S
@@ -113,12 +113,14 @@ __flush_tlb_pending:  /* 27 insns */
 
 	.align		32
 	.globl		__flush_tlb_kernel_range
-__flush_tlb_kernel_range:	/* 19 insns */
+__flush_tlb_kernel_range:	/* 31 insns */
 	/* %o0=start, %o1=end */
 	cmp		%o0, %o1
 	be,pn		%xcc, 2f
+	 sub		%o1, %o0, %o3
+	srlx		%o3, 18, %o4
+	brnz,pn		%o4, __spitfire_flush_tlb_kernel_range_slow
 	 sethi		%hi(PAGE_SIZE), %o4
-	sub		%o1, %o0, %o3
 	sub		%o3, %o4, %o3
 	or		%o0, 0x20, %o0		! Nucleus
 1:	stxa		%g0, [%o0 + %o3] ASI_DMMU_DEMAP
@@ -134,6 +136,38 @@ __flush_tlb_kernel_range:  /* 19 insns */
 	nop
 	nop
 	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+
+__spitfire_flush_tlb_kernel_range_slow:
+	mov		63 * 8, %o4
+1:	ldxa		[%o4] ASI_ITLB_DATA_ACCESS, %o3
+	andcc		%o3, 0x40, %g0			/* _PAGE_L_4U */
+	bne,pn		%xcc, 2f
+	 mov		TLB_TAG_ACCESS, %o3
+	stxa		%g0, [%o3] ASI_IMMU
+	stxa		%g0, [%o4] ASI_ITLB_DATA_ACCESS
+	membar		#Sync
+2:	ldxa		[%o4] ASI_DTLB_DATA_ACCESS, %o3
+	andcc		%o3, 0x40, %g0
+	bne,pn		%xcc, 2f
+	 mov		TLB_TAG_ACCESS, %o3
+	stxa		%g0, [%o3] ASI_DMMU
+	stxa		%g0, [%o4] ASI_DTLB_DATA_ACCESS
+	membar		#Sync
+2:	sub		%o4, 8, %o4
+	brgez,pt	%o4, 1b
+	 nop
+	retl
+	 nop
 
 __spitfire_flush_tlb_mm_slow:
 	rdpr		%pstate, %g1
@@ -288,6 +322,40 @@ __cheetah_flush_tlb_pending:   /* 27 insns */
 	retl
 	 wrpr		%g7, 0x0, %pstate
 
+__cheetah_flush_tlb_kernel_range:	/* 31 insns */
+	/* %o0=start, %o1=end */
+	cmp		%o0, %o1
+	be,pn		%xcc, 2f
+	 sub		%o1, %o0, %o3
+	srlx		%o3, 18, %o4
+	brnz,pn		%o4, 3f
+	 sethi		%hi(PAGE_SIZE), %o4
+	sub		%o3, %o4, %o3
+	or		%o0, 0x20, %o0		! Nucleus
+1:	stxa		%g0, [%o0 + %o3] ASI_DMMU_DEMAP
+	stxa		%g0, [%o0 + %o3] ASI_IMMU_DEMAP
+	membar		#Sync
+	brnz,pt		%o3, 1b
+	 sub		%o3, %o4, %o3
+2:	sethi		%hi(KERNBASE), %o3
+	flush		%o3
+	retl
+	 nop
+3:	mov		0x80, %o4
+	stxa		%g0, [%o4] ASI_DMMU_DEMAP
+	membar		#Sync
+	stxa		%g0, [%o4] ASI_IMMU_DEMAP
+	membar		#Sync
+	retl
+	 nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+
 #ifdef DCACHE_ALIASING_POSSIBLE
 __cheetah_flush_dcache_page: /* 11 insns */
 	sethi		%hi(PAGE_OFFSET), %g1
@@ -388,13 +456,15 @@ __hypervisor_flush_tlb_pending: /* 27 
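
For anyone who does not read SPARC assembler fluently, the
__spitfire_flush_tlb_kernel_range_slow loop above can be modelled in plain C
roughly as follows. This is a user-space model only: the TLB is represented by
an ordinary array standing in for the diagnostic data-access registers, and
the names are invented for the sketch. The only values taken from the patch
are the 64 entries (offsets 63 * 8 down to 0) and the 0x40 lock bit
(_PAGE_L_4U).

```c
#include <assert.h>
#include <stdint.h>

#define TLB_ENTRIES_SKETCH 64       /* entries 63 down to 0 in the patch */
#define PAGE_LOCKED_SKETCH 0x40ULL  /* _PAGE_L_4U in the patch */

/*
 * Walk the TLB from the last entry to the first and clear every entry
 * whose lock bit is not set. Returns how many entries were cleared.
 */
unsigned int flush_unlocked(uint64_t tlb_data[], unsigned int entries)
{
    unsigned int cleared = 0;

    assert(entries > 0);

    for (int i = (int)entries - 1; i >= 0; i--) {
        if (tlb_data[i] & PAGE_LOCKED_SKETCH)
            continue;          /* locked entry: leave it alone */
        tlb_data[i] = 0;       /* corresponds to the stxa %g0 stores */
        cleared++;
    }
    return cleared;
}
```

The assembler does this for the ITLB and the DTLB within the same loop
iteration, which in this model would correspond to calling flush_unlocked()
once per TLB array.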

Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread David Miller
From: James Clarke 
Date: Wed, 26 Oct 2016 17:58:16 +0100

>> On 26 Oct 2016, at 16:54, David Miller  wrote:
>> 
>> From: James Clarke 
>> Date: Wed, 26 Oct 2016 09:28:05 +0100
>> 
>>> Any progress on TLB flushing?
>> 
>> I was half-way through an implementation when I noticed that
>> hypervisor TLB flush handler relative branch bug I posted the
>> fix for last night.
> 
> Yep, I saw that. Looks like you forgot to update the comment on
> __hypervisor_flush_tlb_pending; it still says 16 insns rather than 27.

Fixed, thanks.

And now I noticed that the cross-call hypervisor tlb flush assembler
has the bug and needs to be fixed too...

>> I'll keep plugging away at it today.
> 
> Great; let me know if you need a guinea pig, as it’s pretty easy for me to
> reproduce.

Will do, what kind of cpus do you have?


Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread James Clarke
> On 26 Oct 2016, at 18:09, David Miller  wrote:
> 
> From: James Clarke 
> Date: Wed, 26 Oct 2016 17:58:16 +0100
> 
>>> On 26 Oct 2016, at 16:54, David Miller  wrote:
>>> 
>>> From: James Clarke 
>>> Date: Wed, 26 Oct 2016 09:28:05 +0100
>>> 
>>>> Any progress on TLB flushing?
>>> 
>>> I'll keep plugging away at it today.
>> 
>> Great; let me know if you need a guinea pig, as it’s pretty easy for me to
>> reproduce.
> 
> Will do, what kind of cpus do you have?

* UltraSparc T5 (Niagara5)
* UltraSparc T1 (Niagara)
* UltraSPARC IIIi

The IIIi seems to be down at the moment though.

James



Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread James Clarke
> On 26 Oct 2016, at 16:54, David Miller  wrote:
> 
> From: James Clarke 
> Date: Wed, 26 Oct 2016 09:28:05 +0100
> 
>> Any progress on TLB flushing?
> 
> I was half-way through an implementation when I noticed that
> hypervisor TLB flush handler relative branch bug I posted the
> fix for last night.

Yep, I saw that. Looks like you forgot to update the comment on
__hypervisor_flush_tlb_pending; it still says 16 insns rather than 27.

> I'll keep plugging away at it today.

Great; let me know if you need a guinea pig, as it’s pretty easy for me to
reproduce.

Thanks,
James


Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread David Miller
From: James Clarke 
Date: Wed, 26 Oct 2016 09:28:05 +0100

> Any progress on TLB flushing?

I was half-way through an implementation when I noticed that
hypervisor TLB flush handler relative branch bug I posted the
fix for last night.

I'll keep plugging away at it today.



Re: [PATCH] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-10-26 Thread James Clarke
> On 26 Oct 2016, at 03:44, David Miller  wrote:
> 
> 
> If the number of pages we are flushing is more than twice the number
> of entries in the TSB, just scan the TSB table for matches rather
> than probing each and every page in the range.
> 
> Based upon a patch and report by James Clarke.
> 
> Signed-off-by: David S. Miller 
> ---
> 
> James this is the final version I pushed into the tree.

Great, thanks. Any progress on TLB flushing?

James
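
For illustration, the heuristic from the commit message quoted above (scan the
whole TSB once the range covers more than twice as many pages as the TSB has
entries) can be sketched in user-space C as below. The TSB here is a
simplified direct-mapped array of page-aligned tags, which is an assumption
made for the sketch; the real kernel TSB layout and hashing are different, and
all names are hypothetical. The sketch also assumes start is page-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 13          /* assumed 8 KB pages */
#define TSB_INVALID       UINT64_MAX  /* marker for an empty slot */

struct tsb_sketch {
    uint64_t *tag;    /* page-aligned virtual address per slot */
    size_t nentries;  /* must be a power of two */
};

/* Probe the one slot each page in the range could occupy. */
static void flush_by_probe(struct tsb_sketch *t, uint64_t start, uint64_t end)
{
    for (uint64_t v = start; v < end; v += 1ULL << PAGE_SHIFT_SKETCH) {
        size_t idx = (size_t)(v >> PAGE_SHIFT_SKETCH) & (t->nentries - 1);

        if (t->tag[idx] == v)
            t->tag[idx] = TSB_INVALID;
    }
}

/* Walk every slot once and drop those whose tag falls in the range. */
static void flush_by_scan(struct tsb_sketch *t, uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < t->nentries; i++) {
        uint64_t tag = t->tag[i];

        if (tag != TSB_INVALID && tag >= start && tag < end)
            t->tag[i] = TSB_INVALID;
    }
}

void flush_range(struct tsb_sketch *t, uint64_t start, uint64_t end)
{
    assert(start <= end);

    uint64_t pages = (end - start) >> PAGE_SHIFT_SKETCH;

    if (pages > 2 * (uint64_t)t->nentries)
        flush_by_scan(t, start, end);   /* cost bounded by the table size */
    else
        flush_by_probe(t, start, end);  /* cost proportional to the range */
}
```

The point of the switch is that the scan path does a fixed amount of work no
matter how large the range is, which is what keeps an extremely large flush
from stalling the machine.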