Re: radeon, amdgpu improvements for aarch64

2023-12-27 Thread Taylor R Campbell
> Date: Thu, 28 Dec 2023 01:02:08 +0100
> From: Tobias Nygren 
> 
> I've spent some time testing GPUs on aarch64 and fixing bugs.
> Attached are some patches to make things more usable.

Cool, thanks!  Some notes below.  Everything else seems fine.

> [2] /libdata/firmware/amdgpu is not shipped.
> (Can we include these in the sets?)

amdgpu firmware should be included in the gpufw set, I think.

> --- sys/arch/evbarm/fdt/fdt_machdep.c 4 Aug 2023 09:06:33 -   1.106
> +++ sys/arch/evbarm/fdt/fdt_machdep.c 27 Dec 2023 22:07:38 -
> @@ -194,6 +194,11 @@ fdt_add_boot_physmem(const struct fdt_me
>   bp->bp_start = atop(saddr);
>   bp->bp_pages = atop(eaddr) - bp->bp_start;
>   bp->bp_freelist = VM_FREELIST_DEFAULT;
> +#ifdef _LP64
> + if (eaddr < (1UL<<40)) {

Tiny nit: I'd spell this as:

if (eaddr < ((paddr_t)1 << 40)) {

or as:

if (eaddr <= BITS(0,39)) {

Just to confirm: eaddr is inclusive here, right?  As in, if the range
were [0x1, 0x2), we would have saddr=0x1 eaddr=0x1 (or
maybe eaddr=0x1f000), right?
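
To make the question concrete, here is a minimal standalone sketch of
how the comparison changes with the end-address convention.  The
paddr_t typedef, the LIMIT_40BIT macro and both helper names are
assumptions made for illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's paddr_t on an LP64 platform. */
typedef uint64_t paddr_t;

/* First physical address that no longer fits in 40 bits (1 TB). */
#define LIMIT_40BIT ((paddr_t)1 << 40)

/* eaddr is the last valid byte of the range (inclusive end). */
static bool
below_1t_inclusive(paddr_t eaddr)
{
	return eaddr < LIMIT_40BIT;
}

/* eaddr is one past the last valid byte (exclusive end). */
static bool
below_1t_exclusive(paddr_t eaddr)
{
	return eaddr <= LIMIT_40BIT;
}
```

If eaddr turns out to be exclusive, a segment ending exactly at the
1 TB boundary would fail the `<` test in the patch and stay on the
default freelist, even though every page in it is addressable.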

> --- sys/external/bsd/drm2/include/drm/bus_dma_hacks.h 19 Jul 2022 23:19:44 -  1.25
> +++ sys/external/bsd/drm2/include/drm/bus_dma_hacks.h 27 Dec 2023 22:07:38 -
> @@ -99,6 +99,8 @@ bus_dmamem_pgfl(bus_dma_tag_t tag)
>  {
>  #if defined(__i386__) || defined(__x86_64__)
>   return x86_select_freelist(tag->_bounce_alloc_hi - 1);
> +#elif defined(__aarch64__)
> + return VM_FREELIST_FIRST1T;

This should look through the tag->_ranges to choose
VM_FREELIST_FIRST1T if it has been restricted with
bus_dmatag_subregion to lie in that range, and VM_FREELIST_DEFAULT if
not.
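
A rough sketch of what such a lookup might look like, written as
standalone C rather than against the real bus_dma tag.  The struct
layout, the enum values and the function name are hypothetical; only
the dr_sysbase/dr_len field names are borrowed from the code in this
header.  It also assumes that an empty range list means the tag is
unrestricted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t paddr_t;	/* assumption: LP64 physical address type */

/* Hypothetical freelist indices mirroring the patch. */
enum { FREELIST_DEFAULT = 0, FREELIST_FIRST1T = 1 };

/* Hypothetical DMA range; field names follow the dr_* fields in the header. */
struct dma_range {
	paddr_t dr_sysbase;	/* start of range in system physical space */
	paddr_t dr_len;		/* length of range in bytes */
};

/*
 * Return FREELIST_FIRST1T only if the tag has at least one range and
 * every range ends at or below the 1 TB boundary.
 */
static int
pick_freelist(const struct dma_range *ranges, size_t nranges)
{
	const paddr_t limit = (paddr_t)1 << 40;

	if (ranges == NULL || nranges == 0)
		return FREELIST_DEFAULT;	/* treated as unrestricted */
	for (size_t i = 0; i < nranges; i++) {
		const struct dma_range *dr = &ranges[i];

		/* Written to avoid overflow in dr_sysbase + dr_len. */
		if (dr->dr_sysbase > limit ||
		    dr->dr_len > limit - dr->dr_sysbase)
			return FREELIST_DEFAULT;
	}
	return FREELIST_FIRST1T;
}
```

Whether "every range below 1 TB" or "any range below 1 TB" is the
right condition depends on how the subregion tags are built, so treat
the loop condition as a placeholder.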

> --- sys/uvm/uvm_pglist.c  21 Dec 2021 08:27:49 -  1.90
> +++ sys/uvm/uvm_pglist.c  27 Dec 2023 22:06:10 -
> @@ -112,8 +112,9 @@ static int
>  uvm_pglistalloc_c_ps(uvm_physseg_t psi, int num, paddr_t low, paddr_t high,
>  paddr_t alignment, paddr_t boundary, struct pglist *rlist)
>  {
> - signed int candidate, limit, candidateidx, end, idx, skip;
> - int pagemask;
> + long candidate, limit, candidateidx, end, idx;
> + int skip;
> + long pagemask;

I don't really have an issue with this but I think we may need to
switch int to pfn_t for page frame numbers much more systematically.
Curious how changing only start_hint from int to long helps?  (Commit
message doesn't explain.)
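
For reference, the kind of truncation a wider type guards against can
be shown with a small standalone sketch.  It models uimax() and
ulmax() with fixed-width helpers; PAGE_SHIFT of 12, the paddr_t
typedef and all helper names are assumptions for illustration, and
the sample address is arbitrary rather than taken from real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12		/* assumption: 4 KB pages */

typedef uint64_t paddr_t;	/* assumption: LP64 physical address type */

/* Convert a physical address to a page frame number, like atop(). */
static uint64_t
addr_to_pfn(paddr_t pa)
{
	return pa >> PAGE_SHIFT;
}

/* Model of a max helper that takes 32-bit unsigned arguments. */
static uint32_t
max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Model of a max helper that takes 64-bit unsigned arguments. */
static uint64_t
max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}
```

With 4 KB pages, a physical address of 2^44 has page frame number
2^32, which becomes 0 when narrowed to 32 bits, so the 32-bit helper
silently picks the wrong candidate while the 64-bit one does not.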


radeon, amdgpu improvements for aarch64

2023-12-27 Thread Tobias Nygren
Hi,

I've spent some time testing GPUs on aarch64 and fixing bugs.
Attached are some patches to make things more usable.

I would like to commit these, or get feedback on whether the changes
should be implemented differently. Since I touched both MD code and
UVM, review would be appreciated. The first line of each patch
contains a summary of what it does.

amdgpu (bonaire C1K) on AADK: works[1][2]
  glamor, picom, alacritty, doomlegacy OK
radeon (bonaire) on AADK: works[1], ditto
radeon (cedar) on LX2K: works, minor rendering glitches

[1] AADK firmware doesn't POST the GPU, and VGA BIOS extraction fails.
You must either provide a VGA BIOS image dump manually, build a custom
EDK2 with VGA POSTing capability, or figure out how to map the ROM.
[2] /libdata/firmware/amdgpu is not shipped.
(Can we include these in the sets?)

Kind regards,
-Tobias

This patch set adds a new UVM freelist on aarch64 to manage pages
allocated to GPU buffers, whose physical addresses must fit in 40 bits
(that is, lie within the first 1 TB).

Index: sys/arch/aarch64/include/vmparam.h
===
RCS file: /cvsroot/src/sys/arch/aarch64/include/vmparam.h,v
retrieving revision 1.20
diff -p -u -r1.20 vmparam.h
--- sys/arch/aarch64/include/vmparam.h  16 Apr 2023 14:01:51 -  1.20
+++ sys/arch/aarch64/include/vmparam.h  27 Dec 2023 22:07:38 -
@@ -182,8 +182,9 @@
 #define VM_PHYSSEG_MAX 64  /* XXX */
 #define VM_PHYSSEG_STRAT   VM_PSTRAT_BSEARCH
 
-#define VM_NFREELIST   1
+#define VM_NFREELIST   2
 #define VM_FREELIST_DEFAULT    0
+#define VM_FREELIST_FIRST1T    1
 
 #elif defined(__arm__)
 
Index: sys/arch/evbarm/fdt/fdt_machdep.c
===
RCS file: /cvsroot/src/sys/arch/evbarm/fdt/fdt_machdep.c,v
retrieving revision 1.106
diff -p -u -r1.106 fdt_machdep.c
--- sys/arch/evbarm/fdt/fdt_machdep.c   4 Aug 2023 09:06:33 -   1.106
+++ sys/arch/evbarm/fdt/fdt_machdep.c   27 Dec 2023 22:07:38 -
@@ -194,6 +194,11 @@ fdt_add_boot_physmem(const struct fdt_me
    bp->bp_start = atop(saddr);
    bp->bp_pages = atop(eaddr) - bp->bp_start;
    bp->bp_freelist = VM_FREELIST_DEFAULT;
+#ifdef _LP64
+   if (eaddr < (1UL<<40)) {
+       bp->bp_freelist = VM_FREELIST_FIRST1T;
+   }
+#endif
 
 #ifdef PMAP_NEED_ALLOC_POOLPAGE
const uint64_t memory_size = *(uint64_t *)arg;
Index: sys/external/bsd/drm2/include/drm/bus_dma_hacks.h
===
RCS file: /cvsroot/src/sys/external/bsd/drm2/include/drm/bus_dma_hacks.h,v
retrieving revision 1.25
diff -p -u -r1.25 bus_dma_hacks.h
--- sys/external/bsd/drm2/include/drm/bus_dma_hacks.h   19 Jul 2022 23:19:44 -  1.25
+++ sys/external/bsd/drm2/include/drm/bus_dma_hacks.h   27 Dec 2023 22:07:38 -
@@ -78,7 +78,7 @@ BUS_MEM_TO_PHYS(bus_dma_tag_t dmat, bus_
if (dr->dr_busbase <= ba && ba - dr->dr_busbase <= dr->dr_len)
return ba - dr->dr_busbase + dr->dr_sysbase;
}
-   panic("bus addr has no bus address in dma tag %p: %"PRIxPADDR, dmat,
+   panic("bus addr has no paddr in dma tag %p: %"PRIxPADDR, dmat,
ba);
 }
 #elif defined(__sparc__) || defined(__sparc64__)
@@ -99,6 +99,8 @@ bus_dmamem_pgfl(bus_dma_tag_t tag)
 {
 #if defined(__i386__) || defined(__x86_64__)
return x86_select_freelist(tag->_bounce_alloc_hi - 1);
+#elif defined(__aarch64__)
+   return VM_FREELIST_FIRST1T;
 #else
return VM_FREELIST_DEFAULT;
 #endif

This patch set changes the type of uvm_physseg.start_hint from u_int to u_long.

Index: sys/uvm/uvm_pglist.c
===
RCS file: /cvsroot/src/sys/uvm/uvm_pglist.c,v
retrieving revision 1.90
diff -p -u -r1.90 uvm_pglist.c
--- sys/uvm/uvm_pglist.c  21 Dec 2021 08:27:49 -  1.90
+++ sys/uvm/uvm_pglist.c  27 Dec 2023 22:06:10 -
@@ -112,8 +112,9 @@ static int
 uvm_pglistalloc_c_ps(uvm_physseg_t psi, int num, paddr_t low, paddr_t high,
 paddr_t alignment, paddr_t boundary, struct pglist *rlist)
 {
-   signed int candidate, limit, candidateidx, end, idx, skip;
-   int pagemask;
+   long candidate, limit, candidateidx, end, idx;
+   int skip;
+   long pagemask;
bool second_pass;
 #ifdef DEBUG
paddr_t idxpa, lastidxpa;
@@ -138,9 +139,9 @@ uvm_pglistalloc_c_ps(uvm_physseg_t psi, 
 * succeeded.
 */
alignment = atop(alignment);
-   candidate = roundup2(uimax(low, uvm_physseg_get_avail_start(psi) +
+   candidate = roundup2(ulmax(low, uvm_physseg_get_avail_start(psi) +
uvm_physseg_get_start_hint(psi)), alignment);
-   limit = uimin(high, uvm_physseg_get_avail_end(psi));
+   limit = ulmin(high, uvm_physseg_get_avail_end(psi));
pagemask = ~((boundary >> PAGE_SHIFT) - 1);
skip = 0;
second_pass = false;
@@ -162,8 +163,8