Re: inteldrm(4) diff needs review and testing
Hello,

I happen to have a Haswell 4600, so I tried to apply this patch. I am
running a snapshot from 16 Jul 2017 and just updated my src. My
drm_linux.h looks quite different from the one your patch was meant for.
cvs log shows it to be revision 1.56, so I'm not sure where the
discrepancy lies. My drm_linux.h does not contain the lines provided in
the context of the diff.

Is there a tag or branch I should be pulling from? I've followed the
instructions for following -current.

This is the first time I've tried to apply a patch from this mailing
list, so I suspect I am doing something wrong. Any pointers?

On Sun, Jul 16, 2017 at 03:19:41PM +0200, Mark Kettenis wrote:
> Can somebody test the following diff on Ivy Bridge or Haswell (Intel
> HD Graphics 2500/4000/4600/4700/5000/5100/5200)?
>
> When I added support for the command parser, I took a bit of a
> shortcut and implemented the hash tables as a single linked list.
> This diff fixes that.
>
> For the hash function I used a "mod (size-1)" approach that leaves
> one of the hash table entries unused. Perhaps somebody with a CS
> background has a better idea that isn't too complicated to implement?
>
> Paul, Stuart, there is a small chance that this will improve the
> vncviewer performance.
>
>
> Index: dev/pci/drm/drm_linux.h
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.h,v
> retrieving revision 1.56
> diff -u -p -r1.56 drm_linux.h
> --- dev/pci/drm/drm_linux.h 14 Jul 2017 11:18:04 - 1.56
> +++ dev/pci/drm/drm_linux.h 16 Jul 2017 12:54:51 -
> @@ -40,6 +40,7 @@
>
>  #include
>  #include
> +#include
>
>  /* The Linux code doesn't meet our usual standards! */
>  #ifdef __clang__
> @@ -202,16 +203,42 @@ bitmap_weight(void *p, u_int n)
>  	return sum;
>  }
>
> -#define DECLARE_HASHTABLE(x, y) struct hlist_head x;
> +#define DECLARE_HASHTABLE(name, bits) struct hlist_head name[1 << (bits)]
>
> -#define hash_init(x) INIT_HLIST_HEAD(&(x))
> -#define hash_add(x, y, z)	hlist_add_head(y, &(x))
> -#define hash_del(x) hlist_del_init(x)
> -#define hash_empty(x)	hlist_empty(&(x))
> -#define hash_for_each_possible(a, b, c, d) \
> -	hlist_for_each_entry(b, &(a), c)
> -#define hash_for_each_safe(a, b, c, d, e) (void)(b); \
> -	hlist_for_each_entry_safe(d, c, &(a), e)
> +static inline void
> +__hash_init(struct hlist_head *table, u_int size)
> +{
> +	u_int i;
> +
> +	for (i = 0; i < size; i++)
> +		INIT_HLIST_HEAD(&table[i]);
> +}
> +
> +static inline bool
> +__hash_empty(struct hlist_head *table, u_int size)
> +{
> +	u_int i;
> +
> +	for (i = 0; i < size; i++) {
> +		if (!hlist_empty(&table[i]))
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +#define __hash(table, key) &table[key % (nitems(table) - 1)]
> +
> +#define hash_init(table) __hash_init(table, nitems(table))
> +#define hash_add(table, node, key) \
> +	hlist_add_head(node, __hash(table, key))
> +#define hash_del(node) hlist_del_init(node)
> +#define hash_empty(table)	__hash_empty(table, nitems(table))
> +#define hash_for_each_possible(table, obj, member, key) \
> +	hlist_for_each_entry(obj, __hash(table, key), member)
> +#define hash_for_each_safe(table, i, tmp, obj, member) \
> +	for (i = 0; i < nitems(table); i++) \
> +	    hlist_for_each_entry_safe(obj, tmp, &table[i], member)
>
>  #define ACCESS_ONCE(x) (x)
>
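For readers following the hash function remark in the quoted message, here is a minimal sketch of what taking the key modulo (size - 1) does to bucket usage. Only the expression key % (nitems - 1) is taken from the __hash() macro in the diff; the names TABLE_BITS, TABLE_SIZE, bucket_mod and count_buckets are made up for this illustration, and the table size of 8 is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed example size: 8 buckets, as DECLARE_HASHTABLE(name, 3) would give. */
#define TABLE_BITS 3
#define TABLE_SIZE (1u << TABLE_BITS)

/* Bucket index computed the same way as the diff's __hash(): key % (nitems - 1). */
static unsigned int
bucket_mod(unsigned int key)
{
	return key % (TABLE_SIZE - 1);
}

/* Count how many of the keys 0 .. nkeys-1 land in each bucket. */
static void
count_buckets(unsigned int nkeys, unsigned int counts[TABLE_SIZE])
{
	unsigned int i;

	assert(counts != NULL);

	for (i = 0; i < TABLE_SIZE; i++)
		counts[i] = 0;
	for (i = 0; i < nkeys; i++)
		counts[bucket_mod(i)]++;
}
```

With these assumptions the index is always in the range 0 to TABLE_SIZE - 2, so the last bucket stays empty no matter which keys are inserted. That matches the "one entry unused" trade-off described in the message; the remaining buckets are filled evenly for sequential keys.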
Re: inteldrm(4) diff needs review and testing
> Date: Sun, 16 Jul 2017 10:05:54 -0700
> From: Andrew Marks
>
> Hello,
>
> I happen to have a Haswell 4600 so I tried to apply this patch, I am
> running a snapshot from 16 Jul 2017 and just updated my src. My
> drm_linux.h looks much different than the one your patch was meant for.
> cvs log shows it to be revision 1.56 so I'm not sure where the
> discrepancy lies. My drm_linux.h does not contain the lines provided in
> the context of the diff.
>
> Is there a tag or branch I should be pulling from? I've followed the
> instructions following current.

No, the diff is definitely against rev 1.56 of drm_linux.h.

> This is the first time I've tried to apply a patch from this mailing
> list so I suspect I am doing something wrong, any pointers?

# cd /usr/src/sys
# cat ~/patch | patch -p0

should do the trick. Could be your mail client is mangling the diff
though.

> On Sun, Jul 16, 2017 at 03:19:41PM +0200, Mark Kettenis wrote:
> > Can somebody test the following diff on Ivy Bridge or Haswell (Intel
> > HD Graphics 2500/4000/4600/4700/5000/5100/5200)?
> >
> > When I added support for the command parser, I took a bit of a
> > shortcut and implemented the hash tables as a single linked list.
> > This diff fixes that.
> >
> > For the hash function I used a "mod (size-1)" approach that leaves
> > one of the hash table entries unused. Perhaps somebody with a CS
> > background has a better idea that isn't too complicated to implement?
> >
> > Paul, Stuart, there is a small chance that this will improve the
> > vncviewer performance.
> >
> >
> > Index: dev/pci/drm/drm_linux.h
> > ===
> > RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.h,v
> > retrieving revision 1.56
> > diff -u -p -r1.56 drm_linux.h
> > --- dev/pci/drm/drm_linux.h 14 Jul 2017 11:18:04 - 1.56
> > +++ dev/pci/drm/drm_linux.h 16 Jul 2017 12:54:51 -
> > @@ -40,6 +40,7 @@
> >
> >  #include
> >  #include
> > +#include
> >
> >  /* The Linux code doesn't meet our usual standards! */
> >  #ifdef __clang__
> > @@ -202,16 +203,42 @@ bitmap_weight(void *p, u_int n)
> >  	return sum;
> >  }
> >
> > -#define DECLARE_HASHTABLE(x, y) struct hlist_head x;
> > +#define DECLARE_HASHTABLE(name, bits) struct hlist_head name[1 << (bits)]
> >
> > -#define hash_init(x) INIT_HLIST_HEAD(&(x))
> > -#define hash_add(x, y, z) hlist_add_head(y, &(x))
> > -#define hash_del(x)	hlist_del_init(x)
> > -#define hash_empty(x) hlist_empty(&(x))
> > -#define hash_for_each_possible(a, b, c, d) \
> > -	hlist_for_each_entry(b, &(a), c)
> > -#define hash_for_each_safe(a, b, c, d, e) (void)(b); \
> > -	hlist_for_each_entry_safe(d, c, &(a), e)
> > +static inline void
> > +__hash_init(struct hlist_head *table, u_int size)
> > +{
> > +	u_int i;
> > +
> > +	for (i = 0; i < size; i++)
> > +		INIT_HLIST_HEAD(&table[i]);
> > +}
> > +
> > +static inline bool
> > +__hash_empty(struct hlist_head *table, u_int size)
> > +{
> > +	u_int i;
> > +
> > +	for (i = 0; i < size; i++) {
> > +		if (!hlist_empty(&table[i]))
> > +			return false;
> > +	}
> > +
> > +	return true;
> > +}
> > +
> > +#define __hash(table, key) &table[key % (nitems(table) - 1)]
> > +
> > +#define hash_init(table) __hash_init(table, nitems(table))
> > +#define hash_add(table, node, key) \
> > +	hlist_add_head(node, __hash(table, key))
> > +#define hash_del(node) hlist_del_init(node)
> > +#define hash_empty(table) __hash_empty(table, nitems(table))
> > +#define hash_for_each_possible(table, obj, member, key) \
> > +	hlist_for_each_entry(obj, __hash(table, key), member)
> > +#define hash_for_each_safe(table, i, tmp, obj, member) \
> > +	for (i = 0; i < nitems(table); i++) \
> > +	    hlist_for_each_entry_safe(obj, tmp, &table[i], member)
> >
> >  #define ACCESS_ONCE(x) (x)
> >
Re: inteldrm(4) diff needs review and testing
On Sun, Jul 16, 2017 at 08:45:40PM +0200, Mark Kettenis wrote:
> > Date: Sun, 16 Jul 2017 10:05:54 -0700
> > From: Andrew Marks
> >
> > Hello,
> >
> > I happen to have a Haswell 4600 so I tried to apply this patch, I am
> > running a snapshot from 16 Jul 2017 and just updated my src. My
> > drm_linux.h looks much different than the one your patch was meant for.
> > cvs log shows it to be revision 1.56 so I'm not sure where the
> > discrepancy lies. My drm_linux.h does not contain the lines provided in
> > the context of the diff.
> >
> > Is there a tag or branch I should be pulling from? I've followed the
> > instructions following current.
>
> No the diff is definitely against rev 1.56 of drm_linux.h.
>

It was my CVS error; I wasn't on -current.

> > This is the first time I've tried to apply a patch from this mailing
> > list so I suspect I am doing something wrong, any pointers?
>
> # cd /usr/src/sys
> # cat ~/patch | patch -p0
>
> should do the trick. Could be your mail client is mangling the diff
> though.
>

I don't notice anything different on this Dell M3800 laptop. Let me know
if there is something specific you'd like me to test.

> > On Sun, Jul 16, 2017 at 03:19:41PM +0200, Mark Kettenis wrote:
> > > Can somebody test the following diff on Ivy Bridge or Haswell (Intel
> > > HD Graphics 2500/4000/4600/4700/5000/5100/5200)?
> > >
> > > When I added support for the command parser, I took a bit of a
> > > shortcut and implemented the hash tables as a single linked list.
> > > This diff fixes that.
> > >
> > > For the hash function I used a "mod (size-1)" approach that leaves
> > > one of the hash table entries unused. Perhaps somebody with a CS
> > > background has a better idea that isn't too complicated to implement?
> > >
> > > Paul, Stuart, there is a small chance that this will improve the
> > > vncviewer performance.
> > > > > > > > > Index: dev/pci/drm/drm_linux.h > > > === > > > RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.h,v > > > retrieving revision 1.56 > > > diff -u -p -r1.56 drm_linux.h > > > --- dev/pci/drm/drm_linux.h 14 Jul 2017 11:18:04 - 1.56 > > > +++ dev/pci/drm/drm_linux.h 16 Jul 2017 12:54:51 - > > > @@ -40,6 +40,7 @@ > > > > > > #include > > > #include > > > +#include > > > > > > /* The Linux code doesn't meet our usual standards! */ > > > #ifdef __clang__ > > > @@ -202,16 +203,42 @@ bitmap_weight(void *p, u_int n) > > > return sum; > > > } > > > > > > -#define DECLARE_HASHTABLE(x, y) struct hlist_head x; > > > +#define DECLARE_HASHTABLE(name, bits) struct hlist_head name[1 << (bits)] > > > > > > -#define hash_init(x) INIT_HLIST_HEAD(&(x)) > > > -#define hash_add(x, y, z)hlist_add_head(y, &(x)) > > > -#define hash_del(x) hlist_del_init(x) > > > -#define hash_empty(x)hlist_empty(&(x)) > > > -#define hash_for_each_possible(a, b, c, d) \ > > > - hlist_for_each_entry(b, &(a), c) > > > -#define hash_for_each_safe(a, b, c, d, e) (void)(b); \ > > > - hlist_for_each_entry_safe(d, c, &(a), e) > > > +static inline void > > > +__hash_init(struct hlist_head *table, u_int size) > > > +{ > > > + u_int i; > > > + > > > + for (i = 0; i < size; i++) > > > + INIT_HLIST_HEAD(&table[i]); > > > +} > > > + > > > +static inline bool > > > +__hash_empty(struct hlist_head *table, u_int size) > > > +{ > > > + u_int i; > > > + > > > + for (i = 0; i < size; i++) { > > > + if (!hlist_empty(&table[i])) > > > + return false; > > > + } > > > + > > > + return true; > > > +} > > > + > > > +#define __hash(table, key) &table[key % (nitems(table) - 1)] > > > + > > > +#define hash_init(table) __hash_init(table, nitems(table)) > > > +#define hash_add(table, node, key) \ > > > + hlist_add_head(node, __hash(table, key)) > > > +#define hash_del(node) hlist_del_init(node) > > > +#define hash_empty(table)__hash_empty(table, nitems(table)) > > > +#define hash_for_each_possible(table, obj, member, key) \ 
> > > + hlist_for_each_entry(obj, __hash(table, key), member) > > > +#define hash_for_each_safe(table, i, tmp, obj, member) \ > > > + for (i = 0; i < nitems(table); i++) \ > > > +hlist_for_each_entry_safe(obj, tmp, &table[i], member) > > > > > > #define ACCESS_ONCE(x) (x) > > > > > > > > > >
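On the open question in the quoted message about a hash function that does not leave a bucket unused: one widely used option for power-of-two tables is multiplicative (Fibonacci) hashing. The sketch below is only an illustration of that general technique and is not something proposed in this thread or taken from the diff. The names hash_mult and GOLDEN_RATIO_32 are hypothetical here, and the code assumes 32-bit keys and a table of 1 << bits entries.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 divided by the golden ratio, as an odd 32-bit constant. */
#define GOLDEN_RATIO_32 0x9E3779B9u

/*
 * Multiplicative hash: multiply the key by a large odd constant
 * (the product wraps modulo 2^32) and keep the top 'bits' bits.
 * Returns a bucket index in the range [0, 1 << bits).
 * 'bits' must be between 1 and 31.
 */
static inline uint32_t
hash_mult(uint32_t key, unsigned int bits)
{
	assert(bits >= 1 && bits <= 31);
	return (uint32_t)(key * GOLDEN_RATIO_32) >> (32 - bits);
}
```

Because the result is taken from the high bits of the product, every index from 0 to (1 << bits) - 1 is reachable, and no division is needed. Whether this spreads the command parser's keys any better than the modulo approach would have to be measured; the sketch makes no claim about that.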
Re: inteldrm(4) diff needs review and testing
On 2017-07-16 15:19:41, Mark Kettenis wrote:
> Can somebody test the following diff on Ivy Bridge or Haswell (Intel
> HD Graphics 2500/4000/4600/4700/5000/5100/5200)?
>
> When I added support for the command parser, I took a bit of a
> shortcut and implemented the hash tables as a single linked list.
> This diff fixes that.
>
> For the hash function I used a "mod (size-1)" approach that leaves
> one of the hash table entries unused. Perhaps somebody with a CS
> background has a better idea that isn't too complicated to implement?
>

I haven't noticed any regressions or other negative effects due to
this patch on an HD 4600 in a Thinkpad T440p.

One thing that may be worth mentioning is that after the recent
update to the DRM code, I get a lot of dmesg spam in the form of

error: [drm:pid37898:intel_uncore_check_errors] *ERROR* Unclaimed register before interrupt

after booting the system.

They are present with and without your patch, so it must be due to
a separate issue.

They don't seem to be causing any negative effects (everything
seems to be working just fine). After booting, the system sits
for about 15-20 seconds printing these errors to the console as
fast as the system can print them.

After it's done, everything works normally, so at least for me it
seems to be mostly a cosmetic issue. However, I'm not sure if it
may be pointing to something that may affect other users.

Both an actual dmesg and output from running "dmesg" are attached.
As one can see, there are at least enough lines output to
completely scroll the initial system dmesg out of the buffer.

--- /var/run/dmesg.boot ---

OpenBSD 6.1-current (GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS) #15: Mon Jul 17 02:21:03 JST 2017
    shoshon...@shoshoni-m.shoshoni.info:/usr/src/sys/arch/amd64/compile/GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS
real mem = 12539871232 (11958MB)
avail mem = 12154036224 (11590MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xbcc0d000 (67 entries)
bios0: vendor LENOVO version "GLET85WW (2.39 )" date 09/29/2016
bios0: LENOVO 20AWS27D00
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SLIC DBGP ECDT HPET APIC MCFG SSDT SSDT SSDT SSDT SSDT SSDT SSDT PCCT SSDT TCPA UEFI MSDM ASF! BATB FPDT UEFI DMAR
acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) EXP3(S4) XHCI(S3) EHC1(S3) EHC2(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiec0 at acpi0
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2594.38 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC frequency 2594384360 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 1, core 0, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20,
Re: inteldrm(4) diff needs review and testing
On Mon, Jul 17, 2017 at 11:08:13AM +0900, Bryan Linton wrote:
> On 2017-07-16 15:19:41, Mark Kettenis wrote:
> > Can somebody test the following diff on Ivy Bridge or Haswell (Intel
> > HD Graphics 2500/4000/4600/4700/5000/5100/5200)?
> >
> > When I added support for the command parser, I took a bit of a
> > shortcut and implemented the hash tables as a single linked list.
> > This diff fixes that.
> >
> > For the hash function I used a "mod (size-1)" approach that leaves
> > one of the hash table entries unused. Perhaps somebody with a CS
> > background has a better idea that isn't too complicated to implement?
> >
>
> I haven't noticed any regressions or other negative effects due to
> this patch on an HD 4600 in a Thinkpad T440p.
>
> One thing that may be worth mentioning, is that after the recent
> update to the DRM code, I get a lot of dmesg spam in the form of
>
> error: [drm:pid37898:intel_uncore_check_errors] *ERROR* Unclaimed
> register before interrupt
>
> after booting the system.
>
> They are present with and without your patch, so it must be due to
> a separate issue.
>
> They don't seem to be causing any negative effects (everything
> seems to be working just fine). After booting, the system sits
> for about 15-20 seconds printing these errors to the console as
> fast as the system can print them.
>
> After it's done, everything works normally so at least for me, it
> seems to be mostly a cosmetic issue. However, I'm not sure if it
> may be pointing to something that may affect other users.
>
> Both an actual dmesg and output from running "dmesg" are attached.
> As one can see, there are at least enough lines output to
> completely scroll the initial system dmesg out of the buffer.
>

I experience the same, with and without the patch. It also fills the
dmesg buffer on my Dell M3800 but does not appear to cause any issues
other than a lot of logging.
I noticed it after I updated to a snapshot with random linking, not saying its related, just mentioning when I noticed it first. > --- /var/run/dmesg.boot --- > > OpenBSD 6.1-current (GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS) #15: Mon Jul 17 > 02:21:03 JST 2017 > > shoshon...@shoshoni-m.shoshoni.info:/usr/src/sys/arch/amd64/compile/GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS > real mem = 12539871232 (11958MB) > avail mem = 12154036224 (11590MB) > mpath0 at root > scsibus0 at mpath0: 256 targets > mainbus0 at root > bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xbcc0d000 (67 entries) > bios0: vendor LENOVO version "GLET85WW (2.39 )" date 09/29/2016 > bios0: LENOVO 20AWS27D00 > acpi0 at bios0: rev 2 > acpi0: sleep states S0 S3 S4 S5 > acpi0: tables DSDT FACP SLIC DBGP ECDT HPET APIC MCFG SSDT SSDT SSDT SSDT > SSDT SSDT SSDT PCCT SSDT TCPA UEFI MSDM ASF! BATB FPDT UEFI DMAR > acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) EXP3(S4) XHCI(S3) > EHC1(S3) EHC2(S3) > acpitimer0 at acpi0: 3579545 Hz, 24 bits > acpiec0 at acpi0 > acpihpet0 at acpi0: 14318179 Hz > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat > cpu0 at mainbus0: apid 0 (boot processor) > cpu0: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2594.38 MHz > cpu0: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT > cpu0: 256KB 64b/line 8-way L2 cache > cpu0: TSC frequency 2594384360 Hz > cpu0: smt 0, core 0, package 0 > mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges > cpu0: apic clock running at 99MHz > cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE > cpu1 at mainbus0: apid 1 (application processor) > cpu1: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz > cpu1: > 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT > cpu1: 256KB 64b/line 8-way L2 cache > cpu1: smt 1, core 0, package 0 > cpu2 at mainbus0: apid 2 (application processor) > cpu2: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz > cpu2: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT > cpu2: 256KB 64b/line 8-way L2 cache > cpu2: smt 0, core 1, package 0 > cpu3 at mainbus0: apid 3 (application processor) > cpu3: Intel(R) Core(TM
Re: inteldrm(4) diff needs review and testing
Mark Kettenis writes:
> Can somebody test the following diff on Ivy Bridge or Haswell (Intel
> HD Graphics 2500/4000/4600/4700/5000/5100/5200)?
>
> When I added support for the command parser, I took a bit of a
> shortcut and implemented the hash tables as a single linked list.
> This diff fixes that.
>
> For the hash function I used a "mod (size-1)" approach that leaves
> one of the hash table entries unused. Perhaps somebody with a CS
> background has a better idea that isn't too complicated to implement?
>
> Paul, Stuart, there is a small chance that this will improve the
> vncviewer performance.
>
>
> Index: dev/pci/drm/drm_linux.h
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.h,v
> retrieving revision 1.56
> diff -u -p -r1.56 drm_linux.h
> --- dev/pci/drm/drm_linux.h 14 Jul 2017 11:18:04 - 1.56
> +++ dev/pci/drm/drm_linux.h 16 Jul 2017 12:54:51 -
> @@ -40,6 +40,7 @@
>
>  #include
>  #include
> +#include
>
>  /* The Linux code doesn't meet our usual standards! */
>  #ifdef __clang__
> @@ -202,16 +203,42 @@ bitmap_weight(void *p, u_int n)
>  	return sum;
>  }
>
> -#define DECLARE_HASHTABLE(x, y) struct hlist_head x;
> +#define DECLARE_HASHTABLE(name, bits) struct hlist_head name[1 << (bits)]
>
> -#define hash_init(x) INIT_HLIST_HEAD(&(x))
> -#define hash_add(x, y, z)	hlist_add_head(y, &(x))
> -#define hash_del(x) hlist_del_init(x)
> -#define hash_empty(x)	hlist_empty(&(x))
> -#define hash_for_each_possible(a, b, c, d) \
> -	hlist_for_each_entry(b, &(a), c)
> -#define hash_for_each_safe(a, b, c, d, e) (void)(b); \
> -	hlist_for_each_entry_safe(d, c, &(a), e)
> +static inline void
> +__hash_init(struct hlist_head *table, u_int size)
> +{
> +	u_int i;
> +
> +	for (i = 0; i < size; i++)
> +		INIT_HLIST_HEAD(&table[i]);
> +}
> +
> +static inline bool
> +__hash_empty(struct hlist_head *table, u_int size)
> +{
> +	u_int i;
> +
> +	for (i = 0; i < size; i++) {
> +		if (!hlist_empty(&table[i]))
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +#define __hash(table, key) &table[key % (nitems(table) - 1)]
> +
> +#define hash_init(table) __hash_init(table, nitems(table))
> +#define hash_add(table, node, key) \
> +	hlist_add_head(node, __hash(table, key))
> +#define hash_del(node) hlist_del_init(node)
> +#define hash_empty(table)	__hash_empty(table, nitems(table))
> +#define hash_for_each_possible(table, obj, member, key) \
> +	hlist_for_each_entry(obj, __hash(table, key), member)
> +#define hash_for_each_safe(table, i, tmp, obj, member) \
> +	for (i = 0; i < nitems(table); i++) \
> +	    hlist_for_each_entry_safe(obj, tmp, &table[i], member)
>
>  #define ACCESS_ONCE(x) (x)
>

Seems to work here on HD4000. Quickly tested browsing, watching videos
and suspend/resume. I didn't notice any regressions.

It still prints the following lines, though:

error: [drm:pid0:cpt_set_fifo_underrun_reporting] *ERROR* uncleared pch fifo underrun on pch transcoder A
error: [drm:pid0:intel_pch_fifo_underrun_irq_handler] *ERROR* PCH transcoder A FIFO underrun

These probably appeared once the intel driver was switched. So far they
don't seem to cause any noticeable issues.

Timo

OpenBSD 6.1-current (GENERIC.MP) #8: Mon Jul 17 08:58:12 EEST 2017
    tmy@phobos.TeleWell.gateway:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 16973611008 (16187MB)
avail mem = 16453402624 (15691MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xdae9c000 (68 entries)
bios0: vendor LENOVO version "G7ETA4WW (2.64 )" date 10/08/2015
bios0: LENOVO 2355C16
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SLIC TCPA SSDT SSDT SSDT HPET APIC MCFG ECDT FPDT ASF! UEFI UEFI POAT SSDT SSDT DMAR UEFI DBG2
acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP3(S4) XHCI(S3) EHC1(S3) EHC2(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.56 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC frequency 2594563520 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.12 MH
Re: inteldrm(4) diff needs review and testing
On Sun, Jul 16, 2017 at 03:19:41PM +0200, Mark Kettenis wrote: > Can somebody test the following diff on Ivy Bridge or Haswell (Intel > HD Graphics 2500/4000/4600/4700/5000/5100/5200)? > > When I added support for the command parser, I took a bit of a > shortcut and implemented the hash tables as a single linked list. > This diff fixes that. > no regression and no more dmesg spam for me. Thanks Giovanni > For the hash function I used a "mode (size-1)" approach that leaves > one of the hash table entries unused. Perhaps somebody with a CS > background has a better idea that isn't too complicated to implement? > > Paul, Stuart, there is a small chance that this will improve the > vncviewer performance. > > > Index: dev/pci/drm/drm_linux.h > === > RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.h,v > retrieving revision 1.56 > diff -u -p -r1.56 drm_linux.h > --- dev/pci/drm/drm_linux.h 14 Jul 2017 11:18:04 - 1.56 > +++ dev/pci/drm/drm_linux.h 16 Jul 2017 12:54:51 - > @@ -40,6 +40,7 @@ > > #include > #include > +#include > > /* The Linux code doesn't meet our usual standards! 
*/ > #ifdef __clang__ > @@ -202,16 +203,42 @@ bitmap_weight(void *p, u_int n) > return sum; > } > > -#define DECLARE_HASHTABLE(x, y) struct hlist_head x; > +#define DECLARE_HASHTABLE(name, bits) struct hlist_head name[1 << (bits)] > > -#define hash_init(x) INIT_HLIST_HEAD(&(x)) > -#define hash_add(x, y, z)hlist_add_head(y, &(x)) > -#define hash_del(x) hlist_del_init(x) > -#define hash_empty(x)hlist_empty(&(x)) > -#define hash_for_each_possible(a, b, c, d) \ > - hlist_for_each_entry(b, &(a), c) > -#define hash_for_each_safe(a, b, c, d, e) (void)(b); \ > - hlist_for_each_entry_safe(d, c, &(a), e) > +static inline void > +__hash_init(struct hlist_head *table, u_int size) > +{ > + u_int i; > + > + for (i = 0; i < size; i++) > + INIT_HLIST_HEAD(&table[i]); > +} > + > +static inline bool > +__hash_empty(struct hlist_head *table, u_int size) > +{ > + u_int i; > + > + for (i = 0; i < size; i++) { > + if (!hlist_empty(&table[i])) > + return false; > + } > + > + return true; > +} > + > +#define __hash(table, key) &table[key % (nitems(table) - 1)] > + > +#define hash_init(table) __hash_init(table, nitems(table)) > +#define hash_add(table, node, key) \ > + hlist_add_head(node, __hash(table, key)) > +#define hash_del(node) hlist_del_init(node) > +#define hash_empty(table)__hash_empty(table, nitems(table)) > +#define hash_for_each_possible(table, obj, member, key) \ > + hlist_for_each_entry(obj, __hash(table, key), member) > +#define hash_for_each_safe(table, i, tmp, obj, member) \ > + for (i = 0; i < nitems(table); i++) \ > +hlist_for_each_entry_safe(obj, tmp, &table[i], member) > > #define ACCESS_ONCE(x) (x) > > OpenBSD 6.1-current (GENERIC.MP) #11: Mon Jul 17 14:35:43 CEST 2017 giova...@bigio.paclan.it:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 4183527424 (3989MB) avail mem = 4050944000 (3863MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.6 @ 0xccbff000 (42 entries) bios0: vendor TOSHIBA version "Version 5.00" date 02/15/2017 bios0: TOSHIBA PORTEGE R30-A acpi0 at bios0: rev 0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP HPET APIC MCFG BOOT MSDM SLIC SSDT SSDT SSDT ASF! SSDT SSDT SSDT DMAR FPDT acpi0: wakeup devices GLAN(S4) EHC1(S3) EHC2(S3) XHC_(S3) HDEF(S3) RP06(S4) PXSX(S4) PWRB(S4) LID_(S4) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 14318179 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i5-4310M CPU @ 2.70GHz, 2694.14 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT cpu0: 256KB 64b/line 8-way L2 cache cpu0: TSC frequency 2694141120 Hz cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM) i5-4310M CPU @ 2.70GHz, 2693.77 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1
Re: inteldrm(4) diff needs review and testing
Hi Mark, On Sun, Jul 16, 2017 at 03:19:41PM +0200, Mark Kettenis wrote: | Can somebody test the following diff on Ivy Bridge or Haswell (Intel | HD Graphics 2500/4000/4600/4700/5000/5100/5200)? | | When I added support for the command parser, I took a bit of a | shortcut and implemented the hash tables as a single linked list. | This diff fixes that. | | For the hash function I used a "mode (size-1)" approach that leaves | one of the hash table entries unused. Perhaps somebody with a CS | background has a better idea that isn't too complicated to implement? | | Paul, Stuart, there is a small chance that this will improve the | vncviewer performance. It doesn't (vncviewer still consumes a full CPU core), but otherwise there are no regressions from the previous situation. Thanks Mark! Paul OpenBSD 6.1-current (GENERIC.MP) #6: Mon Jul 17 09:52:56 CEST 2017 we...@pom.alm.weirdnet.nl:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 34243919872 (32657MB) avail mem = 33200308224 (31662MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xec410 (88 entries) bios0: vendor Dell Inc. version "A12" date 05/06/2015 bios0: Dell Inc. OptiPlex 9020 acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP APIC FPDT SLIC LPIT SSDT SSDT SSDT HPET SSDT MCFG SSDT ASF! DMAR acpi0: wakeup devices UAR1(S3) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S3) EHC2(S3) XHC_(S4) HDEF(S4) PEG0(S4) PEGP(S4) PEG1(S4) [...] 
acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3392.85 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT cpu0: 256KB 64b/line 8-way L2 cache cpu0: TSC frequency 3392845890 Hz cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3392.15 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 0, core 1, package 0 cpu2 at mainbus0: apid 4 (application processor) cpu2: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3392.15 MHz cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, core 2, package 0 cpu3 at mainbus0: apid 6 (application processor) cpu3: Intel(R) Core(TM) i7-4770 CPU 
@ 3.40GHz, 3392.15 MHz cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT cpu3: 256KB 64b/line 8-way L2 cache cpu3: smt 0, core 3, package 0 cpu4 at mainbus0: apid 1 (application processor) cpu4: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3392.15 MHz cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT cpu4: 256KB 64b/line 8-way L2 cache cpu4: smt 1, core 0, package 0 cpu5 at mainbus0: apid 3 (application processor) cpu5: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3392.15 MHz cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSB