[IRQ] IRQ affinity not working properly?

2021-01-29 Thread Chris Friesen
Hi, I'm not subscribed to the list, please cc me on replies. I have a CentOS 7 Linux system with 48 logical CPUs and a number of Intel NICs running the i40e driver. It was booted with irqaffinity=0-1,24-25 in the kernel boot args, resulting in /proc/irq/default_smp_affinity showing
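For readers who want to check what mask a CPU list such as 0-1,24-25 corresponds to, here is a minimal sketch. The helper names `parse_cpulist` and `cpus_to_mask` are made up for illustration, and the comma-separated hex layout (32-bit groups, most significant first, top group sized to the remaining CPUs) is an assumption about how the kernel prints CPU masks; the value the poster actually saw is cut off in the excerpt and is not reproduced here.

```python
def parse_cpulist(cpulist):
    """Expand a kernel-style CPU list such as "0-1,24-25" into a set of CPU ids."""
    cpus = set()
    for part in cpulist.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_to_mask(cpus, nr_cpus):
    """Format a CPU set as comma-separated hex groups, most significant first.

    Assumed layout: 32-bit groups, with the top group only as wide as the
    CPUs left over (for example 16 bits on a 48-CPU system).
    """
    value = sum(1 << cpu for cpu in cpus)
    words = []
    shift = 0
    while shift < nr_cpus:
        bits = min(32, nr_cpus - shift)       # width of this group in bits
        word = (value >> shift) & ((1 << bits) - 1)
        digits = (bits + 3) // 4              # hex digits needed for this group
        words.append(f"{word:0{digits}x}")
        shift += 32
    return ",".join(reversed(words))


mask = cpus_to_mask(parse_cpulist("0-1,24-25"), 48)
print(mask)
```

CPUs 0, 1, 24 and 25 set bits 0, 1, 24 and 25, so the low 32-bit group is 0x03000003 and the upper group is zero.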

Re: IRQs in /proc/irq/* that aren't listed in /proc/interrupts?

2020-10-13 Thread Chris Friesen
On 10/12/2020 1:37 PM, Thomas Gleixner wrote: On Mon, Oct 12 2020 at 12:40, Chris Friesen wrote: On one of my X86-64 systems /proc/interrupts starts with the following interrupts (per-cpu info snipped): 0: IR-IO-APIC-edge timer 4: IR-IO-APIC-edge serial 8

IRQs in /proc/irq/* that aren't listed in /proc/interrupts?

2020-10-12 Thread Chris Friesen
Hi, On one of my X86-64 systems /proc/interrupts starts with the following interrupts (per-cpu info snipped): 0: IR-IO-APIC-edge timer 4: IR-IO-APIC-edge serial 8: IR-IO-APIC-edge rtc0 9: IR-IO-APIC-fasteoi acpi 17: IR-IO-APIC-fasteoi

[RT] should pm_qos_resume_latency_us on one CPU affect latency on another?

2019-08-13 Thread Chris Friesen
Hi all, Just wondering if what I'm seeing is expected. I'm using the CentOS 7 RT kernel with boot args of "skew_tick=1 irqaffinity=0 rcu_nocbs=1-27 nohz_full=1-27" among others. Normally if I run cyclictest it sets /dev/cpu_dma_latency to zero. This gives worst-case latency around
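As background to the /dev/cpu_dma_latency behaviour mentioned above, the following is a rough sketch of how a userspace program can place such a request: it writes a 32-bit latency value in microseconds to the device and keeps the file descriptor open for as long as the request should stay in force. The function names are illustrative only, and the code needs sufficient privileges on a Linux system to actually open the device.

```python
import os
import struct


def encode_latency(max_latency_us):
    """Pack the latency bound as a native-endian signed 32-bit integer."""
    return struct.pack("=i", max_latency_us)


def request_cpu_dma_latency(max_latency_us, path="/dev/cpu_dma_latency"):
    """Open the PM QoS device and write the requested bound.

    The caller must keep the returned descriptor open; closing it
    drops the request again.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(max_latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

A caller would do something like `fd = request_cpu_dma_latency(0)`, run the latency-sensitive work, and then `os.close(fd)`.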

[RT] hit recently-fixed PREEMPT_RT CFS-bandwidth timer locking issue in the wild

2019-07-26 Thread Chris Friesen
Hi all, I thought people might be interested to hear that we recently hit the bug fixed by git commit c0ad4aa4d8 on multiple lab systems running the RHEL 7 "kernel-rt" kernel. (But I think other versions are at risk as well.) Interestingly, when the bug hit the system just hung completely.

Re: "swap_free: Bad swap file entry" and "BUG: Bad page map in process" but no swap configured

2017-11-21 Thread Chris Friesen
Huaitong Han 2016-10-12 0:02 GMT+08:00 Chris Friesen <chris.frie...@windriver.com>: On 10/08/2016 02:05 AM, Hillf Danton wrote: On Friday, October 07, 2016 5:01 AM Chris Friesen I have a Linux host running as a kvm hypervisor. It's running CentOS. (So the kernel is based on 3.

Re: "swap_free: Bad swap file entry" and "BUG: Bad page map in process" but no swap configured

2017-11-21 Thread Chris Friesen
Huaitong Han 2016-10-12 0:02 GMT+08:00 Chris Friesen: On 10/08/2016 02:05 AM, Hillf Danton wrote: On Friday, October 07, 2016 5:01 AM Chris Friesen I have a Linux host running as a kvm hypervisor. It's running CentOS. (So the kernel is based on 3.10 but with loads of stuff backported by

Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration

2017-02-24 Thread Chris Friesen
On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote: On 2017/2/22 22:43, Paolo Bonzini wrote: Hopefully Gaohuai and Rongguang can help with this too. Paolo . Yes, we are looking into and testing this. I think this can result in any memory corruption, if VM1 writes its PML buffer into

Re: "swap_free: Bad swap file entry" and "BUG: Bad page map in process" but no swap configured

2016-10-11 Thread Chris Friesen
On 10/08/2016 02:05 AM, Hillf Danton wrote: On Friday, October 07, 2016 5:01 AM Chris Friesen I have a Linux host running as a kvm hypervisor. It's running CentOS. (So the kernel is based on 3.10 but with loads of stuff backported by RedHat.) I realize this is not a mainline kernel, but I

"swap_free: Bad swap file entry" and "BUG: Bad page map in process" but no swap configured

2016-10-06 Thread Chris Friesen
I have a Linux host running as a kvm hypervisor. It's running CentOS. (So the kernel is based on 3.10 but with loads of stuff backported by RedHat.) I realize this is not a mainline kernel, but I was wondering if anyone is aware of similar issues that had been fixed in mainline. When doing

help? usage of indirect per-cpu variables

2016-09-27 Thread Chris Friesen
Hi, I'm trying to wrap my head around indirect percpu variables, and I'm hoping someone can school me on how they work. For example, in mm/slub.c we have "struct kmem_cache *s". s->cpu_slab is a per-cpu variable, so we access it with something like: c = raw_cpu_ptr(s->cpu_slab);

why would we be spending a whole msec running just the cascade() function?

2016-06-24 Thread Chris Friesen
Hi, I'm trying to get an idea why we would be spending a whole millisecond running the cascade() routine. This is on a CentOS 7 kernel (config modified as per below) so feel free to send me off to the other support fora if you like. Running cyclictest gave the following: cyclicte-29932

[tip:sched/core] sched/cputime: Fix steal_account_process_tick() to always return jiffies

2016-03-08 Thread tip-bot for Chris Friesen
Commit-ID: f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d Gitweb: http://git.kernel.org/tip/f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d Author: Chris Friesen <cbf...@mail.usask.ca> AuthorDate: Sat, 5 Mar 2016 23:18:48 -0600 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Tue, 8

[tip:sched/core] sched/cputime: Fix steal_account_process_tick() to always return jiffies

2016-03-08 Thread tip-bot for Chris Friesen
Commit-ID: f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d Gitweb: http://git.kernel.org/tip/f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d Author: Chris Friesen AuthorDate: Sat, 5 Mar 2016 23:18:48 -0600 Committer: Ingo Molnar CommitDate: Tue, 8 Mar 2016 12:24:56 +0100 sched/cputime: Fix

[PATCH v2] sched/cputime: steal_account_process_tick() should return jiffies

2016-03-05 Thread Chris Friesen
time and only account it once it's worth a jiffy. (Thanks to Frederic Weisbecker for suggestions to fix a bug in my first version of the patch.) Signed-off-by: Chris Friesen <chris.frie...@windriver.com> --- kernel/sched/cputime.c | 14 +++--- 1 file changed, 7 insertions(+), 7 del

[PATCH v2] sched/cputime: steal_account_process_tick() should return jiffies

2016-03-05 Thread Chris Friesen
time and only account it once it's worth a jiffy. (Thanks to Frederic Weisbecker for suggestions to fix a bug in my first version of the patch.) Signed-off-by: Chris Friesen --- kernel/sched/cputime.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/kernel/sched

Re: [PATCH] steal_account_process_tick() should return jiffies

2016-03-05 Thread Chris Friesen
On 03/05/2016 07:19 AM, Frederic Weisbecker wrote: On Sat, Mar 05, 2016 at 11:27:01AM +0100, Thomas Gleixner wrote: Chris, On Fri, 4 Mar 2016, Chris Friesen wrote: First of all the subject line should contain a subsystem prefix, i.e. "sched/cputime:" T

[PATCH] steal_account_process_tick() should return jiffies

2016-03-04 Thread Chris Friesen
is to change steal_account_process_tick() to always return jiffies. If CONFIG_VIRT_CPU_ACCOUNTING_GEN is not enabled then this is a no-op. As far as I can tell this bug has been present since commit dee08a72. Signed-off-by: Chris Friesen <chris.frie...@windriver.com> --- kernel/sched/cputime

[PATCH] steal_account_process_tick() should return jiffies

2016-03-04 Thread Chris Friesen
is to change steal_account_process_tick() to always return jiffies. If CONFIG_VIRT_CPU_ACCOUNTING_GEN is not enabled then this is a no-op. As far as I can tell this bug has been present since commit dee08a72. Signed-off-by: Chris Friesen --- kernel/sched/cputime.c | 2 +- 1 file changed, 1

Re: question about logic of steal_account_process_tick() ?

2016-03-04 Thread Chris Friesen
On 03/04/2016 01:51 PM, Chris Friesen wrote: The thing is, steal_account_process_tick() returns units of cputime, which I think is nanoseconds on x86_64. So if we have a tiny amount of stolen time it seems like that will prevent a whole tick from being accounted into user/system/idle. I feel
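The unit mismatch described above can be illustrated with a small model. This is a sketch in Python, not the kernel code: the class and function names are invented, HZ=1000 is an assumed tick rate, and the "only account steal time once it is worth a whole jiffy" behaviour follows the wording of the later patch descriptions rather than any actual diff.

```python
NSEC_PER_SEC = 1_000_000_000
HZ = 1000                              # assumed tick rate
NSEC_PER_JIFFY = NSEC_PER_SEC // HZ


class StealAccounting:
    """Tracks how much cumulative steal time has already been accounted."""

    def __init__(self):
        self.accounted_jiffies = 0

    def steal_jiffies_for_tick(self, total_steal_ns):
        """Return newly stolen whole jiffies, given cumulative steal time in ns."""
        total_jiffies = total_steal_ns // NSEC_PER_JIFFY
        new_jiffies = total_jiffies - self.accounted_jiffies
        self.accounted_jiffies = total_jiffies
        return new_jiffies


def account_tick(acct, total_steal_ns, counters):
    """Model one tick: charge it to steal only if a whole jiffy was stolen."""
    stolen = acct.steal_jiffies_for_tick(total_steal_ns)
    if stolen:
        counters["steal"] += stolen
    else:
        counters["user"] += 1          # tick goes to the running task instead


acct = StealAccounting()
counters = {"user": 0, "steal": 0}
for cumulative_ns in (400_000, 800_000, 1_200_000):
    account_tick(acct, cumulative_ns, counters)
print(counters)
```

With 0.4 ms stolen per tick, the first two ticks are charged to the task and the third, once a full millisecond has accumulated, is charged to steal. If the helper returned raw nanoseconds instead, every tick with any stolen time at all would be swallowed, which is the concern raised in the message.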

question about logic of steal_account_process_tick() ?

2016-03-04 Thread Chris Friesen
I'm trying to wrap my head around how steal_account_process_tick() interacts with account_process_tick(). Suppose we have CONFIG_VIRT_CPU_ACCOUNTING_GEN=y and CONFIG_NO_HZ_IDLE, with a cpu hog on cpu0 to prevent it going idle. As I understand it, account_process_tick() will be called once

Re: weird /proc/stat output with newer (4.1, 4.2) kernels in kvm guest on 3.10 host?

2016-03-03 Thread Chris Friesen
On 03/02/2016 11:12 AM, Chris Friesen wrote: I'm running a 3.10-based host with libvirt 1.2.12 and qemu 2.2. Running a Fedora23 cloud image as a guest, the "cpu" lines in /proc/stat seem to be hardly changing: [fedora@fedora23 boot]$ uptime 17:01:50 up 44 min, 1 user, load ave

weird /proc/stat output with newer (4.1, 4.2) kernels in kvm guest on 3.10 host?

2016-03-02 Thread Chris Friesen
Hi, I'm running a 3.10-based host with libvirt 1.2.12 and qemu 2.2. Running a Fedora23 cloud image as a guest, the "cpu" lines in /proc/stat seem to be hardly changing: [fedora@fedora23 boot]$ uptime 17:01:50 up 44 min, 1 user, load average: 3.00, 2.99, 2.79 [fedora@fedora23 boot]$ grep

Re: question about cpusets vs sched_setaffinity()

2015-12-11 Thread Chris Friesen
On 12/11/2015 04:15 PM, Jason Baron wrote: On 12/10/2015 04:30 PM, Chris Friesen wrote: If I put a task into a cpuset and then call sched_setaffinity() on it, it will be affined to the intersection of the two sets of cpus. (Those specified on the set, and those specified in the syscall

question about cpusets vs sched_setaffinity()

2015-12-10 Thread Chris Friesen
Hi, I've got a question about the interaction between cpusets and sched_setaffinity(). If I put a task into a cpuset and then call sched_setaffinity() on it, it will be affined to the intersection of the two sets of cpus. (Those specified on the set, and those specified in the syscall.)
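The intersection behaviour described in the question can be modelled with plain sets. This is only an illustrative sketch: `effective_affinity` is a hypothetical helper, and raising an error for an empty intersection is an assumption meant to mirror a rejected request, not a statement about the exact kernel error path.

```python
def effective_affinity(cpuset_cpus, requested_cpus):
    """Return the CPUs a task could end up on: cpuset CPUs AND requested CPUs."""
    allowed = set(cpuset_cpus) & set(requested_cpus)
    if not allowed:
        # Assumed behaviour: a request with no usable CPU is refused.
        raise ValueError("requested mask has no CPU inside the cpuset")
    return allowed


cpuset = {0, 1, 2, 3}          # CPUs granted by the cpuset
wanted = {2, 3, 4, 5}          # CPUs passed to sched_setaffinity()
print(sorted(effective_affinity(cpuset, wanted)))
```

On a Linux system the same idea can be tried for real with Python's `os.sched_getaffinity(0)` and `os.sched_setaffinity(0, cpus)`, which wrap the underlying system calls.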

weird interaction between kvm and NO_HZ_FULL?

2015-03-20 Thread Chris Friesen
Hi, I'm running 3.10 (yeah, I know) and I'm playing with CONFIG_NO_HZ_FULL. I'm getting a strange result where some CPUs are able to turn off local timer interrupts and others aren't. Is there a known interaction between kvm-based VMs and CONFIG_NO_HZ_FULL? Background: I've got an x86-64

Re: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-07 Thread Chris Friesen
On 11/07/2014 01:17 PM, Martin K. Petersen wrote: I'd suggest trying /dev/sgN instead. That seems to work. Much appreciated. And it's now showing an "optimal_io_size" of 0, so I think the issue is dealt with. Thanks for all the help, it's been educational. :) Chris -- To unsubscribe

Re: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-07 Thread Chris Friesen
On 11/07/2014 10:25 AM, Martin K. Petersen wrote: "Chris" == Chris Friesen writes: Chris, Chris> Also, I think it's wrong for filesystems and userspace to use it Chris> for alignment. In E.4 and E.5 in the "sbc3r25

Re: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-07 Thread Chris Friesen
On 11/07/2014 11:42 AM, Martin K. Petersen wrote: "Martin" == Martin K Petersen writes: Martin> I know there was a bug open with Seagate. I assume it has been Martin> fixed in their latest firmware. Seagate confirms that this issue was fixed about a year ago. Will provide more data when I

Re: absurdly high optimal_io_size on Seagate SAS disk

2014-11-07 Thread Chris Friesen
On 11/07/2014 11:42 AM, Martin K. Petersen wrote: "Martin" == Martin K Petersen <martin.peter...@oracle.com> writes: Martin> I know there was a bug open with Seagate. I assume it has been Martin> fixed in their latest firmware. Seagate confirms that this issue was fixed about a year ago. Will

Re: absurdly high optimal_io_size on Seagate SAS disk

2014-11-07 Thread Chris Friesen
On 11/07/2014 10:25 AM, Martin K. Petersen wrote: "Chris" == Chris Friesen <chris.frie...@windriver.com> writes: Chris, Chris> Also, I think it's wrong for filesystems and userspace to use it Chris> for alignment. In E.4 and E.5 in the sbc3r25.pdf doc, it looks Chris> like they use the optimal

Re: absurdly high optimal_io_size on Seagate SAS disk

2014-11-07 Thread Chris Friesen
On 11/07/2014 01:17 PM, Martin K. Petersen wrote: I'd suggest trying /dev/sgN instead. That seems to work. Much appreciated. And it's now showing an optimal_io_size of 0, so I think the issue is dealt with. Thanks for all the help, it's been educational. :) Chris -- To unsubscribe from

Re: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 07:56 PM, Martin K. Petersen wrote: "Chris" == Chris Friesen writes: Chris, Chris> For a RAID card I expect it would be related to chunk size or Chris> stripe width or something...but even then I would expect to be Chris> able to cap it at 100MB or so. O

Re: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 12:12 PM, Martin K. Petersen wrote: "Chris" == Chris Friesen writes: Chris> That'd work, but is it the best way to go? I mean, I found one Chris> report of a similar problem on an SSD (model number unknown). In Chris> that case it was a near-UINT_MAX value a

Re: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 11:34 AM, Martin K. Petersen wrote: "Chris" == Chris Friesen writes: Chris> Perhaps the ST900MM0026 should be blacklisted as well? Sure. I'll widen the net a bit for that Seagate model. That'd work, but is it the best way to go? I mean, I found one report

Re: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 10:47 AM, Chris Friesen wrote: Hi, I'm running a modified 3.4-stable on relatively recent X86 server-class hardware. I recently installed a Seagate ST900MM0026 (900GB 2.5in 10K SAS drive) and it's reporting a value of 4294966784 for optimal_io_size. The other parameters look

absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-06 Thread Chris Friesen
Hi, I'm running a modified 3.4-stable on relatively recent X86 server-class hardware. I recently installed a Seagate ST900MM0026 (900GB 2.5in 10K SAS drive) and it's reporting a value of 4294966784 for optimal_io_size. The other parameters look normal though:
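A quick bit of arithmetic shows why the reported value looks suspicious. The sketch below only examines the number itself; the helper name `wrapped_u32_product` is made up, and the idea that the value could arise from a block count of 0xFFFFFFFF multiplied by a 512-byte block size and truncated to 32 bits is offered as one arithmetic possibility, not as something the excerpt establishes about this drive.

```python
U32_MODULUS = 1 << 32
reported = 4294966784            # optimal_io_size value from the report


def wrapped_u32_product(blocks, block_size):
    """Multiply and keep only the low 32 bits, as an unsigned 32-bit field would."""
    return (blocks * block_size) % U32_MODULUS


print(hex(reported))                              # hex form of the reported value
print(U32_MODULUS - reported)                     # distance below 2**32
print(wrapped_u32_product(0xFFFFFFFF, 512))       # all-ones block count times 512
```

The value is 0xfffffe00, exactly 512 bytes short of 2**32, and matches what an all-ones 32-bit block count times 512 gives after wrapping.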

absurdly high optimal_io_size on Seagate SAS disk

2014-11-06 Thread Chris Friesen
Hi, I'm running a modified 3.4-stable on relatively recent X86 server-class hardware. I recently installed a Seagate ST900MM0026 (900GB 2.5in 10K SAS drive) and it's reporting a value of 4294966784 for optimal_io_size. The other parameters look normal though:

Re: absurdly high optimal_io_size on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 10:47 AM, Chris Friesen wrote: Hi, I'm running a modified 3.4-stable on relatively recent X86 server-class hardware. I recently installed a Seagate ST900MM0026 (900GB 2.5in 10K SAS drive) and it's reporting a value of 4294966784 for optimal_io_size. The other parameters look

Re: absurdly high optimal_io_size on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 11:34 AM, Martin K. Petersen wrote: "Chris" == Chris Friesen <chris.frie...@windriver.com> writes: Chris> Perhaps the ST900MM0026 should be blacklisted as well? Sure. I'll widen the net a bit for that Seagate model. That'd work, but is it the best way to go? I mean, I found one

Re: absurdly high optimal_io_size on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 12:12 PM, Martin K. Petersen wrote: "Chris" == Chris Friesen <chris.frie...@windriver.com> writes: Chris> That'd work, but is it the best way to go? I mean, I found one Chris> report of a similar problem on an SSD (model number unknown). In Chris> that case it was a near-UINT_MAX

Re: absurdly high optimal_io_size on Seagate SAS disk

2014-11-06 Thread Chris Friesen
On 11/06/2014 07:56 PM, Martin K. Petersen wrote: "Chris" == Chris Friesen <chris.frie...@windriver.com> writes: Chris, Chris> For a RAID card I expect it would be related to chunk size or Chris> stripe width or something...but even then I would expect to be Chris> able to cap it at 100MB or so

Re: semantics of reader/writer semaphores in rt patch

2014-10-27 Thread Chris Friesen
On 10/25/2014 04:19 PM, Thomas Gleixner wrote: On Thu, 23 Oct 2014, Chris Friesen wrote: I recently noticed that when CONFIG_PREEMPT_RT_FULL is enabled the semantics change. From "include/linux/rwsem_rt.h": * Note that the semantics are different from the usual * Lin

Re: semantics of reader/writer semaphores in rt patch

2014-10-27 Thread Chris Friesen
On 10/25/2014 04:19 PM, Thomas Gleixner wrote: On Thu, 23 Oct 2014, Chris Friesen wrote: I recently noticed that when CONFIG_PREEMPT_RT_FULL is enabled the semantics change. From include/linux/rwsem_rt.h: * Note that the semantics are different from the usual * Linux rw-sems

semantics of reader/writer semaphores in rt patch?

2014-10-24 Thread Chris Friesen
I recently noticed that when CONFIG_PREEMPT_RT_FULL is enabled the semantics change. From "include/linux/rwsem_rt.h": * Note that the semantics are different from the usual * Linux rw-sems, in PREEMPT_RT mode we do not allow * multiple readers to hold the lock at once, we only allow * a

semantics of reader/writer semaphores in rt patch?

2014-10-24 Thread Chris Friesen
I recently noticed that when CONFIG_PREEMPT_RT_FULL is enabled the semantics change. From include/linux/rwsem_rt.h: * Note that the semantics are different from the usual * Linux rw-sems, in PREEMPT_RT mode we do not allow * multiple readers to hold the lock at once, we only allow * a

[tracing] trying to make sense of trace output, can't figure out where time is going

2014-09-03 Thread Chris Friesen
p, CPU C states, and NMI on error. Is there a different tracer that would give more insight? The irqs-off or preemption-off tracers perhaps? Please CC me as I'm not subscribed to the list. Thanks, Chris Friesen -- To unsubscribe from this list: send the line "unsubscribe linux-kernel"

[tracing] trying to make sense of trace output, can't figure out where time is going

2014-09-03 Thread Chris Friesen
states, and NMI on error. Is there a different tracer that would give more insight? The irqs-off or preemption-off tracers perhaps? Please CC me as I'm not subscribed to the list. Thanks, Chris Friesen -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body

Re: [PATCH/RFC] Re: recvmmsg() timeout behavior strangeness [RESEND]

2014-05-28 Thread Chris Friesen
On 05/28/2014 01:50 PM, 'Arnaldo Carvalho de Melo' wrote: What is being discussed here is how to return the EFAULT that may happen _after_ datagram processing, be it interrupted by an EFAULT, signal, or plain returning all that was requested, with no errors. This EFAULT _after_ datagram

[bug?] can't unmount filesystem after running "exportfs -u" -- dangling refcount?

2014-03-21 Thread Chris Friesen
I'm seeing some odd behaviour on an NFS server. We're running 3.4.82 and have a drbd-replicated filesystem (active/standby) that is exported via NFS (explicitly v3, and /proc/fs/nfsd is mounted). On a server switchover we take down the NFS server IP address, run "exportfs -u" on the

[bug?] can't unmount filesystem after running exportfs -u -- dangling refcount?

2014-03-21 Thread Chris Friesen
I'm seeing some odd behaviour on an NFS server. We're running 3.4.82 and have a drbd-replicated filesystem (active/standby) that is exported via NFS (explicitly v3, and /proc/fs/nfsd is mounted). On a server switchover we take down the NFS server IP address, run exportfs -u on the exports,

Re: mmap vs fs cache

2013-03-08 Thread Chris Friesen
On 03/08/2013 09:00 AM, Howard Chu wrote: First obvious conclusion - kswapd is being too aggressive. When free memory hits the low watermark, the reclaim shrinks slapd down from 25GB to 18-19GB, while the page cache still contains ~7GB of unmapped pages. Ideally I'd like a tuning knob so I can

Re: mmap vs fs cache

2013-03-08 Thread Chris Friesen
On 03/08/2013 03:40 AM, Howard Chu wrote: There is no way that a process that is accessing only 30GB of a mmap should be able to fill up 32GB of RAM. There's nothing else running on the machine, I've killed or suspended everything else in userland besides a couple shells running top and vmstat.

Re: [GIT PULL] Load keys from signed PE binaries

2013-02-28 Thread Chris Friesen
On 02/28/2013 01:57 AM, Florian Weimer wrote: In any case, there's another reading of the UEFI Secure Boot requirements: you may run any code you wish after calling ExitBootServices(). That could be an unsigned, traditional GRUB. But this will not generally address the issue of dual-booting

Re: [GIT PULL] Load keys from signed PE binaries

2013-02-27 Thread Chris Friesen
On 02/27/2013 11:59 AM, Theodore Ts'o wrote: On Wed, Feb 27, 2013 at 11:36:09AM -0600, Chris Friesen wrote: ... At this point you've got a running infected Win8 install that is running on Secure Boot hardware but is actually running malware. Admittedly this would be tricky to do reliably

Re: [GIT PULL] Load keys from signed PE binaries

2013-02-27 Thread Chris Friesen
On 02/27/2013 09:24 AM, Theodore Ts'o wrote: On Tue, Feb 26, 2013 at 11:54:51AM -0500, Peter Jones wrote: No, no, no. Quit saying nobody knows. We've got a pretty good idea - we've got a contract with them, and it says they provide the signing service, and under circumstances where the thing

Re: [GIT PULL] Load keys from signed PE binaries

2013-02-26 Thread Chris Friesen
On 02/26/2013 03:40 PM, Florian Weimer wrote: * Chris Friesen: On 02/25/2013 10:14 AM, Matthew Garrett wrote: Windows 8 will not load unsigned drivers if Secure Boot is enabled. For reference: http://msdn.microsoft.com/en-us/library/windows/desktop/hh848062%28v=vs.85%29.aspx Thanks. Do

Re: [GIT PULL] Load keys from signed PE binaries

2013-02-25 Thread Chris Friesen
On 02/25/2013 10:14 AM, Matthew Garrett wrote: On Mon, Feb 25, 2013 at 04:50:50PM +0100, Florian Weimer wrote: * Matthew Garrett: On Mon, Feb 25, 2013 at 03:46:14PM +0100, Florian Weimer wrote: You could just drop the requirement that ring 0 code must be signed. I don't think Windows 8

Re: Read I/O starvation with writeback RAID controller

2013-02-22 Thread Chris Friesen
On 02/22/2013 02:35 PM, Jan Engelhardt wrote: On Friday 2013-02-22 20:28, Martin Svec wrote: Yes, I've already tried the ROW scheduler. It helped for some low iodepths depending on quantum settings but generally didn't solve the problem. I think the key issue is that none of the schedulers

Re: IPsec AH use of ahash

2013-01-21 Thread Chris Friesen
On 01/21/2013 09:31 AM, Tom St Denis wrote: - Original Message - From: "Steven Rostedt" To: "Tom St Denis" Cc: "David Dillow", "Borislav Petkov", linux-kernel@vger.kernel.org, net...@vger.kernel.org Sent: Monday, 21 January, 2013 10:28:33 AM Subject: Re: IPsec AH use of ahash When I

Re: IPsec AH use of ahash

2013-01-21 Thread Chris Friesen
On 01/21/2013 09:31 AM, Tom St Denis wrote: - Original Message - From: Steven Rostedt <rost...@goodmis.org> To: Tom St Denis <tstde...@elliptictech.com> Cc: David Dillow <d...@thedillows.org>, Borislav Petkov <b...@alien8.de>, linux-kernel@vger.kernel.org, net...@vger.kernel.org Sent: Monday, 21

Re: [sqlite] light weight write barriers

2012-11-15 Thread Chris Friesen
On 11/15/2012 11:06 AM, Ryan Johnson wrote: The easiest way to implement this fsync would involve three things: 1. Schedule writes for all dirty pages in the fs cache that belong to the affected file, wait for the device to report success, issue a cache flush to the device (or request ordering

Re: scsi target, likely GPL violation

2012-11-07 Thread Chris Friesen
On 11/07/2012 07:02 PM, Jon Mason wrote: I'm not a lawyer, nor do I play one on TV, but if I understand the GPL correctly, RTS only needs to provide the relevant source to their customers upon request. Not quite. Assuming the GPL applies, and that they have modified the code, then they must

Re: [RFC] Second attempt at kernel secure boot support

2012-11-06 Thread Chris Friesen
On 11/06/2012 01:56 AM, Florian Weimer wrote: Personally, I think the only way out of this mess is to teach users how to disable Secure Boot. If you're going to go that far, why not just get them to install a RedHat (or SuSE, or Ubuntu, or whoever) key and use that instead? Secure boot

Re: [RFC] Second attempt at kernel secure boot support

2012-11-05 Thread Chris Friesen
On 11/05/2012 09:31 AM, Jiri Kosina wrote: I had a naive idea of just putting in-kernel verification of a complete ELF binary passed to kernel by userspace, and if the signature matches, jumping to it. Would work for elf-x86_64 nicely I guess, but we'd lose a lot of other functionality

Re: Fwd: Nice processes prevent frequency increases - possible scheduler regression (known good in 2.6.35)

2012-11-05 Thread Chris Friesen
On 11/03/2012 09:40 AM, Michal Zatloukal wrote: On Sat, Nov 3, 2012 at 12:48 PM, Mike Galbraith wrote: On Sat, 2012-11-03 at 04:33 -0700, Mike Galbraith wrote: On Fri, 2012-11-02 at 21:09 +0100, Michal Zatloukal wrote: Your nice 19 tasks receiving 'too much' CPU when there are other

Re: Fwd: Nice processes prevent frequency increases - possible scheduler regression (known good in 2.6.35)

2012-11-05 Thread Chris Friesen
On 11/03/2012 09:40 AM, Michal Zatloukal wrote: On Sat, Nov 3, 2012 at 12:48 PM, Mike Galbraith <efa...@gmx.de> wrote: On Sat, 2012-11-03 at 04:33 -0700, Mike Galbraith wrote: On Fri, 2012-11-02 at 21:09 +0100, Michal Zatloukal wrote: Your nice 19 tasks receiving 'too much' CPU when there are

Re: [RFC] Second attempt at kernel secure boot support

2012-11-02 Thread Chris Friesen
On 11/02/2012 04:03 PM, Eric W. Biederman wrote: Matthew Garrett writes: On Fri, Nov 02, 2012 at 01:49:25AM -0700, Eric W. Biederman wrote: When the goal is to secure Linux I don't see how any of this helps. Windows 8 compromises are already available so if we turn most of these arguments

Re: [RFC] Second attempt at kernel secure boot support

2012-11-02 Thread Chris Friesen
On 11/02/2012 09:48 AM, Vivek Goyal wrote: On Thu, Nov 01, 2012 at 03:02:25PM -0600, Chris Friesen wrote: With secure boot enabled, then the kernel should refuse to let an unsigned kexec load new images, and kexec itself should refuse to load unsigned images. Yep, good in theory. Now

Re: [RFC] Second attempt at kernel secure boot support

2012-11-02 Thread Chris Friesen
On 11/02/2012 04:03 PM, Eric W. Biederman wrote: Matthew Garrett <mj...@srcf.ucam.org> writes: On Fri, Nov 02, 2012 at 01:49:25AM -0700, Eric W. Biederman wrote: When the goal is to secure Linux I don't see how any of this helps. Windows 8 compromises are already available so if we turn most

Re: [RFC] Second attempt at kernel secure boot support

2012-11-01 Thread Chris Friesen
On 11/01/2012 02:27 PM, Pavel Machek wrote: Could someone write down exact requirements for Linux kernel to be signed by Microsoft? Because thats apparently what you want, and I don't think crippling kexec/suspend is enough. As I understand it, the kernel won't be signed by Microsoft.

Re: [RFC] Second attempt at kernel secure boot support

2012-10-31 Thread Chris Friesen
On 10/31/2012 02:14 PM, Oliver Neukum wrote: On Wednesday 31 October 2012 17:39:19 Alan Cox wrote: On Wed, 31 Oct 2012 17:17:43 + Matthew Garrett wrote: On Wed, Oct 31, 2012 at 05:21:21PM +, Alan Cox wrote: On Wed, 31 Oct 2012 17:10:48 + Matthew Garrett wrote: The kernel is
