Commit-ID: 1ca4fa3ab604734e38e2a3000c9abf788512ffa7
Gitweb: https://git.kernel.org/tip/1ca4fa3ab604734e38e2a3000c9abf788512ffa7
Author: Hidetoshi Seto
AuthorDate: Tue, 29 Jan 2019 10:12:45 -0500
Committer: Ingo Molnar
CommitDate: Mon, 4 Feb 2019 09:13:21 +0100
sched/debug: Initialize
With the vtime* changes in the previous patches, account_idle_time()
is now called only from the tick-accounting code.
Introduce __account_idle_ticks() to do iowait accounting in ticks
properly. For this purpose, record jiffies at the end of iowait.
Not-Tested-by: Hidetoshi Seto
Get iowait's timestamp for accounting w/ VIRT_CPU_ACCOUNTING_GEN.
(currently arm is the only user of this?)
At the end of this series of changes, introduce a common function
vtime_iowait_exit() to replace all arch_record_iowait_exit().
Not-tested-by: Hidetoshi Seto
---
arch/ia64/include/asm/cput
Like s390 and ia64, ppc also has VIRT_CPU_ACCOUNTING.
Check "timestamp at end of iowait" for idle/iowait accounting.
Not-Tested-by: Hidetoshi Seto
---
arch/powerpc/include/asm/cputime.h |3 +++
arch/powerpc/kernel/time.c | 21 +
2 files changed, 20
With VIRT_CPU_ACCOUNTING, ia64 uses the "timestamp at end of iowait"
like s390.
Not-Tested-by: Hidetoshi Seto
---
arch/ia64/include/asm/cputime.h |2 +
arch/ia64/kernel/time.c | 43 ++-
2 files changed, 44 insertions(+), 1 deletion
s390_get_idle_time() gives us the duration from idle entry to now,
but it does not tell us how to divide it into idle and iowait.
Modify this function to return 2 values. To do this, s390's
cputime accounting also requires the timestamp at the end of iowait.
Not-Tested-by: Hidetoshi Seto
---
arch
The current account_idle_time() cannot process mixed cputime that
contains both idle cputime and iowait cputime.
So introduce a new account_idle_and_iowait() to do the paranoid work.
The following patches will add users of this new function.
Not-Tested-by: Hidetoshi Seto
---
kernel/sched/cputime.c
Now an observer cpu can refer to both the idle entry time and the iowait
exit time of an observed sleeping cpu, so the observer can get the
idle/iowait time of the sleeping cpu by calculating the cputime not yet
accounted.
Not-Tested-by: Hidetoshi Seto
---
include/linux/sched.h|1 +
kernel/sched/core.c | 27
ing.
Suggested-by: Peter Zijlstra
Not-Tested-by: Hidetoshi Seto
---
kernel/sched/core.c| 40
kernel/sched/cputime.c |2 +-
kernel/sched/sched.h |4 +++-
3 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/core.c b/
SMP iowait stats
https://www.kernel.org/pub/linux/kernel/people/wli/vm/iowait/iowait-2.5.45-6
Thanks,
H.Seto
---
Hidetoshi Seto (8):
cputime, sched: record last_iowait
cputime, nohz: handle last_iowait for nohz
cputime: introduce account_idle_and_iowait
cputime, s390:
.
update patch description, separate from following changes.
(patch body has not changed from 1/2 of v4)
v1-4: https://lkml.org/lkml/2014/4/17/120
Signed-off-by: Hidetoshi Seto
Reported-by: Fernando Luis Vazquez Cao
Reported-by: Tetsuo Handa
Cc: Frederic Weisbecker
Cc: Thomas Gleixner
Cc
(2014/05/21 12:19), Chen Yucong wrote:
> On Wed, 2014-05-21 at 11:43 +0900, Hidetoshi Seto wrote:
>> (2014/05/21 11:03), Chen Yucong wrote:
>>> On Wed, 2014-05-21 at 10:40 +0900, Hidetoshi Seto wrote:
>>>> (2014/05/20 11:11), Chen Yucong wrote:
>>>>>
(2014/05/21 11:03), Chen Yucong wrote:
> On Wed, 2014-05-21 at 10:40 +0900, Hidetoshi Seto wrote:
>> (2014/05/20 11:11), Chen Yucong wrote:
>>> mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as
>>> possible. So the
>>> clear operation
(2014/05/20 11:11), Chen Yucong wrote:
> mces_seen is a Per-CPU variable which should only be accessed by Per-CPU as
> possible. So the
> clear operation of mces_seen should also be local to Per-CPU rather than
> monarch CPU.
I don't think it should be local.
Originally what we want to have here
set, then (iowait_exittime - idle_entrytime)
> gets accounted as iowait, and the remaining (now - iowait_exittime)
> as "true" idle.
>
> Run-tested: /proc/stats no longer go backwards.
>
> Signed-off-by: Denys Vlasenko
> Cc: Frederic Weisbecker
> Cc: Hidetoshi Set
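The split described in the quoted patch above can be sketched in plain C. This is only an illustration under stated assumptions, not the kernel code: the names sleep_split and split_sleep_time are hypothetical, timestamps are plain unsigned integers, and clamping an out-of-range iowait exit time into the sleep window is an assumption made here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: how one sleep period is divided. */
struct sleep_split {
    uint64_t iowait; /* idle_entrytime .. iowait_exittime */
    uint64_t idle;   /* iowait_exittime .. now ("true" idle) */
};

/*
 * Divide the sleep period [idle_entrytime, now] at iowait_exittime.
 * Assumption: an exit time outside the window is clamped into it,
 * so the two parts always sum to (now - idle_entrytime).
 */
static struct sleep_split split_sleep_time(uint64_t idle_entrytime,
                                           uint64_t iowait_exittime,
                                           uint64_t now)
{
    struct sleep_split s;

    assert(idle_entrytime <= now);

    if (iowait_exittime < idle_entrytime)
        iowait_exittime = idle_entrytime; /* no iowait in this period */
    if (iowait_exittime > now)
        iowait_exittime = now;            /* still in iowait */

    s.iowait = iowait_exittime - idle_entrytime;
    s.idle = now - iowait_exittime;
    return s;
}
```

For example, with idle entry at 100, iowait exit at 130 and now at 150, the sketch reports 30 units of iowait and 20 units of idle.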
(2014/04/23 18:41), Peter Zijlstra wrote:
> On Wed, Apr 23, 2014 at 04:40:18PM +0900, Hidetoshi Seto wrote:
>> (2014/04/23 4:45), Peter Zijlstra wrote:
>>> On Thu, Apr 17, 2014 at 06:41:41PM +0900, Hidetoshi Seto wrote:
>>>> [TARGET OF THIS PATCH]:
>>>>
>
(2014/04/23 4:45), Peter Zijlstra wrote:
> On Thu, Apr 17, 2014 at 06:41:41PM +0900, Hidetoshi Seto wrote:
>> [TARGET OF THIS PATCH]:
>>
>> Complete rework for iowait accounting implies that some user
>> interfaces might be replaced completely. It will introduce
Ping?
(I'll be on holiday for a week starting next week,
so I'd appreciate it if you could give me your comments soon!)
Thanks,
H.Seto
(2014/04/17 18:35), Hidetoshi Seto wrote:
> Hi all,
>
> This patch set (rebased on v3.15-rc1) is my 4th try to fix an issue
> that idle/iowait of /proc/sta
(2014/04/17 19:05), Peter Zijlstra wrote:
> Anyway, if you want to preserve the same broken ass crap we had pre
> NOHZ, something like the below should do that.
>
> I'm not really thrilled with iowait_{start,stop}() but I think they
> should have the same general cost as the atomic ops we already
ents to explain more details
v3: use seqcount instead of seqlock
(achieved by inserting cleanup as former patch)
plus introduce delayed iowait accounting
v2: update comments and description about problem 2.
include fix for minor typo
Signed-off-by: Hidetoshi Seto
Reported-by: Fernando Luis Va
o other way to reach update_ts_time_stats(), fold
this static routine into tick_nohz_stop_idle().
(Problem 2 still remains. Continued in the following patch 2/2.)
Signed-off-by: Hidetoshi Seto
Reported-by: Fernando Luis Vazquez Cao
Reported-by: Tetsuo Handa
Cc: Frederic Weisbecker
Cc: Thomas Gleixner
ar (I hope so).
Of course, reviews are still welcome.
Thanks,
H.Seto
Hidetoshi Seto (2):
nohz: make updating sleep stats local
nohz: delayed iowait accounting for nohz idle time stats
include/linux/tick.h |6 +-
kernel/time/tick-sche
(2014/04/16 18:36), Peter Zijlstra wrote:
> On Wed, Apr 16, 2014 at 03:33:06PM +0900, Hidetoshi Seto wrote:
>> So we need 2 operations:
>> a) remove regression
>
> What regression; there's never been talk about a regression, just a bug
> found. AFAICT this
(2014/04/15 19:19), Peter Zijlstra wrote:
> On Thu, Apr 10, 2014 at 06:13:54PM +0900, Hidetoshi Seto wrote:
>> [WHAT THIS PATCH PROPOSED]:
>>
>> To fix problem 1, this patch adds seqcount for NO_HZ idle
>> accounting to avoid possible races between reader/writer.
>
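The reader/writer protection mentioned in the quoted text can be illustrated with a userspace sequence counter built on C11 atomics. This is a sketch under stated assumptions, not the kernel's seqcount API: idle_stats, stats_write and stats_read are hypothetical names, and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-cpu stats guarded by a sequence counter. */
struct idle_stats {
    atomic_uint seq;            /* odd while an update is in progress */
    _Atomic uint64_t idle_us;
    _Atomic uint64_t iowait_us;
};

/* Writer side (single writer assumed): bump seq around the update. */
static void stats_write(struct idle_stats *s, uint64_t idle, uint64_t iowait)
{
    unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    assert((seq & 1) == 0); /* no update may already be in progress */
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->idle_us, idle, memory_order_relaxed);
    atomic_store_explicit(&s->iowait_us, iowait, memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until a consistent snapshot has been read. */
static void stats_read(struct idle_stats *s, uint64_t *idle, uint64_t *iowait)
{
    unsigned int begin, end;

    do {
        begin = atomic_load_explicit(&s->seq, memory_order_acquire);
        *idle = atomic_load_explicit(&s->idle_us, memory_order_relaxed);
        *iowait = atomic_load_explicit(&s->iowait_us, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        end = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((begin & 1) || begin != end);
}
```

The reader never blocks the writer; it only repeats its read when it overlapped with an update, which is the property that makes a sequence counter attractive for a hot accounting path.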
(2014/04/15 19:04), Peter Zijlstra wrote:
> On Thu, Apr 10, 2014 at 06:13:54PM +0900, Hidetoshi Seto wrote:
>> This patch is v3 of patch set to fix an issue that idle/iowait
>> of /proc/stat can go backward. Originally reported by Tetsuo and
>> Fernando at last year, Mar 201
Ping?
(2014/04/10 18:07), Hidetoshi Seto wrote:
> Hi all,
>
> This patch set (rebased on v3.14) is my 3rd try to fix an issue
> that idle/iowait of /proc/stat can go backward. Originally reported
> by Tetsuo and Fernando at last year, Mar 2013.
>
> This v3 takes new approa
iowait accounting
v2: update comments and description about problem 2.
include fix for minor typo
Signed-off-by: Hidetoshi Seto
Reported-by: Fernando Luis Vazquez Cao
Reported-by: Tetsuo Handa
Cc: Frederic Weisbecker
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Andrew Mo
-off-by: Hidetoshi Seto
Cc: Fernando Luis Vazquez Cao
Cc: Tetsuo Handa
Cc: Frederic Weisbecker
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Andrew Morton
Cc: Arjan van de Ven
Cc: Oleg Nesterov
Cc: Preeti U Murthy
Cc: Denys Vlasenko
Cc:
---
include/linux/tick.h |4
still reviews are welcome!
Thanks,
H.Seto
Hidetoshi Seto (2):
nohz: stop updating sleep stats from get_cpu_{idle,iowait}_time_us()
nohz: use delayed iowait accounting to avoid race on idle time stats
include/linux/tick.h |6 ++-
kernel/time/tick-sched.c | 116
(2014/04/03 18:51), Denys Vlasenko wrote:
> On Thu, Apr 3, 2014 at 9:02 AM, Hidetoshi Seto
> wrote:
>>>> [PROBLEM 2]: broken iowait accounting.
>>>>
>>>> As historical nature, cpu's idle time was accounted as either
>>>> idle or iowait
(2014/04/03 4:35), Denys Vlasenko wrote:
> On Mon, Mar 31, 2014 at 4:08 AM, Hidetoshi Seto
> wrote:
>> There are 2 problems:
>>
>> [PROBLEM 1]: there is no exclusive control.
>>
>> It is easy to understand that there are 2 different cpu - an
>> observing
oducer and stressor for a day. The reproduction rate
differs from system to system, but in my case, running
"git gc" on a kernel source repository alongside the checker works fine.
Signed-off-by: Hidetoshi Seto
Reviewed-by: Preeti U Murthy
Reported-by: Fernando Luis Vazquez Cao
Reported-
eptime stats v2
https://lkml.org/lkml/2013/10/19/86
v2: update comments and description about problem 2.
include fix for minor typo
Signed-off-by: Hidetoshi Seto
Reviewed-by: Preeti U Murthy
Reported-by: Fernando Luis Vazquez Cao
Reported-by: Tetsuo Handa
Cc: Frederic Weisbecker
Cc: Thomas
Hi all,
This patch set (rebased on v3.14-rc8) will fix an issue that
idle/iowait of /proc/stat can go backward. Originally reported
by Tetsuo and Fernando last year, in Mar 2013.
v2 has Preeti's Reviewed-by (Thanks!).
Of course, reviews are still welcome.
Thanks,
H.Seto
Hidetoshi Se
(2014/03/24 16:45), Preeti Murthy wrote:
> Hi Hidetoshi,
>
> The patch looks good to me except the comments around the monotonicity
> of the return value of the idle stats observer. I am unable to relate them
> to the dependency on nr_iowait_cpu.
>
> I see that when the reader queries for the idl
racy sleeptime stats
https://lkml.org/lkml/2013/8/8/638
[PATCH RESEND 0/4] nohz: Fix racy sleeptime stats
https://lkml.org/lkml/2013/8/16/274
2nd patchset from Frederic:
[RFC PATCH 0/5] nohz: Fix racy sleeptime stats v2
https://lkml.org/lkml/2013/10/19/86
Signed-off-by: Hidet
oducer and stressor for a day. The reproduction rate
differs from system to system, but in my case, running
"git gc" on a kernel source repository alongside the checker works fine.
Thanks,
H.Seto
Signed-off-by: Hidetoshi Seto
Reported-by: Fernando Luis Vazquez Cao
Reported-by: Tetsuo Handa
Cc:
Hi all,
This patch set (based on v3.14-rc7) will fix an issue that
idle/iowait of /proc/stat can go backward. Originally reported
by Tetsuo and Fernando last year, in Mar 2013.
Reviews are welcome.
Thanks,
H.Seto
Hidetoshi Seto (2):
nohz: use seqlock to avoid race on idle time stats
Fix corporate name for copyright.
Signed-off-by: Hidetoshi Seto
---
include/linux/srcu.h |2 +-
kernel/rcu/srcu.c|2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 9b058ee..04f5abb 100644
--- a/include/linux/srcu.h
Fix corporate name for copyright.
Signed-off-by: Hidetoshi Seto
---
fs/btrfs/delayed-inode.c |2 +-
fs/btrfs/delayed-inode.h |2 +-
fs/btrfs/math.h |2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
Bob Picco wrote:
I will be pushing Peter Keilty's clocksource ia64 patches within the next
week or so. At that time I'll ask for inclusion into -mm. Please see:
http://marc.info/?t=11788158551&r=1&w=2
You'll notice that the time interpolator is removed.
I see. Your patch will replace the t
ay() call
0.24us / 1 gettimeofday() call with patch
- x3 process:
1.59us / 1 gettimeofday() call
1.11us / 1 gettimeofday() call with patch
- x4 process:
2.34us / 1 gettimeofday() call
1.29us / 1 gettimeofday() call with patch
I know that this patch could not help quite hug
Arjan van de Ven wrote:
>> It'd be nice if we could just teach the userspace balancer to not try to
>> move percpu IRQs?
>>
>> otoh, the patch is super-cheap. Arjan?
>
> I can fix irqbalance no problem, however I like the kernel approach as
> well, since it's not just irqbalance that moves irqs,
not sure what part of CPEI needs to be fixed, but I think that
returning an error on an attempt to move a PER_CPU irq is useful for all
applications, since the move will never work.
The following small patch takes the b) style.
It works: the warning disappeared and irqbalance still runs well.
Thanks,
H.Seto
Signed-off-by: Hide
Matthew Wilcox wrote:
On Thu, Sep 01, 2005 at 05:45:54PM -0500, Brent Casavant wrote:
I am extremely concerned about the performance implications of this
implementation. These changes have several deleterious effects on I/O
performance.
I agree. I think the iochk patches should be abandone
Oh my,
Hidetoshi Seto wrote:
I'd like to merge this part into 2.6.13-rc1 even if the latter half isn't
This is a typo; it should be 2.6.14-rc1. :-p
Thanks,
H.Seto
'd like to merge this part into 2.6.13-rc1 even if the latter half isn't
accepted. This former half functions without the latter, and helps
realize effective recovery from MCA.
Tony, could you apply this part to your tree?
Thanks,
H.Seto
Signed-off-by: Hidetoshi Seto <[EMAIL PROTE
pends on the situation.
Comments on this paranoia part are welcome.
Thanks,
H.Seto
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/kernel/mca.c | 21 +
arch/ia64/lib/iomap_check.c | 28 +++-
include/asm-ia64/io.h
Thank you for your comment, Brent.
Brent Casavant wrote:
On Thu, 1 Sep 2005, Hidetoshi Seto wrote:
static inline unsigned int
___ia64_inb (unsigned long port)
{
volatile unsigned char *addr = __ia64_mk_io_addr(port);
unsigned char ret;
+ unsigned long flags
This patch implements IOCHK interfaces that enable PCI drivers to
detect errors and make their error handling easier.
Please refer to the archives if needed, e.g. http://lwn.net/Articles/139240/
Thanks,
H.Seto
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
drivers/pci
This patch implements ia64-specific IOCHK interfaces that enable
PCI drivers to detect errors and make their error handling easier.
Please refer to the archives if needed, e.g. http://lwn.net/Articles/139240/
Thanks,
H.Seto
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/K
Linas Vepstas wrote:
On Wed, Jul 06, 2005 at 02:17:21PM +0900, Hidetoshi Seto was heard to remark:
Touching poisoned data becomes an MCA, so now it directly means
Several questions:
Is MCA an exception or fault of some sort, so at some point,
the kernel would catch a fault?
So when you
Linas Vepstas wrote:
On Wed, Jul 06, 2005 at 02:18:53PM +0900, Hidetoshi Seto was heard to remark:
+static int pci_error_recovery(peidx_table_t *peidx)
Minor comment:
Maybe a different name for this routine would be good;
this potentially conflicts with generic pci routines.
Good point
Linas Vepstas wrote:
Thus, one wouldn't want to perform an iochk_read() in this way unless
one was already pretty sure that an error had already occurred ...
If another kind of I/O error detecting system finds an error before
performing iochk_read(), it can prevent the coming iochk_read() from
spen
Benjamin Herrenschmidt wrote:
On Thu, 2005-07-07 at 11:41 -0700, Greg KH wrote:
How about the issue of tying this into the other pci error reporting
infrastructure that is being worked on?
The other infrastructure is for asynchronous reporting and recovery.
We still need synchronous detection
david mosberger wrote:
- could anyone write same barrier for intel compiler?
Tony or David, could you help me?
I think it might be best to make ia64_mca_barrier() a proper
subroutine written in assembly code. Yes, that costs some time, but
we're talking about wasting 1,000+ cycles just t
YOSHIFUJI Hideaki wrote:
Index: linux-2.6.13-rc1/lib/iomap.c
===
--- linux-2.6.13-rc1.orig/lib/iomap.c
+++ linux-2.6.13-rc1/lib/iomap.c
@@ -230,3 +230,9 @@ void pci_iounmap(struct pci_dev *dev, vo
}
EXPORT_SYMBOL(pci_iomap);
EXPOR
TE_INFO is required.
To realize this, I changed the control lock from a spinlock to an rwlock.
There may be a better way; if so, this part should be
replaced.
Changes from previous one for 2.6.11.11:
- (none)
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/kernel/mca.c |6 +
ng.
CPE:
1) OS gets control
2) OS sends a request to SAL
3) SAL gathers data and returns it to the OS
Therefore, we can make the CPE handler take care of bridge states,
checking the states before calling the SAL procedure.
Changes from previous one for 2.6.11.11:
- (none)
Signed-off-by: Hidetoshi Seto
uture, if possible.
Changes from previous one for 2.6.11.11:
- (none)
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/kernel/mca_drv.c | 84
arch/ia64/lib/iomap_check.c |1
2 files changed, 85
hanges from previous one for 2.6.11.11:
- (none)
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/kernel/mca.c | 13 +
arch/ia64/lib/iomap_check.c |7 ++-
2 files changed, 19 insertions(+), 1 delet
lls us "where it happens",
we can recover it...? All right, let's see next (8 of 10).
Changes from previous one for 2.6.11.11:
- move the barrier function macro into gcc_intrin.h.
- could anyone write the same barrier for the Intel compiler?
Tony or David, could you help me?
next (6 of 10).
Changes from previous one for 2.6.11.11:
- (none)
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/lib/iomap_check.c | 45
1 files changed, 45 insertions(+)
Index: linux-2.6.13-
rectly, we need to
check both ends of the bus: the device and its host bridge.
OK, but often bridges are shared by multiple devices, right?
So we need to handle it with care... Yes, see next (5 of 10).
Changes from previous one for 2.6.11.11:
- trivial coding style fix.
Signed-off-by: Hidetoshi Seto
er. After removing iocookie from list, return
the result.
This is too simple. We need more code... See next (4 of 10).
Changes from previous one for 2.6.11.11:
- trivial coding style fix.
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/lib/iomap_check.c | 55 ++
rom previous one for 2.6.11.11:
- simplify the definition of the iocookie structure.
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
arch/ia64/Kconfig | 13 +
arch/ia64/lib/Makefile |1 +
arch/ia64/lib/iomap_check.c | 30 ++
include/
ind using EXPORT_SYMBOL_GPL but keep them as
before. Does anyone worry about this?
Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>
---
drivers/pci/pci.c |2 ++
include/asm-generic/iomap.h | 32
lib/iomap.c |6 +
Thanks for all comments!
OK, I'd like to sort out our situation:
$ Here are 2 features:
- iochk_clear/read() interface for error "detection"
by Seto ... me :-)
- callback, thread, and event notification for error "recovery"
by Linas ... expert in PPC64
$ What will "dete
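The "detection" half listed above (clear the checker, do the I/O, then read the checker) can be sketched with a small userspace mock. Everything here is an assumption for illustration: fake_dev, fake_readl, iocookie_sketch and the *_sketch functions are hypothetical stand-ins and are not the signatures from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct iocookie_sketch;

/* Hypothetical device: a broken one fails every read. */
struct fake_dev {
    bool broken;
    unsigned int value;
    struct iocookie_sketch *watcher; /* checker armed for this device */
};

/* Hypothetical cookie: remembers whether an error was seen. */
struct iocookie_sketch {
    struct fake_dev *dev;
    bool error;
};

/* Arm the checker: this clears the checker state, not the device. */
static void iochk_clear_sketch(struct iocookie_sketch *c, struct fake_dev *dev)
{
    c->dev = dev;
    c->error = false;
    dev->watcher = c;
}

/* Disarm the checker and report whether an error was recorded. */
static int iochk_read_sketch(struct iocookie_sketch *c)
{
    c->dev->watcher = NULL;
    return c->error ? 1 : 0;
}

/* Mock register read: a failure marks the armed cookie, if any. */
static unsigned int fake_readl(struct fake_dev *dev)
{
    if (dev->broken) {
        if (dev->watcher)
            dev->watcher->error = true;
        return 0xffffffffu;
    }
    return dev->value;
}

/* Driver-side pattern: returns 0 on success, -1 if an error was detected. */
static int read_reg_checked(struct fake_dev *dev, unsigned int *out)
{
    struct iocookie_sketch cookie;

    assert(dev != NULL && out != NULL);
    iochk_clear_sketch(&cookie, dev);
    *out = fake_readl(dev);
    return iochk_read_sketch(&cookie) ? -1 : 0;
}
```

The point of the pattern is that the driver learns about the failure synchronously, at the place where it did the I/O, and can pick its own error path.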
Linas Vepstas wrote:
Below is some "pseudocode" version (mentally substitute
"pci error event" for every occurrence of "eeh"). It's got some
ppc64-specific crud in there that we have to fix to make it
truly generic (I just cut and pasted from current code).
Would a cleaned up version of this code
Linas Vepstas wrote:
If their defaults are no-ops, device
maintainers who develop their drivers on an arch where it is not
implemented should be more careful.
Why? People who write device drivers already know if/when they need to
disable interrupts, and so they already disable if they need it.
OK, I'll remake
Linas Vepstas wrote:
>> I'd prefer to see it as ioerr_clear(), ioerr_read() ...
>
> I'd prefer pci_io_start() and pci_io_check_err()
>
> The names should have "pci" in them.
>
> I don't like "ioerr_clear" because it implies we are clearing the io error; we are not; we are clearing the checker
for
Jesse Barnes wrote:
This was my thought too last time we had this discussion. A completely
asynchronous call is probably needed in addition to Hidetoshi's proposed API,
since as you point out, the driver may not be running when an error occurs
(e.g. in the case of a DMA error or more general bu
Matthew Wilcox wrote:
I think what Jeff meant was "this new API handles none of this".
And that's true, it doesn't handle DMA errors. But I think that's just
something that hasn't been written/designed yet.
Yes, this API just supports drivers wanting to be more RAS-aware.
It would be happy if how
Hi, long time no see :-)
Currently, I/O errors are not a leading cause of system failure.
However, since Linux nowadays is making great progress in
scalability, and an ever larger number of PCI devices are being
connected to a single high-performance server, the risk of
I/O errors is increasing d
Hi, Ben.
How kind of you to remember.
Benjamin Herrenschmidt wrote:
I was reading the list archives for the discussion back in September
about PCI error reporting. Has there been any further progress on this
since then ?
Now I have a rewrite of the previous "clear/read_pci_errors" patch.
The new on