Re: [GIT PULL 0/5] perf/core improvements and fixes
* Arnaldo Carvalho de Melo wrote:

> Hi Ingo,
>
> 	Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit 691286b5561aab2e1b00119bc328598c01250548:
>
>   kprobes/x86: Remove stale ARCH_SUPPORTS_KPROBES_ON_FTRACE define (2014-10-17 07:18:34 +0200)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
>
> for you to fetch changes up to e8564b710c6df2c3aeb56c507c22f4bcfa4c0b2d:
>
>   perf script: Add period as a default output column (2014-10-17 15:22:19 -0300)
>
> perf/core improvements and fixes:
>
> User visible:
>
> o Add period data column and make it default in 'perf script' (Jiri Olsa)
>
> Infrastructure:
>
> * Move exit stuff from perf_evsel__delete to perf_evsel__exit, delete
>   should be just a front end for exit + free (Arnaldo Carvalho de Melo)
>
> * Add missing 'struct option' forward declaration (Arnaldo Carvalho de Melo)
>
> * No need to drag util/cgroup.h into evsel.h (Arnaldo Carvalho de Melo)
>
> Signed-off-by: Arnaldo Carvalho de Melo
>
> Arnaldo Carvalho de Melo (3):
>       perf evsel: Move exit stuff from __delete to __exit
>       perf evlist: Add missing 'struct option' forward declaration
>       perf evsel: No need to drag util/cgroup.h
>
> Jiri Olsa (2):
>       perf script: Add period data column
>       perf script: Add period as a default output column
>
>  tools/perf/Documentation/perf-script.txt |  2 +-
>  tools/perf/builtin-record.c              |  1 +
>  tools/perf/builtin-script.c              | 21 +
>  tools/perf/builtin-stat.c                |  1 +
>  tools/perf/util/evlist.h                 |  2 ++
>  tools/perf/util/evsel.c                  | 11 ++-
>  tools/perf/util/evsel.h                  |  3 ++-
>  7 files changed, 30 insertions(+), 11 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[tip:perf/urgent] perf evsel: Move exit stuff from __delete to __exit
Commit-ID:  597e48c138632d1f55409dcfa5bee4e1152e7d4f
Gitweb:     http://git.kernel.org/tip/597e48c138632d1f55409dcfa5bee4e1152e7d4f
Author:     Arnaldo Carvalho de Melo
AuthorDate: Thu, 16 Oct 2014 13:25:01 -0300
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 17 Oct 2014 11:14:15 -0300

perf evsel: Move exit stuff from __delete to __exit

So that when an evsel is embedded into other struct it can free up
resources calling perf_evsel__exit().

Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jean Pihet
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-n1w68pfe9m2vkhm4sqs8y...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/evsel.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d1ecde0..786ea55 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -850,17 +850,17 @@ void perf_evsel__exit(struct perf_evsel *evsel)
 	assert(list_empty(&evsel->node));
 	perf_evsel__free_fd(evsel);
 	perf_evsel__free_id(evsel);
+	close_cgroup(evsel->cgrp);
+	zfree(&evsel->group_name);
+	if (evsel->tp_format)
+		pevent_free_format(evsel->tp_format);
+	zfree(&evsel->name);
 	perf_evsel__object.fini(evsel);
 }

 void perf_evsel__delete(struct perf_evsel *evsel)
 {
 	perf_evsel__exit(evsel);
-	close_cgroup(evsel->cgrp);
-	zfree(&evsel->group_name);
-	if (evsel->tp_format)
-		pevent_free_format(evsel->tp_format);
-	zfree(&evsel->name);
 	free(evsel);
 }
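The split this patch makes between __exit and __delete can be sketched in plain userspace C. This is only an illustration of the pattern, not perf code: struct sel, struct wrapper and every sel_* helper below are hypothetical names. The point is that the exit routine releases what the object owns, while the delete routine is just exit plus free(), so an object embedded in another struct can still be cleaned up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an event selector that owns heap resources. */
struct sel {
	char *name;
	char *group_name;
};

/* Duplicate a string with malloc(); returns NULL on allocation failure. */
static char *dup_str(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Set up an object whose storage is owned by the caller. */
static int sel_init(struct sel *s, const char *name)
{
	assert(s && name);
	s->group_name = NULL;
	s->name = dup_str(name);
	return s->name ? 0 : -1;
}

/* Release everything the object owns, but not the object itself. */
static void sel_exit(struct sel *s)
{
	free(s->group_name);
	s->group_name = NULL;
	free(s->name);
	s->name = NULL;
}

/* Heap constructor: allocation plus sel_init(). */
static struct sel *sel_new(const char *name)
{
	struct sel *s = malloc(sizeof(*s));

	if (s && sel_init(s, name) < 0) {
		free(s);
		s = NULL;
	}
	return s;
}

/* Heap destructor: just a front end for sel_exit() + free(). */
static void sel_delete(struct sel *s)
{
	if (!s)
		return;
	sel_exit(s);
	free(s);
}

/* An embedding struct: 'sel' is not a separate allocation, so only
 * sel_exit() may be used on it, never sel_delete(). */
struct wrapper {
	int id;
	struct sel sel;
};

static void wrapper_exit(struct wrapper *w)
{
	sel_exit(&w->sel);
}
```

With this layout a caller that embeds struct sel calls sel_init()/wrapper_exit(), and a caller that wants a standalone object calls sel_new()/sel_delete().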
[tip:perf/urgent] perf script: Add period data column
Commit-ID:  535aeaae7de821ba5d43ee2a204ee667ca95aae4
Gitweb:     http://git.kernel.org/tip/535aeaae7de821ba5d43ee2a204ee667ca95aae4
Author:     Jiri Olsa
AuthorDate: Mon, 25 Aug 2014 16:45:42 +0200
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 17 Oct 2014 15:21:30 -0300

perf script: Add period data column

Adding period data column to be displayed in perf script.

It's possible to get period values using -f option, like:

  $ perf script -f comm,tid,time,period,ip,sym,dso
  :26019 26019 52414.329088:    3707 8105443a native_write_msr_safe ([kernel.kallsyms])
  :26019 26019 52414.329088:      44 8105443a native_write_msr_safe ([kernel.kallsyms])
  :26019 26019 52414.329093:    1987 8105443a native_write_msr_safe ([kernel.kallsyms])
  :26019 26019 52414.329093:       6 8105443a native_write_msr_safe ([kernel.kallsyms])
      ls 26019 52414.329442:  537558 3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
      ls 26019 52414.329442:    2099 3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
      ls 26019 52414.330181: 1242100 34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
      ls 26019 52414.330181:    3774 34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
      ls 26019 52414.331427: 1083662 810c7dc2 update_curr ([kernel.kallsyms])
      ls 26019 52414.331427:     360 810c7dc2 update_curr ([kernel.kallsyms])

Signed-off-by: Jiri Olsa
Acked-by: David Ahern
Cc: "Jen-Cheng(Tommy) Huang"
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jen-Cheng(Tommy) Huang
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1408977943-16594-9-git-send-email-jo...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/builtin-script.c              | 12 +++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 5a0160d..2149480 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -115,7 +115,7 @@ OPTIONS
 -f::
 --fields::
         Comma separated list of fields to print. Options are:
-        comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, srcline.
+        comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, srcline, period.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -f sw:comm,tid,time,ip,sym and -f trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 6b4925f..0659dff 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -44,6 +44,7 @@ enum perf_output_field {
 	PERF_OUTPUT_ADDR            = 1U << 10,
 	PERF_OUTPUT_SYMOFFSET       = 1U << 11,
 	PERF_OUTPUT_SRCLINE         = 1U << 12,
+	PERF_OUTPUT_PERIOD          = 1U << 13,
 };

 struct output_option {
@@ -63,6 +64,7 @@ struct output_option {
 	{.str = "addr", .field = PERF_OUTPUT_ADDR},
 	{.str = "symoff", .field = PERF_OUTPUT_SYMOFFSET},
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
+	{.str = "period", .field = PERF_OUTPUT_PERIOD},
 };

 /* default set to maintain compatibility with current format */
@@ -229,6 +231,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_CPU))
 		return -EINVAL;

+	if (PRINT_FIELD(PERIOD) &&
+		perf_evsel__check_stype(evsel, PERF_SAMPLE_PERIOD, "PERIOD",
+					PERF_OUTPUT_PERIOD))
+		return -EINVAL;
+
 	return 0;
 }

@@ -448,6 +455,9 @@ static void process_event(union perf_event *event, struct perf_sample *sample,

 	print_sample_start(sample, thread, evsel);

+	if (PRINT_FIELD(PERIOD))
+		printf("%10" PRIu64 " ", sample->period);
+
 	if (PRINT_FIELD(EVNAME)) {
 		const char *evname = perf_evsel__name(evsel);
 		printf("%s: ", evname ? evname : "[unknown]");
@@ -1543,7 +1553,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff", parse_output_fields),
+		     "addr,symoff,period", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING
[tip:perf/urgent] perf evlist: Add missing 'struct option' forward declaration
Commit-ID:  724ce97e9f8616ffb62b940f3726685c6f31f9b9
Gitweb:     http://git.kernel.org/tip/724ce97e9f8616ffb62b940f3726685c6f31f9b9
Author:     Arnaldo Carvalho de Melo
AuthorDate: Fri, 17 Oct 2014 12:16:00 -0300
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 17 Oct 2014 12:16:00 -0300

perf evlist: Add missing 'struct option' forward declaration

It was being found, by chance, because evsel.h needlessly includes
util/cgroup.h, which will be sorted out in a following patch.

Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jean Pihet
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-xsvxr747wkkpg1ay9dram...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/evlist.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bd312b0..649b0c5 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -117,6 +117,8 @@ int perf_evlist__prepare_workload(struct perf_evlist *evlist,
 						  void *ucontext));
 int perf_evlist__start_workload(struct perf_evlist *evlist);

+struct option;
+
 int perf_evlist__parse_mmap_pages(const struct option *opt,
 				  const char *str,
 				  int unset);
[tip:perf/urgent] perf evsel: No need to drag util/cgroup.h
Commit-ID:  f14d570785e6760284a9849f9bafd0a9825a1a25
Gitweb:     http://git.kernel.org/tip/f14d570785e6760284a9849f9bafd0a9825a1a25
Author:     Arnaldo Carvalho de Melo
AuthorDate: Fri, 17 Oct 2014 12:17:40 -0300
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 17 Oct 2014 12:17:40 -0300

perf evsel: No need to drag util/cgroup.h

The only thing we need is a forward declaration for 'struct cgroup_sel',
that is inside 'struct perf_evsel'.

Include cgroup.h instead on the tools that support cgroups.

Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jean Pihet
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-b7kuymbgf0zxi5viyjjtu...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-record.c | 1 +
 tools/perf/builtin-stat.c   | 1 +
 tools/perf/util/evsel.c     | 1 +
 tools/perf/util/evsel.h     | 3 ++-
 4 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a6b2132..2583a9b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -15,6 +15,7 @@
 #include "util/parse-events.h"
 #include "util/callchain.h"
+#include "util/cgroup.h"
 #include "util/header.h"
 #include "util/event.h"
 #include "util/evlist.h"
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b22c62f..055ce92 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -43,6 +43,7 @@
 #include "perf.h"
 #include "builtin.h"
+#include "util/cgroup.h"
 #include "util/util.h"
 #include "util/parse-options.h"
 #include "util/parse-events.h"
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 786ea55..2f9e680 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -16,6 +16,7 @@
 #include
 #include "asm/bug.h"
 #include "callchain.h"
+#include "cgroup.h"
 #include "evsel.h"
 #include "evlist.h"
 #include "util.h"
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 1d5c754..163c560 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -7,7 +7,6 @@
 #include
 #include
 #include "xyarray.h"
-#include "cgroup.h"
 #include "symbol.h"

 struct perf_counts_values {
@@ -42,6 +41,8 @@ struct perf_sample_id {
 	u64 period;
 };

+struct cgroup_sel;
+
 /** struct perf_evsel - event selector
  *
  * @name - Can be set to retain the original event name passed by the user,
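Both header patches rest on the same C rule: a forward declaration is enough wherever a struct is used only through a pointer. A minimal single-file sketch follows; evsel_like, parse_pages and the field layouts are made up for illustration and are not the perf definitions.

```c
#include <assert.h>
#include <stddef.h>

/* "Header" side: incomplete types are enough for pointers and prototypes. */
struct option;
struct cgroup_sel;

struct evsel_like {
	const char *name;
	struct cgroup_sel *cgrp;	/* pointer to an incomplete type is fine */
};

int parse_pages(const struct option *opt, const char *str);

/* ".c file" side: full definitions are needed only where members are
 * accessed. Both layouts below are hypothetical. */
struct option {
	const char *long_name;
	int *value;
};

struct cgroup_sel {
	const char *name;
	int refcnt;
};

/* Parse a decimal number from str and store it through opt->value. */
int parse_pages(const struct option *opt, const char *str)
{
	int n = 0;

	assert(opt && opt->value && str);
	while (*str >= '0' && *str <= '9')
		n = n * 10 + (*str++ - '0');
	*opt->value = n;
	return 0;
}
```

Only the translation units that dereference the pointer need the full header, which is why the patch moves the cgroup.h include into builtin-record.c, builtin-stat.c and evsel.c.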
[tip:perf/urgent] perf script: Add period as a default output column
Commit-ID:  e8564b710c6df2c3aeb56c507c22f4bcfa4c0b2d
Gitweb:     http://git.kernel.org/tip/e8564b710c6df2c3aeb56c507c22f4bcfa4c0b2d
Author:     Jiri Olsa
AuthorDate: Mon, 25 Aug 2014 16:45:43 +0200
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 17 Oct 2014 15:22:19 -0300

perf script: Add period as a default output column

Adding period as a default output column in script command
for hardware, software and raw events.

If PERF_SAMPLE_PERIOD sample type is defined in perf.data,
following will be displayed in perf script output:

  $ perf script
  ls 8034 57477.887209: 25 task-clock: 81361d72 memset ([kernel.kallsyms])
  ls 8034 57477.887464: 25 task-clock: 816f6d92 _raw_spin_unlock_irqrestore ([kernel.kallsyms])
  ls 8034 57477.887708: 25 task-clock: 811a94f0 do_munmap ([kernel.kallsyms])
  ls 8034 57477.887959: 25 task-clock: 34080916c6 get_next_seq (/usr/lib64/libc-2.17.so)
  ls 8034 57477.888208: 25 task-clock: 3408079230 _IO_doallocbuf (/usr/lib64/libc-2.17.so)
  ls 8034 57477.888717: 25 task-clock: 814242c8 n_tty_write ([kernel.kallsyms])
  ls 8034 57477.889285: 25 task-clock: 3408076402 fwrite_unlocked (/usr/lib64/libc-2.17.so)

Signed-off-by: Jiri Olsa
Cc: David Ahern
Cc: "Jen-Cheng(Tommy) Huang"
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jen-Cheng(Tommy) Huang
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1408977943-16594-10-git-send-email-jo...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-script.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 0659dff..9708a12 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -82,7 +82,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,

 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
@@ -93,7 +94,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,

 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
@@ -112,7 +114,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,

 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
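The two period patches follow a common pattern: one bit per output field, a name table for -f parsing, and a default mask per event type. A reduced userspace sketch of that pattern is below. The OUT_* constants, out_options table, parse_fields() and format_sample() are hypothetical and far smaller than what builtin-script.c has; only the "%10" width for the period column is taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output fields, one bit each. */
enum out_field {
	OUT_COMM   = 1U << 0,
	OUT_TID    = 1U << 1,
	OUT_TIME   = 1U << 2,
	OUT_PERIOD = 1U << 3,
};

/* Name table used when parsing a comma separated field list. */
static const struct {
	const char *str;
	unsigned int field;
} out_options[] = {
	{ "comm",   OUT_COMM },
	{ "tid",    OUT_TID },
	{ "time",   OUT_TIME },
	{ "period", OUT_PERIOD },
};

/* Parse "comm,tid,period" into a bitmask.
 * Returns 0 on success, -1 on an unknown or empty field name. */
static int parse_fields(const char *list, unsigned int *mask)
{
	size_t n = sizeof(out_options) / sizeof(out_options[0]);
	unsigned int m = 0;

	assert(list && mask);
	while (*list) {
		size_t len = strcspn(list, ",");
		size_t i;

		for (i = 0; i < n; i++) {
			if (strlen(out_options[i].str) == len &&
			    strncmp(out_options[i].str, list, len) == 0)
				break;
		}
		if (i == n)
			return -1;
		m |= out_options[i].field;
		list += len;
		if (*list == ',')
			list++;
	}
	*mask = m;
	return 0;
}

/* Format one sample into buf, emitting only the fields selected in mask.
 * Returns the number of characters that the full line needs. */
static size_t format_sample(char *buf, size_t size, unsigned int mask,
			    const char *comm, int tid,
			    unsigned long long period)
{
	size_t off = 0;

	assert(buf && size > 0);
	buf[0] = '\0';
	if ((mask & OUT_COMM) && off < size)
		off += (size_t)snprintf(buf + off, size - off, "%s ", comm);
	if ((mask & OUT_TID) && off < size)
		off += (size_t)snprintf(buf + off, size - off, "%d ", tid);
	if ((mask & OUT_PERIOD) && off < size)
		off += (size_t)snprintf(buf + off, size - off, "%10llu ", period);
	return off;
}
```

Making a field default, as the second patch does, then amounts to OR-ing its bit into the default mask for each event type.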
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
(fixes Davidlohr bounce)

On Sat, 2014-10-18 at 08:54 +0200, Mike Galbraith wrote:
> On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote:
> > Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> > nothing to wake up) changes the futex code to avoid taking a lock when
> > there are no waiters. This code has been subsequently fixed in commit
> > 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> > Both the original commit and the fix-up rely on get_futex_key_refs() to
> > always imply a barrier.
> >
> > However, for private futexes, none of the cases in the switch statement
> > of get_futex_key_refs() would be hit and the function completes without
> > a memory barrier as required before checking the "waiters" in
> > futex_wake() -> hb_waiters_pending(). The consequence is a race with a
> > thread waiting on a futex on another CPU, allowing the waker thread to
> > read "waiters == 0" while the waiter thread to have read "futex_val ==
> > locked" (in kernel).
> >
> > Without this fix, the problem (user space deadlocks) can be seen with
> > Android bionic's mutex implementation on an arm64 multi-cluster system.
>
> How 'bout that, you just triggered my "watch this pot" alarm.
>
> https://lkml.org/lkml/2014/10/8/406
>
> The hang I encountered with stockfish only ever happened on one specific
> box. Linus/Thomas said it was likely a problem with the futex usage,
> but it was suspiciously deterministic, so I put this on the "watch out for
> further evidence" back burner.
>
> The barrier fixing up my problematic box smells a lot like evidence.
>
> > Signed-off-by: Catalin Marinas
> > Reported-by: Matteo Franchin
> > Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
> > Cc:
> > Cc: Davidlohr Bueso
> > Cc: Linus Torvalds
> > Cc: Darren Hart
> > Cc: Thomas Gleixner
> > Cc: Peter Zijlstra
> > Cc: Ingo Molnar
> > Cc: Paul E. McKenney
> > ---
> >  kernel/futex.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/futex.c b/kernel/futex.c
> > index 815d7af2ffe8..f3a3a071283c 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
> >  	case FUT_OFF_MMSHARED:
> >  		futex_get_mm(key); /* implies MB (B) */
> >  		break;
> > +	default:
> > +		smp_mb(); /* explicit MB (B) */
> >  	}
> >  }
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote:
> Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> nothing to wake up) changes the futex code to avoid taking a lock when
> there are no waiters. This code has been subsequently fixed in commit
> 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> Both the original commit and the fix-up rely on get_futex_key_refs() to
> always imply a barrier.
>
> However, for private futexes, none of the cases in the switch statement
> of get_futex_key_refs() would be hit and the function completes without
> a memory barrier as required before checking the "waiters" in
> futex_wake() -> hb_waiters_pending().

Good catch, glad I ran into this thread (my email recently changed).
Private process futexes (PTHREAD_PROCESS_PRIVATE) hold no reference on
an inode or mm, so they need the explicit barrier in those cases.

> The consequence is a race with a
> thread waiting on a futex on another CPU, allowing the waker thread to
> read "waiters == 0" while the waiter thread to have read "futex_val ==
> locked" (in kernel).

Yeah, missing wakeups are a strong sign of a problem on the
hb_waiters_pending() side.

> Without this fix, the problem (user space deadlocks) can be seen with
> Android bionic's mutex implementation on an arm64 multi-cluster system.
> Signed-off-by: Catalin Marinas
> Reported-by: Matteo Franchin
> Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
> Cc:
> Cc: Davidlohr Bueso
> Cc: Linus Torvalds
> Cc: Darren Hart
> Cc: Thomas Gleixner
> Cc: Peter Zijlstra
> Cc: Ingo Molnar
> Cc: Paul E. McKenney
> ---
>  kernel/futex.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 815d7af2ffe8..f3a3a071283c 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
>  	case FUT_OFF_MMSHARED:
>  		futex_get_mm(key); /* implies MB (B) */
>  		break;
> +	default:
> +		smp_mb(); /* explicit MB (B) */
>  	}

Should we comment that this default is for the private futex case?
Otherwise:

Acked-by: Davidlohr Bueso
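The race Catalin describes is the classic store-buffering shape: each side writes one variable and then reads the other, and without a full barrier on both sides each can read the stale value. A userspace sketch with C11 atomics is below. It is an analogy under stated assumptions, not the kernel code: struct mini_futex and both helpers are hypothetical, atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(), and the store to the futex word is folded into the waker helper although in reality user space performs it before calling futex_wake().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical miniature of a hash bucket plus the futex word. */
struct mini_futex {
	atomic_int waiters;	/* stands in for the hb waiter count */
	atomic_int val;		/* stands in for the futex word; 1 == locked */
};

/* Waiter side: announce ourselves, full barrier (A), then read the word.
 * Returns nonzero if the word is still locked, i.e. the waiter would sleep. */
static int waiter_should_sleep(struct mini_futex *f)
{
	assert(f);
	atomic_fetch_add_explicit(&f->waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* MB (A) */
	return atomic_load_explicit(&f->val, memory_order_relaxed) == 1;
}

/* Waker side: release the word, full barrier (B), then check for waiters.
 * Returns nonzero if a waiter was observed, i.e. a wakeup would be issued. */
static int waker_sees_waiters(struct mini_futex *f)
{
	assert(f);
	atomic_store_explicit(&f->val, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* MB (B) */
	return atomic_load_explicit(&f->waiters, memory_order_relaxed) > 0;
}
```

With both fences in place, the outcome "waiter saw locked and waker saw no waiters" is excluded when the two helpers run concurrently. Dropping fence (B) is the userspace analogue of the private futex path before this patch.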
Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
18.10.2014, 01:40, "Oleg Nesterov" :
> The lockless get_task_struct(tsk) is only safe if tsk == current
> and didn't pass exit_notify(), or if this tsk was found on a rcu
> protected list (say, for_each_process() or find_task_by_vpid()).
> IOW, it is only safe if release_task() was not called before we
> take rcu_read_lock(), in this case we can rely on the fact that
> delayed_put_pid() can not drop the (potentially) last reference
> until rcu_read_unlock().
>
> And as Kirill pointed out task_numa_compare()->task_numa_assign()
> path does get_task_struct(dst_rq->curr) and this is not safe. The
> task_struct itself can't go away, but rcu_read_lock() can't save
> us from the final put_task_struct() in finish_task_switch(); this
> reference goes away without rcu gp.
>
> Reported-by: Kirill Tkhai
> Signed-off-by: Oleg Nesterov
> ---
>  kernel/sched/fair.c | 8 +++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0090e8c..52049b9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1158,7 +1158,13 @@ static void task_numa_compare(struct task_numa_env *env,
>
>  	rcu_read_lock();
>  	cur = ACCESS_ONCE(dst_rq->curr);
> -	if (cur->pid == 0) /* idle */
> +	/*
> +	 * No need to move the exiting task, and this ensures that ->curr
> +	 * wasn't reaped and thus get_task_struct() in task_numa_assign()
> +	 * is safe; note that rcu_read_lock() can't protect from the final
> +	 * put_task_struct() after the last schedule().
> +	 */
> +	if (is_idle_task(cur) || (cur->flags & PF_EXITING))
>  		cur = NULL;
>
>  	/*

Oleg, I've looked once again, and now it doesn't look good to me.
Where is the guarantee that this memory hasn't been allocated again?
If it has, the PF_EXITING we test does not belong to the task we are
interested in; the memory may not even be a task anymore.

rcu_read_lock()                    ...                  ...
cur = ACCESS_ONCE(dst_rq->curr);   ...                  ...
                                   rq->curr = next;     ...
                                   put_prev_task()      ...
                                   __put_prev_task      ...
                                   kmem_cache_free()    ...
                                   ...                  <allocated again>
                                   ...                  memset(, 0, )
                                   ...                  ...
if (cur->flags & PF_EXITING)       ...                  ...
                                   ...                  ...
get_task_struct()                  ...                  ...

Kirill
Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
18.10.2014, 12:15, "Kirill Tkhai" :
> 18.10.2014, 01:40, "Oleg Nesterov" :
>> The lockless get_task_struct(tsk) is only safe if tsk == current
>> and didn't pass exit_notify(), or if this tsk was found on a rcu
>> protected list (say, for_each_process() or find_task_by_vpid()).
>> IOW, it is only safe if release_task() was not called before we
>> take rcu_read_lock(), in this case we can rely on the fact that
>> delayed_put_pid() can not drop the (potentially) last reference
>> until rcu_read_unlock().
>>
>> And as Kirill pointed out task_numa_compare()->task_numa_assign()
>> path does get_task_struct(dst_rq->curr) and this is not safe. The
>> task_struct itself can't go away, but rcu_read_lock() can't save
>> us from the final put_task_struct() in finish_task_switch(); this
>> reference goes away without rcu gp.
>>
>> Reported-by: Kirill Tkhai
>> Signed-off-by: Oleg Nesterov
>> ---
>>  kernel/sched/fair.c | 8 +++-
>>  1 files changed, 7 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0090e8c..52049b9 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -1158,7 +1158,13 @@ static void task_numa_compare(struct task_numa_env *env,
>>
>>  	rcu_read_lock();
>>  	cur = ACCESS_ONCE(dst_rq->curr);
>> -	if (cur->pid == 0) /* idle */
>> +	/*
>> +	 * No need to move the exiting task, and this ensures that ->curr
>> +	 * wasn't reaped and thus get_task_struct() in task_numa_assign()
>> +	 * is safe; note that rcu_read_lock() can't protect from the final
>> +	 * put_task_struct() after the last schedule().
>> +	 */
>> +	if (is_idle_task(cur) || (cur->flags & PF_EXITING))
>>  		cur = NULL;
>>
>>  	/*
>
> Oleg, I've looked once again, and now it doesn't look good to me.
> Where is the guarantee that this memory hasn't been allocated again?
> If it has, the PF_EXITING we test does not belong to the task we are
> interested in; the memory may not even be a task anymore.
>
> rcu_read_lock()                    ...                  ...
> cur = ACCESS_ONCE(dst_rq->curr);   ...                  ...
>                                    rq->curr = next;     ...
>                                    put_prev_task()      ...
>                                    __put_prev_task      ...
>                                    kmem_cache_free()    ...
>                                    ...                  <allocated again>
>                                    ...                  memset(, 0, )
>                                    ...                  ...
> if (cur->flags & PF_EXITING)       ...                  ...
>                                    ...                  ...
> get_task_struct()                  ...                  ...

How about this?

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b78280c..d46427e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1165,7 +1165,21 @@ static void task_numa_compare(struct task_numa_env *env,

 	rcu_read_lock();
 	cur = ACCESS_ONCE(dst_rq->curr);
-	if (cur->pid == 0) /* idle */
+	/*
+	 * No need to move the exiting task, and this ensures that ->curr
+	 * wasn't reaped and thus get_task_struct() in task_numa_assign()
+	 * is safe; note that rcu_read_lock() can't protect from the final
+	 * put_task_struct() after the last schedule().
+	 */
+	if (is_idle_task(cur) || (cur->flags & PF_EXITING))
+		cur = NULL;
+	/*
+	 * Check once again to be sure curr is still on dst_rq. Even if
+	 * it points on a new task, which is using the memory of freed
+	 * cur, it's OK, because we've locked RCU before
+	 * delayed_put_task_struct() callback is called to put its struct.
+	 */
+	if (cur != ACCESS_ONCE(dst_rq->curr))
 		cur = NULL;

 	/*
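The control flow of the proposed re-check can be written out in userspace C as below. This is only a sketch of the "snapshot, test, re-read" shape of the diff: struct task, struct runqueue, TASK_EXITING and pick_candidate() are hypothetical, C11 atomic loads stand in for ACCESS_ONCE(), and nothing here models RCU, freeing, or the extra read barrier discussed in the thread, so it says nothing about whether the check is sufficient in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TASK_EXITING 0x4u	/* hypothetical flag, stands in for PF_EXITING */

struct task {
	unsigned int flags;
	int is_idle;
};

struct runqueue {
	_Atomic(struct task *) curr;
};

/* Return the runqueue's current task if it is a usable candidate,
 * or NULL if it is idle, exiting, or was switched out meanwhile. */
static struct task *pick_candidate(struct runqueue *rq)
{
	struct task *cur = atomic_load(&rq->curr);

	assert(cur != NULL);	/* a runqueue always has a current task */

	/* First test: skip idle and exiting tasks. */
	if (cur->is_idle || (cur->flags & TASK_EXITING))
		cur = NULL;

	/* Second test: re-read the pointer; if it changed since the
	 * snapshot, the flags we just looked at may be stale. */
	if (cur != atomic_load(&rq->curr))
		cur = NULL;

	return cur;
}
```

Note that when the first test already cleared cur, the second comparison also yields NULL, which matches the behaviour of the two consecutive ifs in the diff.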
[GIT PULL] sound fixes for 3.18-rc1
Linus,

please pull sound fixes for v3.18-rc1 from:

  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git tags/sound-fix-3.18-rc1

The topmost commit is c8b00fd2f4c504a564adcad5b8bd6952ab850b02

sound fixes for 3.18-rc1

Here is a collection of small fixes after the 3.18 merge. The urgent
one is the fix for kernel panics with linked PCM substreams, triggered
by the recent nonatomic PCM ops support. The other two fixes (emu10k1
and bebob) are stable fixes, and there is one easy PCI ID addition for
a new Intel HD-audio controller.

James Ralston (1):
      ALSA: hda_intel: Add Device IDs for Intel Sunrise Point PCH

Takashi Iwai (2):
      ALSA: pcm: Fix referred substream in snd_pcm_action_group() unlock loop
      ALSA: emu10k1: Fix deadlock in synth voice lookup

Takashi Sakamoto (1):
      ALSA: bebob: Fix failure to detect source of clock for Terratec Phase 88

---
 sound/core/pcm_native.c               | 2 +-
 sound/firewire/bebob/bebob_terratec.c | 4 ++--
 sound/pci/emu10k1/emu10k1_callback.c  | 6 ++
 sound/pci/hda/hda_intel.c             | 4
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 85fe1a216225..bfe1cf6b492f 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -818,7 +818,7 @@ static int snd_pcm_action_group(struct action_ops *ops,
 	/* unlock streams */
 	snd_pcm_group_for_each_entry(s1, substream) {
 		if (s1 != substream) {
-			if (s->pcm->nonatomic)
+			if (s1->pcm->nonatomic)
 				mutex_unlock(&s1->self_group.mutex);
 			else
 				spin_unlock(&s1->self_group.lock);
diff --git a/sound/firewire/bebob/bebob_terratec.c b/sound/firewire/bebob/bebob_terratec.c
index eef8ea7d9b97..0e4c0bfc463b 100644
--- a/sound/firewire/bebob/bebob_terratec.c
+++ b/sound/firewire/bebob/bebob_terratec.c
@@ -17,10 +17,10 @@ phase88_rack_clk_src_get(struct snd_bebob *bebob, unsigned int *id)
 	unsigned int enable_ext, enable_word;
 	int err;

-	err = avc_audio_get_selector(bebob->unit, 0, 0, &enable_ext);
+	err = avc_audio_get_selector(bebob->unit, 0, 9, &enable_ext);
 	if (err < 0)
 		goto end;
-	err = avc_audio_get_selector(bebob->unit, 0, 0, &enable_word);
+	err = avc_audio_get_selector(bebob->unit, 0, 8, &enable_word);
 	if (err < 0)
 		goto end;

diff --git a/sound/pci/emu10k1/emu10k1_callback.c b/sound/pci/emu10k1/emu10k1_callback.c
index 3f3ef38d9b6e..874cd76c7b7f 100644
--- a/sound/pci/emu10k1/emu10k1_callback.c
+++ b/sound/pci/emu10k1/emu10k1_callback.c
@@ -85,6 +85,8 @@ snd_emu10k1_ops_setup(struct snd_emux *emux)
  * get more voice for pcm
  *
  * terminate most inactive voice and give it as a pcm voice.
+ *
+ * voice_lock is already held.
  */
 int
 snd_emu10k1_synth_get_voice(struct snd_emu10k1 *hw)
@@ -92,12 +94,10 @@ snd_emu10k1_synth_get_voice(struct snd_emu10k1 *hw)
 	struct snd_emux *emu;
 	struct snd_emux_voice *vp;
 	struct best_voice best[V_END];
-	unsigned long flags;
 	int i;

 	emu = hw->synth;

-	spin_lock_irqsave(&emu->voice_lock, flags);
 	lookup_voices(emu, hw, best, 1); /* no OFF voices */
 	for (i = 0; i < V_END; i++) {
 		if (best[i].voice >= 0) {
@@ -113,11 +113,9 @@ snd_emu10k1_synth_get_voice(struct snd_emu10k1 *hw)
 			vp->emu->num_voices--;
 			vp->ch = -1;
 			vp->state = SNDRV_EMUX_ST_OFF;
-			spin_unlock_irqrestore(&emu->voice_lock, flags);
 			return ch;
 		}
 	}
-	spin_unlock_irqrestore(&emu->voice_lock, flags);

 	/* not found */
 	return -ENOMEM;
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index aa302fb03fc5..cfcca4c30d4d 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -218,6 +218,7 @@ MODULE_SUPPORTED_DEVICE("{{Intel, ICH6},"
 			 "{Intel, LPT},"
 			 "{Intel, LPT_LP},"
 			 "{Intel, WPT_LP},"
+			 "{Intel, SPT},"
 			 "{Intel, HPT},"
 			 "{Intel, PBG},"
 			 "{Intel, SCH},"
@@ -1998,6 +1999,9 @@ static const struct pci_device_id azx_ids[] = {
 	/* Wildcat Point-LP */
 	{ PCI_DEVICE(0x8086, 0x9ca0),
 	  .driver_data = AZX_DRIVER_PCH | AZX_DCAPS_INTEL_PCH },
+	/* Sunrise Point */
+	{ PCI_DEVICE(0x8086, 0xa170),
+	  .driver_data = AZX_DRIVER_PCH | AZX_DCAPS_INTEL_PCH },
 	/* Haswell */
 	{ PCI_DEVICE(0x8086, 0x0a0c),
 	  .driver_dat
Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
And smp_rmb() beetween ifs which is pairs with rq unlocking > 18.10.2014, 12:15, "Kirill Tkhai" : > >> 18.10.2014, 01:40, "Oleg Nesterov" : >> >>> The lockless get_task_struct(tsk) is only safe if tsk == current >>> and didn't pass exit_notify(), or if this tsk was found on a rcu >>> protected list (say, for_each_process() or find_task_by_vpid()). >>> IOW, it is only safe if release_task() was not called before we >>> take rcu_read_lock(), in this case we can rely on the fact that >>> delayed_put_pid() can not drop the (potentially) last reference >>> until rcu_read_unlock(). >>> >>> And as Kirill pointed out task_numa_compare()->task_numa_assign() >>> path does get_task_struct(dst_rq->curr) and this is not safe. The >>> task_struct itself can't go away, but rcu_read_lock() can't save >>> us from the final put_task_struct() in finish_task_switch(); this >>> reference goes away without rcu gp. >>> >>> Reported-by: Kirill Tkhai >>> Signed-off-by: Oleg Nesterov >>> --- >>> kernel/sched/fair.c | 8 +++- >>> 1 files changed, 7 insertions(+), 1 deletions(-) >>> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >>> index 0090e8c..52049b9 100644 >>> --- a/kernel/sched/fair.c >>> +++ b/kernel/sched/fair.c >>> @@ -1158,7 +1158,13 @@ static void task_numa_compare(struct task_numa_env >>> *env, >>> >>> rcu_read_lock(); >>> cur = ACCESS_ONCE(dst_rq->curr); >>> - if (cur->pid == 0) /* idle */ >>> + /* >>> + * No need to move the exiting task, and this ensures that ->curr >>> + * wasn't reaped and thus get_task_struct() in task_numa_assign() >>> + * is safe; note that rcu_read_lock() can't protect from the final >>> + * put_task_struct() after the last schedule(). >>> + */ >>> + if (is_idle_task(cur) || (cur->flags & PF_EXITING)) >>> cur = NULL; >>> >>> /* >> >> Oleg, I've looked once again, and now it's not good for me. >> Where is the guarantee this memory hasn't been allocated again? 
>> If so, PF_EXITING is not of the task we are interesting, but it's >> not a task's even. >> >> rcu_read_lock() ... ... >> cur = ACCESS_ONCE(dst_rq->curr); ... ... >> rq->curr = next; ... >> put_prev_task() ... >> __put_prev_task ... >> kmem_cache_free() ... >> ... >> ... memset(, 0, ) >> ... ... >> if (cur->flags & PF_EXITING) ... ... >> ... ... >> get_task_struct() ... ... > > How about this? > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index b78280c..d46427e 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -1165,7 +1165,21 @@ static void task_numa_compare(struct task_numa_env > *env, > > rcu_read_lock(); > cur = ACCESS_ONCE(dst_rq->curr); > - if (cur->pid == 0) /* idle */ > + /* > + * No need to move the exiting task, and this ensures that ->curr > + * wasn't reaped and thus get_task_struct() in task_numa_assign() > + * is safe; note that rcu_read_lock() can't protect from the final > + * put_task_struct() after the last schedule(). > + */ > + if (is_idle_task(cur) || (cur->flags & PF_EXITING)) > + cur = NULL; > + /* > + * Check once again to be sure curr is still on dst_rq. Even if > + * it points on a new task, which is using the memory of freed > + * cur, it's OK, because we've locked RCU before > + * delayed_put_task_struct() callback is called to put its struct. > + */ > + if (cur != ACCESS_ONCE(dst_rq->curr)) > cur = NULL; > > /* > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
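The read, filter, re-check sequence proposed above can be modelled in user space. This is only a sketch of the control flow: struct task, struct runqueue, pick_swap_candidate() and the PF_EXITING value below are illustrative stand-ins, not the kernel definitions, and C11 atomics give none of the lifetime guarantees that rcu_read_lock() gives in the kernel, so the sketch says nothing about whether dereferencing the pointer is safe.

```c
#include <stdatomic.h>
#include <stddef.h>

#define PF_EXITING 0x4u			/* illustrative value only */

struct task {
	unsigned int flags;
	int is_idle;
};

struct runqueue {
	_Atomic(struct task *) curr;	/* stand-in for rq->curr */
};

/*
 * Returns the task currently published on rq, or NULL when it is not a
 * usable candidate. The second load mirrors the proposed
 * "cur != ACCESS_ONCE(dst_rq->curr)" re-check.
 */
static struct task *pick_swap_candidate(struct runqueue *rq)
{
	struct task *cur = atomic_load_explicit(&rq->curr, memory_order_acquire);

	if (cur == NULL || cur->is_idle || (cur->flags & PF_EXITING))
		return NULL;

	/* The pointer must still be the published one after the checks. */
	if (cur != atomic_load_explicit(&rq->curr, memory_order_acquire))
		return NULL;

	return cur;
}
```

The acquire loads here stand in loosely for the ordering the smp_rmb() remark is about; which barrier the kernel code actually needs is a question for the thread, not something this model settles.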
Re: [PATCH] reset: socfpga: use arch_initcall for early initialization
On Friday 10 October 2014 11:32:35 atull wrote:
> On Thu, 9 Oct 2014, Steffen Trumtrar wrote:
> > Do you have an example where this is really needed?
>
> My last version of the fpga manager framework
> (https://lkml.org/lkml/2014/8/1/518)
> added fpga_mgr_firmware_write(). This can be called from a device driver's
> probe function to request a fpga image be loaded. I want to support FPGA
> based functionality being seen pretty similar to really hard hardware. So
> the FPGA could have a PCI bus or something else that would want to be
> early.

Please be more specific. I agree we need a good reason for not just using
deferred probing, and PCI host bridges in general are no longer something
that needs to be probed early.

If you have a particular use case in mind that can't be solved in a better
way, we can talk about making this an earlier initcall (probably not
arch_initcall), but in general we try hard to avoid new ones like this.

	Arnd
Re: [PATCH v5 09/12] Driver core: Unified interface for firmware node properties
On Friday 17 October 2014 14:14:53 Rafael J. Wysocki wrote:
> +/**
> + * fwnode_property_present - check if a property of a firmware node is
> present
> + * @fwnode: Firmware node whose property to check
> + * @propname: Name of the property
> + */
> +bool fwnode_property_present(struct fwnode_handle *fwnode, const char
> *propname)
> +{
> +	if (is_of_node(fwnode))
> +		return of_property_read_bool(of_node(fwnode), propname);
> +	else if (is_acpi_node(fwnode))
> +		return !acpi_dev_prop_get(acpi_node(fwnode), propname, NULL);
> +
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(fwnode_property_present);
>

Should this be

	return acpi_dev_prop_get(acpi_node(fwnode), propname, NULL);

without the '!'?

I'm also unsure about the '_present' vs '_read_bool' naming. IIRC we had
a long debate about this before we decided on 'read_bool' for DT, and I
don't really want to start a new debate, but being consistent would be
nice. We could of course have

static inline bool fwnode_property_read_bool(struct fwnode_handle *fwnode,
					     const char *propname)
{
	return fwnode_property_present(fwnode, propname);
}

which is completely redundant, but would help drivers using the interface
to document whether they are checking for a bool property that we expect
to be either empty or absent (_read_bool), or checking for the presence
of a non-empty property (_present).

	Arnd
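To illustrate the naming point, here is a minimal user-space sketch of the "redundant wrapper" idea. struct fake_fwnode and both helpers are hypothetical stand-ins for the kernel's struct fwnode_handle and fwnode_property_*() functions; the property store is just a NULL-terminated array of names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a firmware node: a NULL-terminated name list. */
struct fake_fwnode {
	const char *const *props;
};

/* True when a property called propname exists on the node. */
static bool fake_property_present(const struct fake_fwnode *node,
				  const char *propname)
{
	for (size_t i = 0; node->props[i] != NULL; i++)
		if (strcmp(node->props[i], propname) == 0)
			return true;
	return false;
}

/*
 * Same behaviour as fake_property_present(); the separate name only
 * documents that the caller treats the property as a boolean flag.
 */
static inline bool fake_property_read_bool(const struct fake_fwnode *node,
					   const char *propname)
{
	return fake_property_present(node, propname);
}
```

A driver in this model would call fake_property_read_bool(node, "wakeup-source") for a flag-style property and fake_property_present() when it goes on to read the property's value; the two calls return the same result and differ only in what they tell the reader.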
Re: [PATCH v5 10/12] gpio: Support for unified device properties interface
On Friday 17 October 2014 20:09:51 Arnd Bergmann wrote:
> On October 17, 2014 2:16:00 PM CEST, "Rafael J. Wysocki"
> wrote:
> >From: Mika Westerberg
> >
> >Some drivers need to deal with only firmware representation of its
> >GPIOs. An example would be a GPIO button array driver where each button
> >is described as a separate firmware node in device tree. Typically
> >these
> >child nodes do not have physical representation in the Linux device
> >model.
> >
> >In order to help device drivers to handle such firmware child nodes we
> >add dev[m]_get_named_gpiod_from_child() that takes a child firmware
> >node pointer as its second argument (the first one is the parent device
> >itself), finds the GPIO using whatever is the underlying firmware
> >method, and requests the GPIO properly.
>
> Could we also have a wrapper around this function without a "name" argument,
> using just the index?

Expanding on this thought: I think we should mandate for new bindings
that they use either a name and no index, or an index but no name,
and I also think that for named gpios, we should try to converge on a
common naming scheme.

As discussed, we will probably want to support all the existing ways to
do this even with ACPI and with the unified interface, but it doesn't
have to be the obvious way. We could do it like this:

// internal implementation, may be called from drivers with legacy bindings
struct gpio_desc *__fwnode_get_gpiod_from_property(struct fwnode_handle *fwnode,
						   const char *propname, int index)
{
	...
	/* your current code */
}

// recommended interface
static inline struct gpio_desc *fwnode_get_gpiod(struct fwnode_handle *fwnode,
						 int index)
{
	return __fwnode_get_gpiod_from_property(fwnode, "gpios", index);
}

// alternative interface
struct gpio_desc *fwnode_get_gpiod(struct fwnode_handle *fwnode, const char *name)
{
	char propname[64];
	int ret;

	ret = snprintf(propname, sizeof(propname), "%s-gpios", name);
	if (ret >= (int)sizeof(propname))
		return ERR_PTR(-EINVAL);

	return __fwnode_get_gpiod_from_property(fwnode, propname, 0);
}

The above is just a suggestion, I'm hoping for the GPIO maintainers
to provide more guidance if they have other ideas.

	Arnd
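The "<name>-gpios" property-name construction in the alternative interface can be tried in user space. A minimal sketch, assuming a hypothetical helper build_gpio_propname() that is not part of any kernel API; it relies only on the standard C snprintf() contract, which returns the length the full string would have had:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "<name>-gpios" in buf. Returns 0 on success, -1 if the result
 * did not fit. A return value equal to the buffer size already means
 * the terminating NUL displaced the last character, hence ">=".
 */
static int build_gpio_propname(char *buf, size_t size, const char *name)
{
	int ret = snprintf(buf, size, "%s-gpios", name);

	if (ret < 0 || (size_t)ret >= size)
		return -1;
	return 0;
}
```

With a 16-byte buffer, "reset" expands to "reset-gpios" and fits; with an 8-byte buffer the same call reports truncation instead of handing back a clipped property name.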
Re: [PATCH 1/4] blkdev: add flush generation counter
On Thu, Oct 16, 2014 at 5:14 PM, Dmitry Monakhov wrote:
> Ming Lei writes:
>
>> On Tue, Oct 14, 2014 at 12:03 PM, Dmitry Monakhov
>> wrote:
>>> PROOF:
>>> *Flush machinery assumptions
>>> C1. At any given time, only one flush shall be in progress. This makes
>>> double buffering sufficient.
>>> C2. Flush is deferred if any request is executing DATA of its sequence.
>>> This avoids issuing separate POSTFLUSHes for requests which shared
>>> PREFLUSH.
>>> C3. The second condition is ignored if there is a request which has
>>> waited longer than FLUSH_PENDING_TIMEOUT. This is to avoid
>>> starvation in the unlikely case where there are continuous stream of
>>> FUA (without FLUSH) requests.
>>>
>>> So if we will increment flush_tag counter in two places:
>>> blk_kick_flush: the place where flush request is issued
>>> flush_end_io : the place where flush is completed
>>> And by using rule (C1) we can guarantee that:
>>> if ((flush_tag & 0x1) == 1) then flush_tag is in progress
>>> if ((flush_tag & 0x1) == 0) then (flush_tag & ~(0x1)) completed
>>> In other words we can define it as:
>>>
>>> FLUSH_TAG_IDX = (flush_tag + 1) & ~0x1
>>> FLUSH_TAG_STATE = flush_tag & 0x1 ? IN_PROGRESS : COMPLETED
>>>
>>> After that we can define rules for flush optimization:
>>> We can be sure that our data was flushed only if:
>>> 1) data's bio was completed before flush request was QUEUED
>>> and COMPLETED
>>> So in terms of formulas we can write it like follows:
>>> is_data_flushed = (blkdev->flush_tag & ~(0x1)) >
>>> ((data->flush_tag + 0x1) & (~0x1))
>>
>> Looks like you are just trying to figure out if the 'data' is flushed
>> or not, so care to explain a bit what the optimization is?
> Indeed I try to understand whether data was flushed or not.
> inodes on filesystem may update this ID inside ->end_io callback.
> Later if user called fsync/fdatasync we may figure out that inode's
> data was flushed already so we do not have to issue explicit barrier.
> This helps for intensive multi-file dio-aio/fsync workloads
> chunk servers, virt-disk images, mail server, etc.
>
> for (i = 0; i < ...; i++) pwrite(fd[i], buf, 4096, 0)
> for (i = 0; i < ...; i++) fdatasync(fd[i])
> Before this optimization we have to issue a barrier to each fdatasync
> one by one, and send only one when optimization enabled.

Could you share some performance data about the optimization?

The flush command itself shouldn't be very expensive, but flushing
cache to storage from drive should be. Maybe the following simple
optimization can be done in blk-flush for your dio case:

- only issue flush command to drive if there are writes completed since
last flush.

>>>
>>>
>>> In order to support stacked block devices (especially linear dm)
>>> I've implemented get_flush_idx function as queue's callback.
>>>
>>> *Multi-Queue scalability notes*
>>> This implementation tries to make a global optimization for all hw-queues
>>> of a device, which requires a read from each hw-queue as follows:
>>> queue_for_each_hw_ctx(q, hctx, i)
>>> fid += ACCESS_ONCE(hctx->fq->flush_idx)
>>
>> I am wondering if it can work, suppose request A is submitted
>> to hw_queue 0, and request B is submitted to hw_queue 1, then
>> you may think request A has been flushed out when request
>> B is just flushed via hctx 1.
> Correct, because we know that at the moment A and B was completed
> at least one barrier was issued and completed (even via hwctx 2)
> Flush has blkdev/request_queue-wide guarantee regardless of hwctx.
> If that is not the case for MQ then all filesystems are totally broken,
> and blkdev_issue_flush MUST be changed to flush all hwctx.

It is my fault, your approach is correct.

Thanks,

>>
>>>
>>> In my tests I do not see any visible difference in performance on
>>> my hardware (W2600CR: 8cores x 2 sockets, 2numa nodes).
>>> Really fast HW may prefer to return flush_id for a single hw-queue
>>> in order to do so we have to encode flush_id with hw_queue_id
>>> like follows:
>>> fid_t = (hctx->fq->flush_id << MQ_SHIFT) | hctx->id
>>> #define MQ_ID(fid) (fid & ((1 << MQ_SHIFT) -1))
>>> Later blk_flush_idx_is_stable() can treat fid_t as unstable if
>>> it was obtained from another hw-queue:
>>>
>>> bool blk_flush_idx_is_stable(struct request_queue *q, fid_t fid)
>>> {
>>> 	int difference;
>>> 	fid_t cur = blk_get_flush_idx(q, false);
>>> 	if (MQ_ID(fid) != MQ_ID(cur))
>>> 		return 0;
>>>
>>> 	difference = (cur - fid);
>>> 	return (difference > 0);
>>>
>>> }
>>> Please let me know if you prefer that design to global one.
>>>
>>> CC: Jens Axboe
>>> CC: Christoph Hellwig
>>> CC: Ming Lei
>>> Signed-off-by: Dmitry Monakhov
>>> ---
>>> block/blk-core.c |1 +
>>> block/blk-flush.c | 50
>>> +++-
>>> block/blk-mq.c |5 ++-
>>> block/blk-settings.c |6 +
>>> block/blk-sysfs.c | 11 ++
>>> block/blk.h
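The odd/even generation counter described in the quoted patch text can be checked with a small user-space model. This is a sketch only: struct flush_gen and the helper names are invented for illustration, counter wrap-around is ignored, and nothing here is the block-layer code. It relies on rule C1 (one flush in flight at a time), so the counter is odd exactly while a flush is in progress.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device counter: bumped on flush issue and completion. */
struct flush_gen {
	uint32_t tag;
};

static void flush_issue(struct flush_gen *g)    { g->tag++; }	/* even -> odd */
static void flush_complete(struct flush_gen *g) { g->tag++; }	/* odd -> even */

static bool flush_in_progress(const struct flush_gen *g)
{
	return (g->tag & 1u) != 0;
}

/*
 * data_tag is the counter value sampled when the data's bio completed.
 * The data counts as flushed only once a flush that was issued after
 * that point has also completed:
 *   (dev_tag & ~1) > ((data_tag + 1) & ~1)
 */
static bool data_is_flushed(uint32_t dev_tag, uint32_t data_tag)
{
	return (dev_tag & ~1u) > ((data_tag + 1u) & ~1u);
}
```

Walking through it: data sampled at tag 0 (no flush running) is reported flushed at tag 2, after one full issue/complete cycle. Data sampled at tag 1 (a flush already in flight, which may have started before the write completed) is not trusted to that flush and is only reported flushed at tag 4.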
Re: [PATCH resend] [media] rc-core: fix protocol_change regression in ir_raw_event_register
On Thu, Oct 16, 2014 at 11:49 PM, David Härdeman wrote:
> I think this is already addressed in this thread:
> http://www.spinics.net/lists/linux-media/msg79865.html

The patch in that thread would have broken things since the
store_protocol function is not changed at the same time. The patch I
sent also takes that into account.

My concern is still that user space behaviour changes. In my case,
lirc simply does not work anymore. More generically, anyone now using
e.g. nuvoton-cir with anything other than RC6_MCE will not get their
devices working without first explicitly enabling the correct protocol
from sysfs or with ir-keytable.

Correct me if I'm wrong but the change_protocol function in struct
rc_dev is meant for changing hardware decoder protocols, which means
only a few drivers actually use it. So the empty change_protocol
function added to rc-ir-raw.c doesn't really make sense in the first
place.

Tomas
LTTng-UST bytecode interpreter
Hi Alexei,

Following our Plumbers discussion, here are links to lttng-ust and
lttng-tools parts that are relevant to the bytecode I use for
tracepoint filtering:

http://git.lttng.org/?p=lttng-tools.git;a=summary

src/lib/lttng-ctl/filter/*.[ch]
  -> parser of filter expressions to AST, then to intermediate
     representation, followed by bytecode generation. The bytecode
     is then moved from the client to the application being traced
     through the lttng-sessiond daemon.

http://git.lttng.org/?p=lttng-ust.git;a=summary

liblttng-ust/lttng-filter.c:
  filter "linker" attaching bytecode to tracepoint.
  _lttng_filter_event_link_bytecode() has all the steps required
  to translate a bytecode into something the interpreter can use.

liblttng-ust/lttng-filter-specialize.c:
  Perform type specialization of some opcodes. This is done after
  linking to an event's fields, now that we know their type.

liblttng-ust/lttng-filter-validator.c
  Validation of the bytecode: making sure typing is consistent,
  checks there are no loops (no backward jump).

liblttng-ust/lttng-filter-interpreter.c
  Bytecode interpreter, executes quickly without any checks,
  relying on the fact that they were already performed by the
  validator. It is a threaded interpreter which has 2 registers
  aliasing the top of its stack.

My general approach is to use an interpreter to deal with the
general case, which makes porting to new architectures easy. We can
then have JIT phases if we want to eventually translate this
bytecode into native instructions. Working with a bytecode which has
a slightly higher level semantic allows dealing with strings as a
basic type in addition to integers and floating point values.

Please note that the current bytecode is limited to 64-bit
integers. We can eventually extend it to be more compact
(8, 16, 32-bit integers).

Thoughts?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
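As an illustration of the "no loops (no backward jump)" check mentioned for the validator, here is a minimal sketch. The instruction encoding (enum op, struct insn) is hypothetical and is not the LTTng-UST bytecode format; the point is only that rejecting every jump whose target is not strictly ahead of it guarantees the program terminates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes, not the LTTng-UST instruction set. */
enum op { OP_LOAD, OP_CMP, OP_JUMP_IF_FALSE, OP_RETURN };

struct insn {
	enum op op;
	size_t target;	/* instruction index; used only by OP_JUMP_IF_FALSE */
};

/*
 * Accept the program only if every jump goes strictly forward and stays
 * inside the program. With no backward edges there can be no loop, so an
 * interpreter can run validated code without a step limit.
 */
static bool validate_forward_only(const struct insn *code, size_t len)
{
	for (size_t pc = 0; pc < len; pc++) {
		if (code[pc].op != OP_JUMP_IF_FALSE)
			continue;
		if (code[pc].target <= pc || code[pc].target >= len)
			return false;
	}
	return true;
}
```

Doing this once at link time is what lets an interpreter of the kind described above skip per-instruction checks at run time.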
LTTng-UST bytecode interpreter
Hi Alexei,

Following our Plumbers discussion, here are links to lttng-ust and
lttng-tools parts that are relevant to the bytecode I use for
tracepoint filtering:

http://git.lttng.org/?p=lttng-tools.git;a=summary

src/lib/lttng-ctl/filter/*.[ch]
  -> parser of filter expressions to AST, then to intermediate
     representation, followed by bytecode generation. The bytecode
     is then moved from the client to the application being traced
     through the lttng-sessiond daemon.

http://git.lttng.org/?p=lttng-ust.git;a=summary

liblttng-ust/lttng-filter.c:
  filter "linker" attaching bytecode to tracepoint.
  _lttng_filter_event_link_bytecode() has all the steps required
  to translate a bytecode into something the interpreter can use.

liblttng-ust/lttng-filter-specialize.c:
  Perform type specialization of some opcodes. This is done after
  linking to an event's fields, now that we know their type.

liblttng-ust/lttng-filter-validator.c
  Validation of the bytecode: making sure typing is consistent,
  checks there are no loops (no backward jump).

liblttng-ust/lttng-filter-interpreter.c
  Bytecode interpreter, executes quickly without any checks,
  relying on the fact that they were already performed by the
  validator. It is a threaded interpreter which has 2 registers
  aliasing the top of its stack.

My general approach is to use an interpreter to deal with the
general case, which makes porting to new architectures easy. We can
then have JIT phases if we want to eventually translate this
bytecode into native instructions. Working with a bytecode which has
a slightly higher level semantic allows dealing with strings as a
basic type in addition to integers and floating point values.

Please note that the current bytecode is limited to 64-bit
integers. We can eventually extend it to be more compact
(8, 16, 32-bit integers).

This is just provided as input in case some ideas can be useful
for your work on eBPF.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Re: [PATCH] i8k: Add support for Dell Latitude E6440
On Sat, Oct 18, 2014 at 12:23:39AM +0200, Pali Rohár wrote:
> On Friday 10 October 2014 22:56:55 Guenter Roeck wrote:
> > On 10/10/2014 02:12 AM, Pali Rohár wrote:
> > > Dell Latitude E6440 needs same settings as E6540.
> > >
> > > Signed-off-by: Pali Rohár
> >
> > Acked-by: Guenter Roeck
>
> Greg, Arnd: PING
>
> Can you apply also this patch for E6440? Support for E6540 is
> already in linus tree: 06c88b0d7ad87540405aea7f91d98ef43be04c95

I can't apply anything to my trees until 3.18-rc1 is out, but will do
this after that happens.

thanks,

greg k-h
Re: CRASH during boot 3.16.3+
On 2014-10-13 14:26, Chuck Ebbert wrote:
>> How can I capture the output easily?
>
> Add "boot_delay=3000" to the kernel command line. This will add a 3
> second delay between each line. Then take pictures of the screen
> while it boots. And try to time the shots so you take them during the
> delays.

On the first system I tested (one that I do not really mind rebooting)
this parameter appears to have no effect.

Kind regards,
Udo
Re: [PATCH] clocksource, Add warning to clocksource_delta() validation code
On 10/17/2014 02:27 PM, John Stultz wrote:
> On Fri, Oct 17, 2014 at 11:23 AM, Prarit Bhargava wrote:
>>
>>
>> On 10/17/2014 02:17 PM, John Stultz wrote:
>>> On Fri, Oct 17, 2014 at 6:57 AM, Prarit Bhargava wrote:
>>>> A bug report came in against an older kernel which output "backward
>>>> time" messages and the report noted that the upstream kernel worked.
>>>> After some investigation it turned out that one of the sockets was bad
>>>> on the system and the "backward time" messages were caused by a real,
>>>> but intermittent, hardware failure.
>>>>
>>>> Commit 09ec54429c6d10f87d1f084de53ae2c1c3a81108 ("clocksource: Move
>>>> cycle_last validation to core code") modifies the x86 clocksource such
>>>> that if a negative delta between two reads of time is calculated the
>>>> clocksource_delta() code will return 0. There is no warning when this
>>>> occurs and there really should be one in order to catch not only
>>>> hardware issues like the issue above, but potential coding issues as
>>>> the code is modified.
>>>>
>>>> This patch introduces a WARN() which will also dump a stack trace to
>>>> the console so the exact code path can be evaluated.
>>>>
>>>> I tested this by booting on the broken hardware and left the system
>>>> idle until a negative clocksource_delta() event occurred.
>>>>
>>>> Cc: John Stultz
>>>> Cc: Thomas Gleixner
>>>> Signed-off-by: Prarit Bhargava
>>>> ---
>>>> kernel/time/timekeeping_internal.h |7 ++-
>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
>>>> index 4ea005a..abe6bc8 100644
>>>> --- a/kernel/time/timekeeping_internal.h
>>>> +++ b/kernel/time/timekeeping_internal.h
>>>> @@ -17,7 +17,12 @@
>>>>  static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
>>>>  {
>>>>  	cycle_t ret = (now - last) & mask;
>>>>
>>>> -	return (s64) ret > 0 ? ret : 0;
>>>> +	if ((s64)ret > 0)
>>>> +		return ret;
>>>> +
>>>> +	WARN(1, "Clocksource calculated negative delta, %lld. last = %llu, now = %llu, mask = %llx\n",
>>>> +	     (s64)ret, last, now, mask);
>>>> +	return 0;
>>>
>>>
>>> I realize you followed up that this wasn't finished, but just as some
>>> feedback, there's a number of types of hardware where there may be a
>>> very slight skew between cpu TSC, and this will briefly trigger right
>>> after each timekeeping update if a system is reading the clock
>>> frequently (think of the case where the update happens on the cpu
>>> thats just a little bit ahead, while a timestamping loop is running on
>>> a cpu that is a little bit behind).
>>
>> Ah, interesting. Okay ... drop this patch then. Thanks for the info John.
>
> If you're wanting something that aids with debugging, maybe some sort
> of calmly stated warn-once in the dmesg might be ok, that or some other
> flag exported via a debugging interface.

IMO with the clock code I'd prefer it to be 100% accurate. There is
nothing more annoying than going through the rigors of debugging and
discovering that it is a hardware issue of some sort...

This can be dropped IMO. It's really not that important.

P.

>
> thanks
> -john
>
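For readers who want to see the clamping behaviour under discussion, here is a user-space model of the pre-patch clocksource_delta(). It is a sketch under stated assumptions: cycle_t is modelled as uint64_t, s64 as int64_t, and the function body is copied from the removed line in the quoted hunk; the WARN() variant is not modelled.

```c
#include <stdint.h>

typedef uint64_t cycle_t;	/* stand-in for the kernel's cycle_t */

/*
 * Masked difference between two counter reads. If the masked result,
 * viewed as a signed 64-bit value, is not positive, report 0 instead of
 * a huge bogus interval.
 */
static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
{
	cycle_t ret = (now - last) & mask;

	return (int64_t)ret > 0 ? ret : 0;
}
```

With a full 64-bit mask, reading 90 after 100 yields 0 rather than a value near 2^64. In this model a narrower mask (for example 0xffffffff) clears the sign bit of the result, so an apparent step backwards there is indistinguishable from a counter wrap and comes out as a positive delta.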
threadirqs and kthreadd_task
Hi,

I have been trying to enable the "threadirqs" command line option on a
3.10.12 kernel, and the kernel crashes with a NULL pointer access for
kthreadd_task in the wake_up_process function:

[0.00] Call Trace:
[0.00] [<8005544c>] wake_up_process+0xc/0x48
[0.00] [<8004c1b8>] kthread_create_on_node+0x64/0xc4
[0.00] [<8006d338>] __setup_irq+0x130/0x634
[0.00] [<8006d89c>] setup_irq+0x60/0xb4
[0.00] [<8050267c>] icu_of_init+0x1f8/0x2f8
[0.00] [<80517b30>] of_irq_init+0x2c8/0x304
[0.00] [<805007bc>] start_kernel+0x218/0x3a4

But as far as I can see, kthreadd_task is initialized in start_kernel -->
rest_init, which runs much later than interrupt setup (init_IRQ).

So my question is: how do systems that enable the "threadirqs" command
line option work (the code flow seems similar even on the 3.17 kernel)?
Am I missing the obvious?

thanks for pointers
--
Ajay
Re: getaddrinfo slowdown in 3.17.1, due to getifaddrs
On Fri, Oct 17, 2014 at 02:34:30AM +0200, Steinar H. Gunderson wrote:
> e341694e3eb57fcda9f1adc7bfea42fe080d8d7a looks like it might cause something
> like this (it certainly added the synchronize_net() call). Cc-ing people on
> that commit; quoting the entire rest of the message for reference.

I see there's discussion on what to do with this; thanks. :-)

FWIW, I've verified that reverting these four patches (in that order) fixes
the problem:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/net/netlink/af_netlink.c?id=9ce12eb16ffb143f3a509da86283ddd0b10bcdb3
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/net/netlink/af_netlink.c?id=6c8f7e70837468da4e658080d4448930fb597e1b
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/net/netlink/af_netlink.c?id=67a24ac18b0262178ba9f05501b2c6e6731d449a
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/net/netlink/af_netlink.c?id=e341694e3eb57fcda9f1adc7bfea42fe080d8d7a

My Perl script is seemingly not the only affected program on the system;
witness the average page load times for our PHP-based home page:

http://home.samfundet.no/~sesse/web_load_time_samfundet_no-week.png

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PATCH] iio: inkern: Add of_xlate function to struct iio_dev
On 02/10/14 13:32, Ivan T. Ivanov wrote:
> When #iio-cells is greater than '0', the driver could provide
> a custom of_xlate function that reads the *args* and returns
> the appropriate index in registered IIO channels array.
>
> Add simple translation function, suitable for the most 1:1
> mapped channels in IIO chips, and use it when driver did not
> provide custom implementation.
>
> Signed-off-by: Ivan T. Ivanov

Any more comments on this? It has been sitting for a while and the
discussion seems to have died out. As Ivan has pointed out, very similar
approaches are used elsewhere (gpio for example).

> ---
> drivers/iio/inkern.c| 32 +++-
> include/linux/iio/iio.h | 8
> 2 files changed, 35 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
> index f084610..6c3e478 100644
> --- a/drivers/iio/inkern.c
> +++ b/drivers/iio/inkern.c
> @@ -100,6 +100,28 @@ static int iio_dev_node_match(struct device *dev, void
> *data)
>  	return dev->of_node == data && dev->type == &iio_device_type;
>  }
>
> +/**
> + * __of_iio_simple_xlate - translate iiospec to the IIO channel index
> + * @indio_dev: pointer to the iio_dev structure
> + * @iiospec: IIO specifier as found in the device tree
> + *
> + * This is simple translation function, suitable for the most 1:1 mapped
> + * channels in IIO chips. This function performs only one sanity check:
> + * whether IIO index is less than num_channels (that is specified in the
> + * iio_dev).
> + */
> +static int __of_iio_simple_xlate(struct iio_dev *indio_dev,
> +				const struct of_phandle_args *iiospec)
> +{
> +	if (!iiospec->args_count)
> +		return 0;
> +
> +	if (iiospec->args[0] >= indio_dev->num_channels)
> +		return -EINVAL;
> +
> +	return iiospec->args[0];
> +}
> +
>  static int __of_iio_channel_get(struct iio_channel *channel,
>  				struct device_node *np, int index)
>  {
> @@ -122,18 +144,18 @@ static int __of_iio_channel_get(struct iio_channel
> *channel,
>
>  	indio_dev = dev_to_iio_dev(idev);
>  	channel->indio_dev = indio_dev;
> -	index = iiospec.args_count ? iiospec.args[0] : 0;
> -	if (index >= indio_dev->num_channels) {
> -		err = -EINVAL;
> +	if (!indio_dev->of_xlate)
> +		indio_dev->of_xlate = __of_iio_simple_xlate;
> +	index = indio_dev->of_xlate(indio_dev, &iiospec);
> +	if (index < 0)
>  		goto err_put;
> -	}
>  	channel->channel = &indio_dev->channels[index];
>
>  	return 0;
>
>  err_put:
>  	iio_device_put(indio_dev);
> -	return err;
> +	return index;
>  }
>
>  static struct iio_channel *of_iio_channel_get(struct device_node *np, int
> index)
> diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
> index 15dc6bc..d5bb219 100644
> --- a/include/linux/iio/iio.h
> +++ b/include/linux/iio/iio.h
> @@ -13,6 +13,7 @@
>  #include
>  #include
>  #include
> +#include
>  /* IIO TODO LIST */
>  /*
>   * Provide means of adjusting timer accuracy.
> @@ -413,6 +414,11 @@ struct iio_buffer_setup_ops {
>   * @currentmode: [DRIVER] current operating mode
>   * @dev: [DRIVER] device structure, should be assigned a parent
>   * and owner
> + * @of_xlate: [DRIVER] function pointer to obtain channel
> specifier
> + * index. When #iio-cells is greater than '0', the driver
> + * could provide a custom of_xlate function that reads
> + * the *args* and returns the appropriate index in
> + * registered IIO channels array.
>   * @event_interface: [INTERN] event chrdevs associated with interrupt lines
>   * @buffer: [DRIVER] any buffer present
>   * @buffer_list: [INTERN] list of all buffers currently attached
> @@ -451,6 +457,8 @@ struct iio_dev {
>  	int currentmode;
>  	struct device dev;
>
> +	int (*of_xlate)(struct iio_dev *indio_dev,
> +			const struct of_phandle_args *iiospec);
>  	struct iio_event_interface *event_interface;
>
>  	struct iio_buffer *buffer;
> --
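The default-translation-with-override pattern in the quoted patch can be modelled outside the kernel. In this sketch struct adc_dev and struct phandle_args are hypothetical stand-ins for struct iio_dev and struct of_phandle_args, and, unlike the patch, the default is called directly rather than being stored into the device's of_xlate pointer.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ARGS 4

/* Hypothetical stand-in for struct of_phandle_args. */
struct phandle_args {
	int args_count;
	unsigned int args[MAX_ARGS];
};

/* Hypothetical stand-in for struct iio_dev. */
struct adc_dev {
	int num_channels;
	int (*of_xlate)(const struct adc_dev *dev,
			const struct phandle_args *spec);
};

/* 1:1 mapping: the first specifier cell is the channel index. */
static int simple_xlate(const struct adc_dev *dev,
			const struct phandle_args *spec)
{
	if (!spec->args_count)
		return 0;
	if (spec->args[0] >= (unsigned int)dev->num_channels)
		return -EINVAL;
	return (int)spec->args[0];
}

/* Use the driver's translation when present, else the simple default. */
static int channel_index(const struct adc_dev *dev,
			 const struct phandle_args *spec)
{
	if (dev->of_xlate)
		return dev->of_xlate(dev, spec);
	return simple_xlate(dev, spec);
}
```

A negative return doubles as the error code, which is why the patched __of_iio_channel_get() can return index directly from its error path.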
Re: [PATCH 3.12 000/197] 3.12.31-stable review
Jiri Slaby wrote:
> This is the start of the stable review cycle for the 3.12.31 release.
> There are 197 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.

Hi Jiri,

the commit mentioned below has already been backported to 3.10 stable,
can you please add it to 3.12 stable?

A test build with patch-3.12.31-rc1 and the commit in question builds
and runs fine for me.

Thank you in advance.

Regards
Ingmar

Forwarded message
Subject: linux-stable: backport usb related commit to 3.10 and 3.12?
Date: Sat, 23 Aug 2014 13:19:09 +0200
To: linux-kernel@vger.kernel.org

Greetings.

On an administered server, I noticed that there's a constant load
around 0.70 when running kernels 3.10 and 3.12, even if the system is
doing nothing, and in single user mode. Culprit seems to be the process
'khubd'. No such effect when running kernels 3.4 or 3.14.

I reverse bisected linux-stable and found that the following commit
fixes it for me, for both kernel 3.10 and 3.12:

[08d1dec6f4054e3613f32051d9b149d4203ce0d2]
usb:hub set hub->change_bits when over-current happens

Could that commit be backported to the stable 3.10 and 3.12 series?

(Disclaimer: I'm not a programmer, but I'd be happy to help with any
testing involved)

Regards
Ingmar

--
$ git diff ac5166bcdb43889a5bd837f5076b78049e1f8bca 08d1dec6f4054e3613f32051d9b149d4203ce0d2
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index c1422a0..babba88 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1147,7 +1147,8 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 			/* Tell khubd to disconnect the device or
 			 * check for a new connection
 			 */
-			if (udev || (portstatus & USB_PORT_STAT_CONNECTION))
+			if (udev || (portstatus & USB_PORT_STAT_CONNECTION) ||
+			    (portstatus & USB_PORT_STAT_OVERCURRENT))
 				set_bit(port1, hub->change_bits);
 
 		} else if (portstatus & USB_PORT_STAT_ENABLE) {
--
Re: [PATCH] iio: inkern: Add of_xlate function to struct iio_dev
On 10/18/2014 01:42 PM, Jonathan Cameron wrote:
> On 02/10/14 13:32, Ivan T. Ivanov wrote:
>> When #iio-cells is greater than '0', the driver could provide
>> a custom of_xlate function that reads the *args* and returns
>> the appropriate index in registered IIO channels array.
>>
>> Add simple translation function, suitable for the most 1:1
>> mapped channels in IIO chips, and use it when driver did not
>> provide custom implementation.
>>
>> Signed-off-by: Ivan T. Ivanov
>
> Any more comments on this? Been sat a while and the discussions seems to
> have died out. As Ivan has pointed out, very similar approaches are used
> elsewhere (gpio for example).

Looks good to me:

Reviewed-by: Lars-Peter Clausen

When we initially added the DT support to IIO I was hoping that we could
get away with just using the simple and generic xlate function for all
devices. But it looks as if some more complex devices need to override
it. We should be careful about adding new driver specific xlate
implementations and make sure that each one is actually needed.

One thing we might want to consider though is, instead of adding the
xlate callback to the iio_dev struct, adding it to the iio_info struct,
since it should be the same for different device instances of the same
driver. And this is also where all the other callbacks are.

- Lars
Re: [PATCH 1/4] blkdev: add flush generation counter
Ming Lei writes:

> On Thu, Oct 16, 2014 at 5:14 PM, Dmitry Monakhov wrote:
>> Ming Lei writes:
>>
>>> On Tue, Oct 14, 2014 at 12:03 PM, Dmitry Monakhov wrote:
>>>> PROF:
>>>> *Flush machinery assumptions
>>>> C1. At any given time, only one flush shall be in progress. This
>>>>     makes double buffering sufficient.
>>>> C2. Flush is deferred if any request is executing DATA of its
>>>>     sequence. This avoids issuing separate POSTFLUSHes for requests
>>>>     which shared PREFLUSH.
>>>> C3. The second condition is ignored if there is a request which has
>>>>     waited longer than FLUSH_PENDING_TIMEOUT. This is to avoid
>>>>     starvation in the unlikely case where there are continuous stream
>>>>     of FUA (without FLUSH) requests.
>>>>
>>>> So if we will increment flush_tag counter in two places:
>>>> blk_kick_flush: the place where flush request is issued
>>>> flush_end_io : the place where flush is completed
>>>> And by using rule (C1) we can guarantee that:
>>>> if ((flush_tag & 0x1) == 1) then flush_tag is in progress
>>>> if ((flush_tag & 0x1) == 0) then (flush_tag & ~(0x1)) completed
>>>> In other words we can define it as:
>>>>
>>>> FLUSH_TAG_IDX = (flush_tag + 1) & ~0x1
>>>> FLUSH_TAG_STATE = flush_tag & 0x1 ? IN_PROGRESS : COMPLETED
>>>>
>>>> After that we can define rules for flush optimization:
>>>> We can be sure that our data was flushed only if:
>>>> 1) data's bio was completed before flush request was QUEUED
>>>>    and COMPLETED
>>>> So in terms of formulas we can write it like follows:
>>>> is_data_flushed = (blkdev->flush_tag & ~(0x1)) >
>>>>                   ((data->flush_tag + 0x1) & (~0x1))
>>>
>>> Looks you are just trying to figure out if the 'data' is flushed or not,
>>> so care to explain a bit what the optimization is?
>> Indeed I try to understand whenever data was flushed or not.
>> inodes on filesystem may update this ID inside ->end_io callback.
>> Later if user called fsync/fdatasync we may figure out that inode's
>> data was flushed already so we do not have to issue explicit barrier.
>> This helps for intensive multi-file dio-aio/fsync workloads
>> chunk servers, virt-disk images, mail server, etc.
>>
>> for (i=0;i<...;i++)
>>     pwrite(fd[i], buf, 4096, 0)
>> for (i=0;i<...;i++)
>>     fdatasync(fd[i])
>> Before this optimization we have to issue a barrier to each fdatasync
>> one by one, and send only one when optimization enabled.
>
> Could you share some performance data about the optimization?

I'm in the middle of comparing the new version of the patch; I'll post
numbers with the new version. But here are quick timings from a simulation
of a "chunk server":

time xfs_io -c "pwrite -S 0x1 -b4096 0 4096000" \
            -c "sync_range -w 0 4096000" \
            -c "sync_range -a 0 4096000" \
            -c "fdatasync" -f /mnt/{1..32} > /dev/null

| time_out | w/o patch | with patch |
|----------+-----------+------------|
| real     | 0m2.448s  | 0m1.547s   |
| user     | 0m0.003s  | 0m0.010s   |
| sys      | 0m0.287s  | 0m0.262s   |

> The flush command itself shouldn't be very expensive, but flushing
> cache to storage from drive should be. Maybe the following simple
> optimization can be done in blk-flush for your dio case:

Yes. But the filesystem is alive, so usually there are other tasks which
also do IO, and a global flag will not help that much. Imagine several
virtual images hosted on the same fs. So per-inode granularity is a good
thing to have. I am not trying to implement "the silver bullet", I just want
to optimize the things which can be optimized w/o destroying others.

>
> - only issue flush command to drive if there are writes completed since
>   last flush.
>
>>>> In order to support stacked block devices (especially linear dm)
>>>> I've implemented get_flush_idx function as queue's callback.
>>>>
>>>> *Multi-Queue scalability notes*
>>>> This implementation try to makes global optimization for all hw-queues
>>>> for a device which require read from each hw-queue like follows:
>>>> queue_for_each_hw_ctx(q, hctx, i)
>>>>     fid += ACCESS_ONCE(hctx->fq->flush_idx)
>>>
>>> I am wondering if it can work, suppose request A is submitted
>>> to hw_queue 0, and request B is submitted to hw_queue 1, then
>>> you may thought request A has been flushed out when request
>>> B is just flushed via hctx 1.
>> Correct, because we know that at the moment A and B was completed
>> at least one barrier was issued and completed (even via hwctx 2)
>> Flush has blkdev/request_queue-wide guarantee regardless to hwctx.
>> If it not the case for MQ then all filesystem are totally broken,
>> and blkdev_issue_flush MUST be changed to flush all hwctx.
>
> It is my fault, your approach is correct.
>
>
> Thanks,
>>>> In my tests I do not see any visible difference on performance on
>>>> my hardware (W2600CR: 8cores x 2 sockets, 2numa nodes).
>>>> Really fast HW may prefer to return flush_id for a single hw-queue
>>>> in order to do so we have to encode flush_id with hw_queue_id
>>
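The parity rule quoted in this thread can be checked with a small user-space model. The sketch below is not the patch itself: `sketch_flush_dev` and the helper names are hypothetical, and the only things taken from the message are the two increment points (flush issued, flush completed) and the `is_data_flushed` formula.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device-wide flush generation counter. */
struct sketch_flush_dev {
	uint64_t flush_tag;	/* odd: a flush is in flight, even: idle */
};

/* Models the increment at the point where a flush is issued. */
static inline void sketch_flush_issue(struct sketch_flush_dev *d)
{
	d->flush_tag++;
}

/* Models the increment at the point where that flush completes. */
static inline void sketch_flush_complete(struct sketch_flush_dev *d)
{
	d->flush_tag++;
}

/* A write records the counter value at the moment it completes. */
static inline uint64_t sketch_data_completed(const struct sketch_flush_dev *d)
{
	return d->flush_tag;
}

/*
 * The write is known to be durable only once a flush that was issued
 * after the write completed has itself completed.
 */
static inline bool sketch_is_data_flushed(const struct sketch_flush_dev *d,
					  uint64_t data_tag)
{
	return (d->flush_tag & ~UINT64_C(1)) >
	       ((data_tag + 1) & ~UINT64_C(1));
}
```

Note the odd case: a write that completes while a flush is already in flight records an odd tag, and the formula then requires the following flush to finish, since the in-flight one may not cover that write.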
Re: [PATCH] io: accel: kxcjk-1013: Fix iio_event_spec direction
On 10/10/14 15:53, Daniel Baluta wrote:
> Because IIO_EV_DIR_* are not bitmasks but enums,
> IIO_EV_DIR_RISING | IIO_EV_DIR_FALLING is not equal
> with IIO_EV_DIR_EITHER.
>
> This could lead to potential misformatted sysfs attributes
> like:
> * in_accel_x_thresh_(null)_en
> * in_accel_x_thresh_(null)_period
> * in_accel_x_thresh_(null)_value
>
> or even memory corruption.
>
> Fixes: b4b491c083 (iio: accel: kxcjk-1013: Support threshold)
> Signed-off-by: Daniel Baluta

Good catch. I'll have to hold this for a little until Greg takes my
last pull request and I can bring the fixes branch up past the merge window.

> ---
>  drivers/iio/accel/kxcjk-1013.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iio/accel/kxcjk-1013.c b/drivers/iio/accel/kxcjk-1013.c
> index 98909a9..a23e58c 100644
> --- a/drivers/iio/accel/kxcjk-1013.c
> +++ b/drivers/iio/accel/kxcjk-1013.c
> @@ -894,7 +894,7 @@ static const struct attribute_group kxcjk1013_attrs_group = {
>
>  static const struct iio_event_spec kxcjk1013_event = {
>  		.type = IIO_EV_TYPE_THRESH,
> -		.dir = IIO_EV_DIR_RISING | IIO_EV_DIR_FALLING,
> +		.dir = IIO_EV_DIR_EITHER,
>  		.mask_separate = BIT(IIO_EV_INFO_VALUE) |
>  				 BIT(IIO_EV_INFO_ENABLE) |
>  				 BIT(IIO_EV_INFO_PERIOD)
>
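Why OR-ing two enumerators goes wrong here can be shown with a stand-alone sketch. The `SKETCH_*` enum and name table below are hypothetical; they assume only what the commit message says (directions are plain enumerators, not bit flags) plus the assumption that they are numbered 0, 1, 2 in declaration order and used to index a table of names.

```c
#include <stddef.h>

/* Hypothetical mirror of a direction enum: sequential values, not bits. */
enum sketch_ev_dir {
	SKETCH_DIR_EITHER,	/* 0 */
	SKETCH_DIR_RISING,	/* 1 */
	SKETCH_DIR_FALLING,	/* 2 */
	SKETCH_DIR_COUNT
};

static const char *const sketch_dir_text[SKETCH_DIR_COUNT] = {
	"either", "rising", "falling",
};

/*
 * Bounds-checked lookup. RISING | FALLING evaluates to 3, which is past
 * the end of the table; an unchecked lookup would read out of bounds,
 * which would fit the "(null)" names and the corruption risk described.
 */
static inline const char *sketch_dir_name(unsigned int dir)
{
	return dir < SKETCH_DIR_COUNT ? sketch_dir_text[dir] : NULL;
}
```

With these values, `SKETCH_DIR_RISING | SKETCH_DIR_FALLING` is neither `SKETCH_DIR_EITHER` nor any valid table index, which is the mismatch the one-line fix removes.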
Re: [PATCH] iio: inkern: Add of_xlate function to struct iio_dev
On 18/10/14 12:50, Lars-Peter Clausen wrote: > On 10/18/2014 01:42 PM, Jonathan Cameron wrote: >> On 02/10/14 13:32, Ivan T. Ivanov wrote: >>> When #iio-cells is greater than '0', the driver could provide >>> a custom of_xlate function that reads the *args* and returns >>> the appropriate index in registered IIO channels array. >>> >>> Add simple translation function, suitable for the most 1:1 >>> mapped channels in IIO chips, and use it when driver did not >>> provide custom implementation. >>> >>> Signed-off-by: Ivan T. Ivanov >> Any more comments on this? Been sat a while and the discussions seems >> to have died out. >> >> As Ivan has pointed out, very similar approaches are used >> elsewhere (gpio for example). > > Looks good to me: > > Reviewed-by: Lars-Peter Clausen > > When we initially added the DT support to IIO I was hoping that we can get > away > with just using the simple and generic xlate function for all devices. But it > looks as if some more complex devices need to overwrite it. We should be > careful > about adding new driver specific xlate implementations and make sure that it > is > actually needed. > > One thing we might want to consider though is instead of adding the xlate > callback to the iio_dev struct add it to the iio_info struct since it should > be > the same for different device instances of the same driver. And this is also > where all the other callbacks are. Good point - would definitely prefer that. J > > - Lars -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/3] input: stmpe: enforce device tree only mode
The STMPE keypad controller is only used with device tree configured systems, so force the configuration to come from device tree only, and now actually get the rows and cols from the device tree too. Signed-off-by: Linus Walleij --- .../devicetree/bindings/input/stmpe-keypad.txt | 2 + drivers/input/keyboard/Kconfig | 1 + drivers/input/keyboard/stmpe-keypad.c | 104 + include/linux/mfd/stmpe.h | 20 4 files changed, 48 insertions(+), 79 deletions(-) diff --git a/Documentation/devicetree/bindings/input/stmpe-keypad.txt b/Documentation/devicetree/bindings/input/stmpe-keypad.txt index 1b97222e8a0b..12bb771d66d4 100644 --- a/Documentation/devicetree/bindings/input/stmpe-keypad.txt +++ b/Documentation/devicetree/bindings/input/stmpe-keypad.txt @@ -8,6 +8,8 @@ Optional properties: - debounce-interval: Debouncing interval time in milliseconds - st,scan-count: Scanning cycles elapsed before key data is updated - st,no-autorepeat : If specified device will not autorepeat + - keypad,num-rows : See ./matrix-keymap.txt + - keypad,num-columns : See ./matrix-keymap.txt Example: diff --git a/drivers/input/keyboard/Kconfig b/drivers/input/keyboard/Kconfig index a3958c63d7d5..753d61c0a3a9 100644 --- a/drivers/input/keyboard/Kconfig +++ b/drivers/input/keyboard/Kconfig @@ -559,6 +559,7 @@ config KEYBOARD_SH_KEYSC config KEYBOARD_STMPE tristate "STMPE keypad support" depends on MFD_STMPE + depends on OF select INPUT_MATRIXKMAP help Say Y here if you want to use the keypad controller on STMPE I/O diff --git a/drivers/input/keyboard/stmpe-keypad.c b/drivers/input/keyboard/stmpe-keypad.c index ef5e67fb567e..d46391f48310 100644 --- a/drivers/input/keyboard/stmpe-keypad.c +++ b/drivers/input/keyboard/stmpe-keypad.c @@ -45,7 +45,7 @@ #define STMPE_KEYPAD_MAX_ROWS 8 #define STMPE_KEYPAD_MAX_COLS 8 #define STMPE_KEYPAD_ROW_SHIFT 3 -#define STMPE_KEYPAD_KEYMAP_SIZE \ +#define STMPE_KEYPAD_KEYMAP_MAX_SIZE \ (STMPE_KEYPAD_MAX_ROWS * STMPE_KEYPAD_MAX_COLS) /** @@ -99,16 +99,30 @@ static const 
struct stmpe_keypad_variant stmpe_keypad_variants[] = { }, }; +/** + * struct stmpe_keypad - STMPE keypad state container + * @stmpe: pointer to parent STMPE device + * @input: spawned input device + * @variant: STMPE variant + * @debounce_ms: debounce interval, in ms. Maximum is + * %STMPE_KEYPAD_MAX_DEBOUNCE. + * @scan_count: number of key scanning cycles to confirm key data. + * Maximum is %STMPE_KEYPAD_MAX_SCAN_COUNT. + * @no_autorepeat: disable key autorepeat + * @rows: bitmask for the rows + * @cols: bitmask for the columns + * @keymap: the keymap + */ struct stmpe_keypad { struct stmpe *stmpe; struct input_dev *input; const struct stmpe_keypad_variant *variant; - const struct stmpe_keypad_platform_data *plat; - + unsigned int debounce_ms; + unsigned int scan_count; + bool no_autorepeat; unsigned int rows; unsigned int cols; - - unsigned short keymap[STMPE_KEYPAD_KEYMAP_SIZE]; + unsigned short keymap[STMPE_KEYPAD_KEYMAP_MAX_SIZE]; }; static int stmpe_keypad_read_data(struct stmpe_keypad *keypad, u8 *data) @@ -208,15 +222,14 @@ static int stmpe_keypad_altfunc_init(struct stmpe_keypad *keypad) static int stmpe_keypad_chip_init(struct stmpe_keypad *keypad) { - const struct stmpe_keypad_platform_data *plat = keypad->plat; const struct stmpe_keypad_variant *variant = keypad->variant; struct stmpe *stmpe = keypad->stmpe; int ret; - if (plat->debounce_ms > STMPE_KEYPAD_MAX_DEBOUNCE) + if (keypad->debounce_ms > STMPE_KEYPAD_MAX_DEBOUNCE) return -EINVAL; - if (plat->scan_count > STMPE_KEYPAD_MAX_SCAN_COUNT) + if (keypad->scan_count > STMPE_KEYPAD_MAX_SCAN_COUNT) return -EINVAL; ret = stmpe_enable(stmpe, STMPE_BLOCK_KEYPAD); @@ -245,7 +258,7 @@ static int stmpe_keypad_chip_init(struct stmpe_keypad *keypad) ret = stmpe_set_bits(stmpe, STMPE_KPC_CTRL_MSB, STMPE_KPC_CTRL_MSB_SCAN_COUNT, -plat->scan_count << 4); +keypad->scan_count << 4); if (ret < 0) return ret; @@ -253,17 +266,18 @@ static int stmpe_keypad_chip_init(struct stmpe_keypad *keypad) STMPE_KPC_CTRL_LSB_SCAN | 
STMPE_KPC_CTRL_LSB_DEBOUNCE, STMPE_KPC_CTRL_LSB_SCAN | - (plat->debounce_ms << 1)); + (keypad->debounce_ms << 1)); } -static void stmpe_keypad_fill_used_pins(struct stmpe_keypad *keypad) +static void stmpe_keypad_fill_used_pins(struct stmpe_keypad *keypad,
[PATCH 1/3] mfd: stmpe: add pull up/down register offsets for STMPE
This adds the register offsets for pull up/down for the STMPE 1601, 1801 and 24xx expanders. This is used to bias GPIO lines and keypad lines. Signed-off-by: Linus Walleij --- Hi Sam, Lee: I think you should just ACK this so Dmitry can take the patch series through the input tree, where the registers need to be used to enable the keypad on the STMPE2401. --- drivers/mfd/stmpe.c | 4 drivers/mfd/stmpe.h | 3 +++ include/linux/mfd/stmpe.h | 2 ++ 3 files changed, 9 insertions(+) diff --git a/drivers/mfd/stmpe.c b/drivers/mfd/stmpe.c index 3004d5ba0b82..497505bad4c4 100644 --- a/drivers/mfd/stmpe.c +++ b/drivers/mfd/stmpe.c @@ -547,6 +547,7 @@ static const u8 stmpe1601_regs[] = { [STMPE_IDX_GPDR_LSB]= STMPE1601_REG_GPIO_SET_DIR_LSB, [STMPE_IDX_GPRER_LSB] = STMPE1601_REG_GPIO_RE_LSB, [STMPE_IDX_GPFER_LSB] = STMPE1601_REG_GPIO_FE_LSB, + [STMPE_IDX_GPPUR_LSB] = STMPE1601_REG_GPIO_PU_LSB, [STMPE_IDX_GPAFR_U_MSB] = STMPE1601_REG_GPIO_AF_U_MSB, [STMPE_IDX_IEGPIOR_LSB] = STMPE1601_REG_INT_EN_GPIO_MASK_LSB, [STMPE_IDX_ISGPIOR_MSB] = STMPE1601_REG_INT_STA_GPIO_MSB, @@ -695,6 +696,7 @@ static const u8 stmpe1801_regs[] = { [STMPE_IDX_GPDR_LSB]= STMPE1801_REG_GPIO_SET_DIR_LOW, [STMPE_IDX_GPRER_LSB] = STMPE1801_REG_GPIO_RE_LOW, [STMPE_IDX_GPFER_LSB] = STMPE1801_REG_GPIO_FE_LOW, + [STMPE_IDX_GPPUR_LSB] = STMPE1801_REG_GPIO_PULL_UP_LOW, [STMPE_IDX_IEGPIOR_LSB] = STMPE1801_REG_INT_EN_GPIO_MASK_LOW, [STMPE_IDX_ISGPIOR_LSB] = STMPE1801_REG_INT_STA_GPIO_LOW, }; @@ -778,6 +780,8 @@ static const u8 stmpe24xx_regs[] = { [STMPE_IDX_GPDR_LSB]= STMPE24XX_REG_GPDR_LSB, [STMPE_IDX_GPRER_LSB] = STMPE24XX_REG_GPRER_LSB, [STMPE_IDX_GPFER_LSB] = STMPE24XX_REG_GPFER_LSB, + [STMPE_IDX_GPPUR_LSB] = STMPE24XX_REG_GPPUR_LSB, + [STMPE_IDX_GPPDR_LSB] = STMPE24XX_REG_GPPDR_LSB, [STMPE_IDX_GPAFR_U_MSB] = STMPE24XX_REG_GPAFR_U_MSB, [STMPE_IDX_IEGPIOR_LSB] = STMPE24XX_REG_IEGPIOR_LSB, [STMPE_IDX_ISGPIOR_MSB] = STMPE24XX_REG_ISGPIOR_MSB, diff --git a/drivers/mfd/stmpe.h b/drivers/mfd/stmpe.h index 
bee0abf82040..84adb46b3e2f 100644 --- a/drivers/mfd/stmpe.h +++ b/drivers/mfd/stmpe.h @@ -188,6 +188,7 @@ int stmpe_remove(struct stmpe *stmpe); #define STMPE1601_REG_GPIO_ED_MSB 0x8A #define STMPE1601_REG_GPIO_RE_LSB 0x8D #define STMPE1601_REG_GPIO_FE_LSB 0x8F +#define STMPE1601_REG_GPIO_PU_LSB 0x91 #define STMPE1601_REG_GPIO_AF_U_MSB0x92 #define STMPE1601_SYS_CTRL_ENABLE_GPIO (1 << 3) @@ -276,6 +277,8 @@ int stmpe_remove(struct stmpe *stmpe); #define STMPE24XX_REG_GPEDR_MSB0x8C #define STMPE24XX_REG_GPRER_LSB0x91 #define STMPE24XX_REG_GPFER_LSB0x94 +#define STMPE24XX_REG_GPPUR_LSB0x97 +#define STMPE24XX_REG_GPPDR_LSB0x9a #define STMPE24XX_REG_GPAFR_U_MSB 0x9B #define STMPE24XX_SYS_CTRL_ENABLE_GPIO (1 << 3) diff --git a/include/linux/mfd/stmpe.h b/include/linux/mfd/stmpe.h index af9e1b07a630..976e1a390177 100644 --- a/include/linux/mfd/stmpe.h +++ b/include/linux/mfd/stmpe.h @@ -50,6 +50,8 @@ enum { STMPE_IDX_GPEDR_MSB, STMPE_IDX_GPRER_LSB, STMPE_IDX_GPFER_LSB, + STMPE_IDX_GPPUR_LSB, + STMPE_IDX_GPPDR_LSB, STMPE_IDX_GPAFR_U_MSB, STMPE_IDX_IEGPIOR_LSB, STMPE_IDX_ISGPIOR_LSB, -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] pinctrl: nomadik: amend MMC/SD pins
There is a missing MMC/SD pin for MCDATDIR2 which is routed as alt B, add it to the MMC/SD pin group and functions. Signed-off-by: Linus Walleij --- drivers/pinctrl/nomadik/pinctrl-nomadik-stn8815.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/pinctrl/nomadik/pinctrl-nomadik-stn8815.c b/drivers/pinctrl/nomadik/pinctrl-nomadik-stn8815.c index ed39dcafd4f8..2cd71470f270 100644 --- a/drivers/pinctrl/nomadik/pinctrl-nomadik-stn8815.c +++ b/drivers/pinctrl/nomadik/pinctrl-nomadik-stn8815.c @@ -291,6 +291,7 @@ static const unsigned u0_a_1_pins[] = { STN8815_PIN_B4, STN8815_PIN_D5, static const unsigned mmcsd_a_1_pins[] = { STN8815_PIN_B10, STN8815_PIN_A10, STN8815_PIN_C11, STN8815_PIN_B11, STN8815_PIN_A11, STN8815_PIN_C12, STN8815_PIN_B12, STN8815_PIN_A12, STN8815_PIN_C13, STN8815_PIN_C15 }; +static const unsigned mmcsd_b_1_pins[] = { STN8815_PIN_D15 }; static const unsigned u1_a_1_pins[] = { STN8815_PIN_M2, STN8815_PIN_L1, STN8815_PIN_F3, STN8815_PIN_F2 }; static const unsigned i2c1_a_1_pins[] = { STN8815_PIN_L4, STN8815_PIN_L3 }; @@ -305,6 +306,7 @@ static const unsigned i2cusb_b_1_pins[] = { STN8815_PIN_C21, STN8815_PIN_C20 }; static const struct nmk_pingroup nmk_stn8815_groups[] = { STN8815_PIN_GROUP(u0_a_1, NMK_GPIO_ALT_A), STN8815_PIN_GROUP(mmcsd_a_1, NMK_GPIO_ALT_A), + STN8815_PIN_GROUP(mmcsd_b_1, NMK_GPIO_ALT_B), STN8815_PIN_GROUP(u1_a_1, NMK_GPIO_ALT_A), STN8815_PIN_GROUP(i2c1_a_1, NMK_GPIO_ALT_A), STN8815_PIN_GROUP(i2c0_a_1, NMK_GPIO_ALT_A), @@ -317,7 +319,7 @@ static const struct nmk_pingroup nmk_stn8815_groups[] = { static const char * const a##_groups[] = { b }; STN8815_FUNC_GROUPS(u0, "u0_a_1"); -STN8815_FUNC_GROUPS(mmcsd, "mmcsd_a_1"); +STN8815_FUNC_GROUPS(mmcsd, "mmcsd_a_1", "mmcsd_b_1"); STN8815_FUNC_GROUPS(u1, "u1_a_1", "u1_b_1"); STN8815_FUNC_GROUPS(i2c1, "i2c1_a_1"); STN8815_FUNC_GROUPS(i2c0, "i2c0_a_1"); -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to 
majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/3] input: stmpe: bias keypad columns properly
All keypad column pins used as inputs should be pulled up on the STMPE24xx, but this is not done by the current driver. Add some logic that will do this properly. The STMPE1601 also has a keypad controller, but explicitly does *NOT* require you to set up any pull-ups. Signed-off-by: Linus Walleij --- drivers/input/keyboard/stmpe-keypad.c | 37 +-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/input/keyboard/stmpe-keypad.c b/drivers/input/keyboard/stmpe-keypad.c index d46391f48310..fe6e3f22eed7 100644 --- a/drivers/input/keyboard/stmpe-keypad.c +++ b/drivers/input/keyboard/stmpe-keypad.c @@ -52,6 +52,7 @@ * struct stmpe_keypad_variant - model-specific attributes * @auto_increment: whether the KPC_DATA_BYTE register address * auto-increments on multiple read + * @set_pullup: whether the pins need to have their pull-ups set * @num_data: number of data bytes * @num_normal_data: number of normal keys' data bytes * @max_cols: maximum number of columns supported @@ -61,6 +62,7 @@ */ struct stmpe_keypad_variant { boolauto_increment; + boolset_pullup; int num_data; int num_normal_data; int max_cols; @@ -81,6 +83,7 @@ static const struct stmpe_keypad_variant stmpe_keypad_variants[] = { }, [STMPE2401] = { .auto_increment = false, + .set_pullup = true, .num_data = 3, .num_normal_data= 2, .max_cols = 8, @@ -90,6 +93,7 @@ static const struct stmpe_keypad_variant stmpe_keypad_variants[] = { }, [STMPE2403] = { .auto_increment = true, + .set_pullup = true, .num_data = 5, .num_normal_data= 3, .max_cols = 8, @@ -185,7 +189,10 @@ static int stmpe_keypad_altfunc_init(struct stmpe_keypad *keypad) unsigned int col_gpios = variant->col_gpios; unsigned int row_gpios = variant->row_gpios; struct stmpe *stmpe = keypad->stmpe; + u8 pureg = stmpe->regs[STMPE_IDX_GPPUR_LSB]; unsigned int pins = 0; + unsigned int pu_pins = 0; + int ret; int i; /* @@ -202,8 +209,10 @@ static int stmpe_keypad_altfunc_init(struct stmpe_keypad *keypad) for (i = 0; i < variant->max_cols; i++) { 
int num = __ffs(col_gpios); - if (keypad->cols & (1 << i)) + if (keypad->cols & (1 << i)) { pins |= 1 << num; + pu_pins |= 1 << num; + } col_gpios &= ~(1 << num); } @@ -217,7 +226,31 @@ static int stmpe_keypad_altfunc_init(struct stmpe_keypad *keypad) row_gpios &= ~(1 << num); } - return stmpe_set_altfunc(stmpe, pins, STMPE_BLOCK_KEYPAD); + ret = stmpe_set_altfunc(stmpe, pins, STMPE_BLOCK_KEYPAD); + if (ret) + return ret; + + /* +* On STMPE24xx, set pin bias to pull-up on all keypad input +* pins (columns), this incidentally happen to be maximum 8 pins +* and placed at GPIO0-7 so only the LSB of the pull up register +* ever needs to be written. +*/ + if (variant->set_pullup) { + u8 val; + + ret = stmpe_reg_read(stmpe, pureg); + if (ret) + return ret; + + /* Do not touch unused pins, may be used for GPIO */ + val = ret & ~pu_pins; + val |= pu_pins; + + ret = stmpe_reg_write(stmpe, pureg, val); + } + + return 0; } static int stmpe_keypad_chip_init(struct stmpe_keypad *keypad) -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
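The column-to-pin bookkeeping in this patch can be followed with a small stand-alone sketch. The helper names are hypothetical and the register access is left out entirely; only the mapping loop (logical column i to the i-th set bit of the variant's column GPIO mask) and the "only set bits, never clear others" update are modelled.

```c
#include <stdint.h>

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static inline unsigned int sketch_lowest_bit(unsigned int mask)
{
	unsigned int n = 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		n++;
	}
	return n;
}

/*
 * Walk the GPIOs usable as columns (col_gpios) in ascending order and
 * collect the ones whose logical column is enabled in cols.
 */
static inline uint8_t sketch_column_pullups(unsigned int col_gpios,
					    unsigned int cols, int max_cols)
{
	unsigned int pu_pins = 0;
	int i;

	for (i = 0; i < max_cols && col_gpios; i++) {
		unsigned int num = sketch_lowest_bit(col_gpios);

		if (cols & (1u << i))
			pu_pins |= 1u << num;
		col_gpios &= ~(1u << num);
	}
	return (uint8_t)pu_pins;	/* assumes columns sit on GPIO0-7 */
}

/* Read-modify-write step: add pull-ups, leave unrelated pins untouched. */
static inline uint8_t sketch_apply_pullups(uint8_t current, uint8_t pu_pins)
{
	return (uint8_t)(current | pu_pins);
}
```

For example, with columns wired to GPIO4-7 (`col_gpios = 0xf0`) and the first two logical columns in use (`cols = 0x03`), the pull-up mask is `0x30`.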
[PULL REQUEST] i2c for 3.18
Linus, Highlights from the I2C subsystem for 3.18: * new drivers for Axxia AM55xx, and Hisilicon hix5hd2 SoC. * designware driver gained AMD support, exynos gained exynos7 support The rest is usual driver stuff. Hopefully no lowlights this time. Please pull. Thanks, Wolfram The following changes since commit fe82dcec644244676d55a1384c958d5f67979adb: Linux 3.17-rc7 (2014-09-28 14:29:07 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git i2c/for-next for you to fetch changes up to 3e27a8445c21f8056517f188303827450590d868: i2c: i801: Add Device IDs for Intel Sunrise Point PCH (2014-10-16 09:16:22 +0200) Anders Berg (1): i2c: axxia: Add I2C driver for AXM55xx Carl Peng (1): i2c: designware: Add support for AMD I2C controller Doug Anderson (2): i2c: rk3x: Remove unlikely() annotations i2c: cros_ec: Remove EC_I2C_FLAG_10BIT Fabio Estevam (1): i2c-imx: Disable the clock on probe failure Fan Du (1): i2c: ismt: Use minimum descriptor size Haibo Chen (1): i2c: imx: Add arbitration lost check Janusz Użycki (1): i2c: mxs: detect No Slave Ack on SELECT in PIO mode Mika Westerberg (2): i2c: designware: Default to fast mode in case of ACPI i2c: designware: Rework probe() to get clock a bit later Naveen Krishna Ch (1): i2c: exynos: add support for HSI2C module on Exynos7 Romain Baeriswyl (1): i2c: designware: add support of I2C standard mode Sergei Shtylyov (3): i2c: rcar: simplify check for last message i2c: rcar: make rcar_i2c_prepare_msg() *void* i2c: rcar: check for no IRQ in rcar_i2c_irq() Sjoerd Simons (1): i2c: cros-ec-tunnel: Add of match table Tan, Raymond (1): i2c: designware: add support of platform data to set I2C mode Wei Yan (1): i2c: hix5hd2: add i2c controller driver Wolfram Sang (1): i2c: rcar: remove sign-compare flaw james.d.rals...@intel.com (1): i2c: i801: Add Device IDs for Intel Sunrise Point PCH .../devicetree/bindings/i2c/i2c-axxia.txt | 30 ++ .../devicetree/bindings/i2c/i2c-exynos5.txt| 2 + 
.../devicetree/bindings/i2c/i2c-hix5hd2.txt| 24 + Documentation/i2c/busses/i2c-i801 | 1 + drivers/i2c/busses/Kconfig | 25 +- drivers/i2c/busses/Makefile| 2 + drivers/i2c/busses/i2c-axxia.c | 559 + drivers/i2c/busses/i2c-cros-ec-tunnel.c| 15 +- drivers/i2c/busses/i2c-designware-platdrv.c| 96 +++- drivers/i2c/busses/i2c-exynos5.c | 71 ++- drivers/i2c/busses/i2c-hix5hd2.c | 557 drivers/i2c/busses/i2c-i801.c | 3 + drivers/i2c/busses/i2c-imx.c | 16 +- drivers/i2c/busses/i2c-ismt.c | 2 +- drivers/i2c/busses/i2c-mxs.c | 3 + drivers/i2c/busses/i2c-rcar.c | 21 +- drivers/i2c/busses/i2c-rk3x.c | 4 +- include/linux/mfd/cros_ec_commands.h | 3 - include/linux/platform_data/i2c-designware.h | 21 + 19 files changed, 1404 insertions(+), 51 deletions(-) create mode 100644 Documentation/devicetree/bindings/i2c/i2c-axxia.txt create mode 100644 Documentation/devicetree/bindings/i2c/i2c-hix5hd2.txt create mode 100644 drivers/i2c/busses/i2c-axxia.c create mode 100644 drivers/i2c/busses/i2c-hix5hd2.c create mode 100644 include/linux/platform_data/i2c-designware.h signature.asc Description: Digital signature
Re: [PATCH] mm/zbud: init user ops only when it is needed
On Wed, Oct 15, 2014 at 6:00 AM, Heesub Shin wrote:
> When zbud is initialized through the zpool wrapper, pool->ops which
> points to user-defined operations is always set regardless of whether it
> is specified from the upper layer. This causes zbud_reclaim_page() to
> iterate its loop for evicting pool pages out without any gain.
>
> This patch sets the user-defined ops only when it is needed, so that
> zbud_reclaim_page() can bail out the reclamation loop earlier if there
> is no user-defined operations specified.

Though the only current user (zswap) always passes an ops param, other
future users may not and this is the right way to handle it. thanks!

>
> Signed-off-by: Heesub Shin

Acked-by: Dan Streetman

> ---
>  mm/zbud.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/zbud.c b/mm/zbud.c
> index ecf1dbe..db8de74 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -132,7 +132,7 @@ static struct zbud_ops zbud_zpool_ops = {
>
>  static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
>  {
> -	return zbud_create_pool(gfp, &zbud_zpool_ops);
> +	return zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL);
>  }
>
>  static void zbud_zpool_destroy(void *pool)
> --
> 1.9.1
>
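The effect of passing NULL instead of the wrapper's ops can be sketched with a toy pool. Everything here is hypothetical (`sketch_pool`, `sketch_reclaim`, the error codes chosen); it only models the idea from the commit message that a reclaim path can bail out before its retry loop when no eviction callback was supplied by the upper layer.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_pool_ops {
	int (*evict)(void *pool, unsigned long handle);
};

struct sketch_pool {
	const struct sketch_pool_ops *ops;	/* NULL: nobody can evict */
	unsigned int lru_len;			/* pages on the LRU */
};

/* Wrapper-level callback that would forward to the upper layer. */
static int sketch_wrapper_evict(void *pool, unsigned long handle)
{
	(void)pool;
	(void)handle;
	return 0;	/* pretend the upper layer evicted the page */
}

static const struct sketch_pool_ops sketch_wrapper_ops = {
	.evict = sketch_wrapper_evict,
};

/* Install the wrapper ops only if the upper layer supplied its own ops. */
static inline void sketch_pool_init(struct sketch_pool *p,
				    const struct sketch_pool_ops *upper_ops)
{
	p->ops = upper_ops ? &sketch_wrapper_ops : NULL;
	p->lru_len = 0;
}

static inline int sketch_reclaim(struct sketch_pool *p, unsigned int retries)
{
	unsigned int i;

	/* Early exit: without an evict callback the loop cannot succeed. */
	if (!p->ops || !p->ops->evict || p->lru_len == 0 || retries == 0)
		return -EINVAL;

	for (i = 0; i < retries; i++) {
		/* The handle value is a placeholder in this sketch. */
		if (p->ops->evict(p, i) == 0) {
			p->lru_len--;
			return 0;
		}
	}
	return -EAGAIN;
}
```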
Re: [PATCH v4 00/13] Add ACPI _DSD and unified device properties support
On Thu, 16 Oct 2014 16:55:56 +0200 , David Woodhouse wrote: > On Wed, 2014-10-15 at 17:43 +0200, Darren Hart wrote: > > > > So my objection here is that by keeping the of_* terms in the driver we > > are required to include of, although it does safely convert to returning > > NULL if !CONFIG_OF I suppose. > > New version removes everything but the of_match_id bits which we need to > match ACPI devices too. Perhaps they ought to be renamed, but I'm not > sure it's worth it. > > This also removes the call to platform_get_resource(IORESOURCE_MEM) and > fall back to platform_get_resource(IORESOURCE_IO) as discussed IRL with > Rafael. I'm not sure it's much of an improvement, mind you :) > > Still untested. I think it's OK to switch to platform_get_irq() and then > drop the irq_dispose_mapping() call, right? The platform_device takes > care of all of that for us? > > diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig > index 26cec64..be95a4c 100644 > --- a/drivers/tty/serial/Kconfig > +++ b/drivers/tty/serial/Kconfig > @@ -1094,14 +1094,14 @@ config SERIAL_NETX_CONSOLE > you can make it the console by answering Y to this option. > > config SERIAL_OF_PLATFORM > - tristate "Serial port on Open Firmware platform bus" > - depends on OF > + tristate "Serial port on firmware platform bus" > + depends on OF || ACPI > depends on SERIAL_8250 || SERIAL_OF_PLATFORM_NWPSERIAL > help > - If you have a PowerPC based system that has serial ports > - on a platform specific bus, you should enable this option. > - Currently, only 8250 compatible ports are supported, but > - others can easily be added. > + If you have a system which advertises its serial ports through > + devicetree or ACPI, you should enable this option. Currently > + only 8250 compatible and NWP ports and are supported, but others > + can easily be added. 
> > config SERIAL_OMAP > tristate "OMAP serial port support" > diff --git a/drivers/tty/serial/of_serial.c b/drivers/tty/serial/of_serial.c > index 68d4455..cc6c99b 100644 > --- a/drivers/tty/serial/of_serial.c > +++ b/drivers/tty/serial/of_serial.c > @@ -14,8 +14,7 @@ > #include > #include > #include > -#include > -#include > +#include > #include > #include > #include > @@ -53,22 +52,22 @@ static inline void tegra_serial_handle_break(struct > uart_port *port) > /* > * Fill a struct uart_port for a given device node > */ > -static int of_platform_serial_setup(struct platform_device *ofdev, > +static int of_platform_serial_setup(struct platform_device *pdev, > int type, struct uart_port *port, > struct of_serial_info *info) > { > - struct resource resource; > - struct device_node *np = ofdev->dev.of_node; > u32 clk, spd, prop; > - int ret; > + int iotype = -1; > + u32 res_start; > + int ret, i; > > memset(port, 0, sizeof *port); > - if (of_property_read_u32(np, "clock-frequency", &clk)) { > + if (device_property_read_u32(&pdev->dev, "clock-frequency", &clk)) { > > /* Get clk rate through clk driver if present */ > - info->clk = clk_get(&ofdev->dev, NULL); > + info->clk = clk_get(&pdev->dev, NULL); > if (IS_ERR(info->clk)) { > - dev_warn(&ofdev->dev, > + dev_warn(&pdev->dev, > "clk or clock-frequency not defined\n"); > return PTR_ERR(info->clk); > } > @@ -77,57 +76,63 @@ static int of_platform_serial_setup(struct > platform_device *ofdev, > clk = clk_get_rate(info->clk); > } > /* If current-speed was set, then try not to change it. 
*/ > - if (of_property_read_u32(np, "current-speed", &spd) == 0) > + if (device_property_read_u32(&pdev->dev, "current-speed", &spd) == 0) > port->custom_divisor = clk / (16 * spd); > > - ret = of_address_to_resource(np, 0, &resource); > - if (ret) { > - dev_warn(&ofdev->dev, "invalid address\n"); > + /* Check for shifted address mapping */ > + if (device_property_read_u32(&pdev->dev, "reg-offset", &prop) != 0) > + prop = 0; > + > + for (i = 0; iotype == -1 && i < pdev->num_resources; i++) { > + struct resource *resource = &pdev->resource[i]; > + if (resource_type(resource) == IORESOURCE_MEM) { > + iotype = UPIO_MEM; > + port->mapbase = res_start + prop; > + } else if (resource_type(resource) == IORESOURCE_IO) { > + iotype = UPIO_PORT; > + port->iobase = res_start + prop; > + } > + > + res_start = resource->start; > + } > + if (iotype == -1) { > + dev_warn(&pdev->dev, "invalid address\n"); > goto out; > } > > spin_lock_in
Re: [PATCH v5 00/12] Add ACPI _DSD and unified device properties support
On Sat, 18 Oct 2014 00:50 +0200, "Rafael J. Wysocki" wrote:
> On Friday, October 17, 2014 08:04:52 PM Arnd Bergmann wrote:
> >
> > On October 17, 2014 2:01:33 PM CEST, "Rafael J. Wysocki" wrote:
> > >Hi Everyone,
> > >
> > >Having had a couple of chats with Grant and Arnd during LinuxCon EU/LPC,
> > >we now have version 5 taking all feedback into account (hopefully).
> >
> > Awesome, that was really fast. I'm currently on my way home in
> > the train, replying from my phone, but it looks good now. I'll have a more
> > detailed look next week but I'm definitely happy to see this go in (to next
> > and 3.19) now, any details we still find can be fixed on top.
> >
> > > In short, if
> > >we are passed a struct fwnode_handle pointer, we can get from it to the
> > >appropriate device node pointer (either struct acpi_device or struct
> > >device_node) using container_of() after we've checked the type. This is
> > >needed for the code that needs to access child nodes of a device in case
> > >when they don't have struct device representations (whatever the reason).
> > >This has been suggested by Grant and pretty much everyone involved agrees
> > >that it's better than the alternatives presented so far.
> >
> > Yes, it's nice enough that I now take back all the objections I had for the
> > child accessor API.
>
> Cool, thanks!
>
> I've just refreshed the device-properties branch of the linux-pm.git tree
> and it contains the current material now (including a couple of build
> fixes reported by the autobuild robot in patches [02/12] and [09/12]).
>
> My plan is to merge this into the linux-next branch when 3.18-rc1 is out.

Wait for my ack please, but I should be able to review it this weekend
or on Monday. I'm actually reviewing now, but I'm on a plane near the
end of the flight. As soon as I get on the ground SIGFAMILY will
interrupt me for a bit. :-)

g.
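The "check the type, then container_of()" pattern described in the quoted cover letter can be sketched in user-space C. The struct and enum names below are hypothetical stand-ins, not the kernel's definitions; the macro is a simplified form of the usual `container_of` built on `offsetof`.

```c
#include <stddef.h>

/* Simplified container_of: recover the enclosing struct from a member. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum sketch_fw_type {
	SKETCH_FW_INVALID,
	SKETCH_FW_OF,
	SKETCH_FW_ACPI,
};

/* Firmware-agnostic handle embedded in each firmware-specific node. */
struct sketch_fw_handle {
	enum sketch_fw_type type;
};

struct sketch_of_node {
	const char *name;
	struct sketch_fw_handle fwnode;
};

struct sketch_acpi_dev {
	int handle_id;
	struct sketch_fw_handle fwnode;
};

/* Only convert when the type tag says the handle lives in an OF node. */
static inline struct sketch_of_node *
sketch_to_of_node(struct sketch_fw_handle *fw)
{
	if (!fw || fw->type != SKETCH_FW_OF)
		return NULL;
	return sketch_container_of(fw, struct sketch_of_node, fwnode);
}

/* Same check for the ACPI flavour. */
static inline struct sketch_acpi_dev *
sketch_to_acpi_dev(struct sketch_fw_handle *fw)
{
	if (!fw || fw->type != SKETCH_FW_ACPI)
		return NULL;
	return sketch_container_of(fw, struct sketch_acpi_dev, fwnode);
}
```

Code that walks child nodes can then carry only the handle pointer and convert it at the point where firmware-specific data is needed.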
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
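The container_of()-after-type-check idea quoted in the message above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct fw_handle`, `struct dt_node`, `struct acpi_dev`, the `FW_*` tags and `demo_container_of()` are hypothetical stand-ins, not the kernel's struct fwnode_handle, struct device_node, struct acpi_device or container_of().

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() (assumption). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical type tags; the names are illustrative only. */
enum fw_type { FW_INVALID = 0, FW_OF, FW_ACPI };

/* Small handle embedded in every firmware node representation. */
struct fw_handle {
	enum fw_type type;
};

/* Simplified stand-ins for a DT node and an ACPI device. */
struct dt_node {
	const char *name;
	struct fw_handle fw;
};

struct acpi_dev {
	int adr;
	struct fw_handle fw;
};

/* Return the enclosing DT node, or NULL if the handle is not a DT one. */
struct dt_node *to_dt_node(struct fw_handle *fw)
{
	if (!fw || fw->type != FW_OF)
		return NULL;
	return demo_container_of(fw, struct dt_node, fw);
}

/* Return the enclosing ACPI device, or NULL if the handle is not ACPI. */
struct acpi_dev *to_acpi_dev(struct fw_handle *fw)
{
	if (!fw || fw->type != FW_ACPI)
		return NULL;
	return demo_container_of(fw, struct acpi_dev, fw);
}
```

Checking the tag before the cast is what makes the pointer arithmetic safe: a caller holding only the handle cannot end up treating an ACPI object as a DT node.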
Re: [PATCH v4 00/13] Add ACPI _DSD and unified device properties support
On Wed, 15 Oct 2014 17:43:01 +0200, Darren Hart wrote:
>
>
> On 10/15/14 17:17, Mark Rutland wrote:
> > On Wed, Oct 15, 2014 at 03:46:39PM +0100, Darren Hart wrote:
>
> >> Mark, what would you propose we do differently to enable this driver to
> >> be firmware-type agnostic?
> >
> > For this particular driver, all I'm asking for is that the
> > "used-by-rtas" property is not moved over from of_find_property to
> > device_get_property. It is irrelevant for all ACPI systems. Evidently my
> > comment was unclear; I apologise for that.
>
> So my objection here is that by keeping the of_* terms in the driver we
> are required to include of, although it does safely convert to returning
> NULL if !CONFIG_OF I suppose.

This shouldn't be that controversial. There will be things that only
make sense for DT or only ACPI. Allowing the property to be processed
when the other interface is being used may tempt firmware authors to use
the property because it just happens to have a side effect that looks
right to them.

I don't see any problem with factoring out those bits into a function
that is only called (or built) when the associated firmware interface is
used. In these situations, the driver isn't 100% generic, so having
small per-firmware hooks is absolutely okay and not a burden to
maintain.

g.
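A minimal userspace sketch of such a per-firmware hook follows. Everything in it is an assumption made for illustration: `DEMO_HAVE_DT` stands in for CONFIG_OF, and `struct demo_props`, `struct demo_port`, `demo_dt_quirks()` and `demo_probe()` are hypothetical names, not code from the driver under discussion.

```c
#include <string.h>

/* Hypothetical build switch standing in for CONFIG_OF. */
#ifndef DEMO_HAVE_DT
#define DEMO_HAVE_DT 1
#endif

/* Toy property list: just the names present on the firmware node. */
struct demo_props {
	const char **names;
	int count;
};

struct demo_port {
	int skip;	/* set when firmware claims the port for itself */
};

#if DEMO_HAVE_DT
/* DT-only quirk, built only when DT support is compiled in. */
int demo_dt_quirks(const struct demo_props *props, struct demo_port *port)
{
	for (int i = 0; i < props->count; i++)
		if (strcmp(props->names[i], "used-by-rtas") == 0)
			port->skip = 1;
	return 0;
}
#else
/* Without DT support the hook compiles down to nothing. */
static inline int demo_dt_quirks(const struct demo_props *props,
				 struct demo_port *port)
{
	(void)props;
	(void)port;
	return 0;
}
#endif

/* Generic probe path: the DT-only hook runs only for DT-described devices. */
int demo_probe(const struct demo_props *props, int is_dt,
	       struct demo_port *port)
{
	port->skip = 0;
	/* firmware-agnostic property handling would go here */
	if (is_dt)
		demo_dt_quirks(props, port);
	return port->skip ? -1 : 0;
}
```

Because the DT-only property is never looked at on the non-DT path, a firmware author on the other interface cannot accidentally depend on it.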
Re: [PATCH v4 00/13] Add ACPI _DSD and unified device properties support
On Thu, 16 Oct 2014 16:55:56 +0200, David Woodhouse wrote:
> On Wed, 2014-10-15 at 17:43 +0200, Darren Hart wrote:
> >
> > So my objection here is that by keeping the of_* terms in the driver we
> > are required to include of, although it does safely convert to returning
> > NULL if !CONFIG_OF I suppose.
>
> New version removes everything but the of_match_id bits which we need to
> match ACPI devices too. Perhaps they ought to be renamed, but I'm not
> sure it's worth it.
>
> This also removes the call to platform_get_resource(IORESOURCE_MEM) and
> fall back to platform_get_resource(IORESOURCE_IO) as discussed IRL with
> Rafael. I'm not sure it's much of an improvement, mind you :)
>
> Still untested. I think it's OK to switch to platform_get_irq() and then
> drop the irq_dispose_mapping() call, right? The platform_device takes
> care of all of that for us?

Well, the irq management code is all messed up, but what you are doing
is indeed okay. Unfortunately for platform devices we can never free an
IRQ once we've claimed it. That's a completely separate problem and you
don't need to worry about it for this patch.

g.
[PATCH] um: Remove unused bp stack-frame pointer
The pointer to bp stack-frame is no longer used. Removed it.
This also removes a corresponding compiler-warning.

Signed-off-by: Manfred Schlaegl
---
 arch/um/kernel/sysrq.c |    6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/um/kernel/sysrq.c b/arch/um/kernel/sysrq.c
index 894c8d3..aa1b56f 100644
--- a/arch/um/kernel/sysrq.c
+++ b/arch/um/kernel/sysrq.c
@@ -29,7 +29,7 @@ static const struct stacktrace_ops stackops = {
 
 void show_stack(struct task_struct *task, unsigned long *stack)
 {
-	unsigned long *sp = stack, bp = 0;
+	unsigned long *sp = stack;
 	struct pt_regs *segv_regs = current->thread.segv_regs;
 	int i;
 
@@ -39,10 +39,6 @@ void show_stack(struct task_struct *task, unsigned long *stack)
 		return;
 	}
 
-#ifdef CONFIG_FRAME_POINTER
-	bp = get_frame_pointer(task, segv_regs);
-#endif
-
 	if (!stack)
 		sp = get_stack_pointer(task, segv_regs);
 
-- 
1.7.10.4
new GPG key
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

My backpack was stolen in Dusseldorf airport. I have started changing
passwords, and will also revoke my current GPG key soon.

If you have signed my previous key, or if you have an account on
kernel.org, please contact me so that I can have my new key signed soon.

Advice to people that use GPG routinely... If you are not doing it yet,
do the following, in increasing order of importance:

0) do not forget that you need a way to create a revocation certificate
(of course I had no problem with this). Paper, isolated machine (my
choice), USB key, whatever, but do it.

1) never put any 2-factor authentication tokens (which includes phones!)
in your backpack. Luckily I had my token and passport on myself.
Everything would have been **extremely** more complicated if I hadn't.
It also makes two factor authentication much more effective, since a
laptop after all is one of the easiest things to steal.

2) in addition to the usual encryption subkey, create one for signing
and use that instead of the master key;

3) put the master key on a USB key, and replace it with a stub. These
two steps are very easy to do and enough to avoid having to rebuild the
whole trust chain. Unfortunately, it was on my todo list for, ehm, next
week.

4) No, putting the master key and revocation certificate on the same USB
key is not a good idea.

5) Get a smartcard or a Yubikey NEO and put the subkeys on it; replace
subkeys with stubs on your usual working machines, especially laptops.
It gives you two factor authentication for free, and can also be used
for SSH if you add a third subkey.

This tutorial covers most of the above steps:
http://blog.josefsson.org/2014/06/23/offline-gnupg-master-key-and-subkeys-on-yubikey-neo-smartcard/

Thanks for your understanding,

Paolo
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQEcBAEBAgAGBQJUQnbQAAoJEBRUblpOawnXi60H/1gd7YEc9OHDwvPVPIAe7bUk
TfpITHNVD+wTVrelSW5i+w6Hv2i/EeKMhn26Z5RP5oWKHPQNJ5QCLS1i2JDraC7R
2KkOoBBKypHLYg1p2O7NxZB4Jh7ltYPHOQ3yqUDgEeofubF7Sj+kdo8c+eEFOJdl
ScALtdy99WoH7oWrXJIm7UmNQSvkKfF99Ur5PMuGGEP57RbgJGFYWihbgeyYRS9g
fFTCWC8Rka/BDsoFQJaFNQVhvWQLT14JJ6pRMNuCT4744wzX9ygRWZk34iwx4tNo
9Ys0QEvOR6ue7i/OvwDvUa5jL7uDJw/X0lg8qJiV/ZiSBuY3aIWBEPb5lnx0/uWJ
ARwEAQECAAYFAlRCdtAACgkQv/vSX3jHroOLrQf6A3SkeXlEt26F3E2AcmBbP9T1
ArIPMQ1uJXQWBai4hj0BpvzuUeIvvT6/jlQpkfspn09iD9TDYyNQz5n37NCVfzh2
yHKzDfXj6Hu5uQ13zbw8EvZj4cPQUHtKCT7wH+BPCmwd2Jd68MrscGyOz5emIGtZ
VHxM7c2FMR8C2LtOGJq/WbunqBSZLBECTXE8dyusW0ZDnnT72ZmSs7DDLqEk5Fy6
KJbLpfHLw9QTwDE9Ed/KauHZ+Sgdz50Lbv1MwWCT2Ep+2HS8nQJ71oXgAiB6vLSI
njB1bQfLGYS/k/sE/rlC1f+PEAquIbGXI6nSsiCFQdZnH6flkY/b8SWZe1uawg==
=SOwE
-----END PGP SIGNATURE-----
[PATCH v2] staging: comedi: drivers: comedi_bond.c: Changed from using strncat to strlcat
Changed from using strncat to strlcat to simplify the code

Signed-off-by: Rickard Strandqvist
Reviewed-by: Ian Abbott
Reviewed-by: H Hartley Sweeten
---
 drivers/staging/comedi/drivers/comedi_bond.c |    9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/comedi/drivers/comedi_bond.c b/drivers/staging/comedi/drivers/comedi_bond.c
index 8450c99..85b2f4a 100644
--- a/drivers/staging/comedi/drivers/comedi_bond.c
+++ b/drivers/staging/comedi/drivers/comedi_bond.c
@@ -61,8 +61,7 @@ struct bonded_device {
 };
 
 struct comedi_bond_private {
-# define MAX_BOARD_NAME 256
-	char name[MAX_BOARD_NAME];
+	char name[256];
 	struct bonded_device **devs;
 	unsigned ndevs;
 	unsigned nchans;
@@ -262,12 +261,10 @@ static int do_dev_config(struct comedi_device *dev, struct comedi_devconfig *it)
 		{
 			/* Append dev:subdev to devpriv->name */
 			char buf[20];
-			int left =
-			    MAX_BOARD_NAME - strlen(devpriv->name) - 1;
 			snprintf(buf, sizeof(buf), "%u:%u ",
 				 bdev->minor, bdev->subdev);
-			buf[sizeof(buf) - 1] = 0;
-			strncat(devpriv->name, buf, left);
+			strlcat(devpriv->name, buf,
+				sizeof(devpriv->name));
 		}
 	}
 
-- 
1.7.10.4
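For readers comparing the two calls in the patch above: strncat() takes the number of bytes still free in the destination, while strlcat() takes the total size of the destination buffer and always terminates the result. The sketch below is a portable stand-in with strlcat()-style semantics, written for illustration only; `demo_strlcat()` is a hypothetical name and is not the kernel's implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Append src to dst, where size is the full size of the dst buffer.
 * The result is NUL-terminated whenever dst already held a terminated
 * string within size bytes. The return value is the length the
 * combined string would have had without truncation, so a result
 * >= size tells the caller that truncation happened.
 */
size_t demo_strlcat(char *dst, const char *src, size_t size)
{
	size_t dlen = 0;
	size_t slen = strlen(src);

	/* Length of the existing string, bounded by the buffer size. */
	while (dlen < size && dst[dlen] != '\0')
		dlen++;

	if (dlen == size)	/* no terminator found: leave dst alone */
		return size + slen;

	size_t room = size - dlen - 1;		/* bytes free before the NUL */
	size_t n = slen < room ? slen : room;	/* bytes we can copy */

	memcpy(dst + dlen, src, n);
	dst[dlen + n] = '\0';
	return dlen + slen;
}
```

With this calling convention the caller passes `sizeof(buffer)` directly and no longer has to compute the remaining space by hand, which is the simplification the patch is after.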
Re: [PATCH] um: Remove unused bp stack-frame pointer
On 18.10.2014 at 16:23, Manfred Schlaegl wrote:
> The pointer to bp stack-frame is no longer used. Removed it.

Good catch!

> This also removes a corresponding compiler-warning.

Which warning exactly?

Thanks,
//richard
[GIT PULL] Core block changes for 3.18
Hi Linus,

This is the core block IO pull request for 3.18. Apart from the new and
improved flush machinery for blk-mq, this is all mostly bug fixes and
cleanups.

- blk-mq timeout updates and fixes from Christoph.

- Removal of REQ_END, also from Christoph. We pass it through the
  ->queue_rq() hook for blk-mq instead, freeing up one of the request
  bits. The space was overly tight on 32-bit, so Martin also killed
  REQ_KERNEL since it's no longer used.

- blk integrity updates and fixes from Martin and Gu Zheng.

- Update to the flush machinery for blk-mq from Ming Lei. Now we have a
  per hardware context flush request, which both cleans up the code and
  should scale better for flush intensive workloads on blk-mq.

- Improve the error printing, from Rob Elliott.

- Backing device improvements and cleanups from Tejun.

- Fixup of a misplaced rq_complete() tracepoint from Hannes.

- Make blk_get_request() return error pointers, fixing up issues where
  we NULL deref when a device goes bad or missing. From Joe Lawrence.

- Prep work for drastically reducing the memory consumption of dm
  devices from Junichi Nomura. This allows creating clone bio sets
  without preallocating a lot of memory.

- Fix a blk-mq hang on certain combinations of queue depths and hardware
  queues from me.

- Limit memory consumption for blk-mq devices for crash dump scenarios
  and drivers that use crazy high depths (certain SCSI shared tag
  setups). We now just use a single queue and limited depth for that.

Please pull!

  git://git.kernel.dk/linux-block.git for-3.18/core

Bart Van Assche (1):
      blk-mq: Make bt_clear_tag() easier to read

Christoph Hellwig (8):
      blk-mq: remove REQ_END
      blk-mq: call blk_mq_start_request from ->queue_rq
      blk-mq: rename blk_mq_end_io to blk_mq_end_request
      blk-mq: fix and simplify tag iteration for the timeout handler
      blk-mq: unshared timeout handler
      blk-mq: pass a reserved argument to the timeout handler
      block: fix blk_abort_request on blk-mq
      scsi: move blk_mq_start_request call earlier

Gu Zheng (1):
      bio-integrity: remove the needless fail handle of bip_slab creating

Hannes Reinecke (1):
      block: misplaced rq_complete tracepoint

Jens Axboe (6):
      bsg: fix potential error pointer dereference
      Merge branch 'for-linus' into for-3.18/core
      Merge branch 'for-linus' into for-3.18/core
      blk-mq: limit memory consumption if a crash dump is active
      blk-mq: fix potential hang if rolling wakeup depth is too high
      blk-mq: allocate cpumask on the home node

Joe Lawrence (2):
      block,scsi: verify return pointer from blk_get_request
      block,scsi: fixup blk_get_request dead queue scenarios

Junichi Nomura (2):
      block: use bio_clone_fast() in blk_rq_prep_clone()
      block: add bioset_create_nobvec()

Martin K. Petersen (15):
      block: Get rid of bdev_integrity_enabled()
      block: Replace bi_integrity with bi_special
      block: Remove integrity tagging functions
      block: Remove bip_buf
      block: Deprecate the use of the term sector in the context of block integrity
      block: Make protection interval calculation generic
      block: Clean up the code used to generate and verify integrity metadata
      block: Add prefix to block integrity profile flags
      block: Add a disk flag to block integrity profile
      block: Relocate bio integrity flags
      block: Integrity checksum flag
      block: Don't merge requests if integrity flags differ
      block: Add T10 Protection Information functions
      sd: Honor block layer integrity handling flags
      block: Remove REQ_KERNEL

Mike Snitzer (1):
      block: fix alignment_offset math that assumes io_min is a power-of-2

Ming Lei (13):
      blk-mq: remove unnecessary blk_clear_rq_complete()
      blk-timeout: fix blk_add_timer
      blk-mq: handle failure path for initializing hctx
      blk-mq: allocate flush_rq in blk_mq_init_flush()
      block: introduce blk_init_flush and its pair
      block: move flush initialization to blk_flush_init
      block: avoid to use q->flush_rq directly
      block: introduce blk_flush_queue to drive flush machinery
      block: remove blk_init_flush() and its pair
      block: flush: avoid to figure out flush queue unnecessarily
      block: introduce 'blk_mq_ctx' parameter to blk_get_flush_queue
      blk-mq: support per-distpatch_queue flush machinery
      blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio

Rasmus Villemoes (1):
      block: Replace strnicmp with strncasecmp

Robert Elliott (2):
      block: make blk_update_request print prefix match ratelimited prefix
      block: include func name in __get_request prints

Tejun Heo (7):
      blkcg: remove blkcg->id
      block, bdi: an active gendisk always has a request_queue associated with it
      bdi: remove unused stuff
      bdi: remove bdi->wb_lock locking ar
Re: [PATCH 2/4] mm: introduce new VM_NOZEROPAGE flag
On Fri, 17 Oct 2014 15:04:21 -0700 Dave Hansen wrote:

> Is there ever a time where the VMAs under an mm have mixed VM_NOZEROPAGE
> status? Reading the patches, it _looks_ like it might be an all or
> nothing thing.

Currently it is an all or nothing thing, but for a future change we might want to just
tag the guest memory instead of the complete user address space.

> Full disclosure: I've got an x86-specific feature I want to steal a flag
> for. Maybe we should just define another VM_ARCH bit.
>

So you think of something like:

#if defined(CONFIG_S390)
# define VM_NOZEROPAGE VM_ARCH_1
#endif
#ifndef VM_NOZEROPAGE
# define VM_NOZEROPAGE VM_NONE
#endif

right?
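The point of falling back to VM_NONE is that every test of the flag becomes constant false on architectures that do not define it, so callers need no #ifdef of their own. A small userspace sketch of that effect, with assumptions spelled out: the numeric flag values are illustrative only, `DEMO_S390` is a hypothetical stand-in for CONFIG_S390, and `vma_forbids_zero_page()` is an invented helper name.

```c
/* Illustrative values only; the real definitions live in the kernel headers. */
#define VM_NONE   0x00000000UL
#define VM_ARCH_1 0x01000000UL

/* Hypothetical switch standing in for CONFIG_S390. */
#if defined(DEMO_S390)
# define VM_NOZEROPAGE VM_ARCH_1
#endif
#ifndef VM_NOZEROPAGE
# define VM_NOZEROPAGE VM_NONE
#endif

/*
 * Where VM_NOZEROPAGE is VM_NONE the mask is zero, so this is always
 * false and the compiler can drop the check entirely.
 */
int vma_forbids_zero_page(unsigned long vm_flags)
{
	return (vm_flags & VM_NOZEROPAGE) != 0;
}
```

Building the same file with -DDEMO_S390 turns the helper into a real bit test without touching any caller.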
[GIT PULL] Block driver pull for 3.18
Hi Linus,

This is the block driver pull request for 3.18. Not a lot in there this
round, and nothing earth shattering.

- A round of drbd fixes from the linbit team, and an improvement in
  asender performance.

- Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and hd
  from Michael Opdenacker.

- Disable entropy collection from flash devices by default, from Mike
  Snitzer.

- A small collection of xen blkfront/back fixes from Roger Pau Monné and
  Vitaly Kuznetsov.

Please pull!

  git://git.kernel.dk/linux-block.git for-3.18/drivers

Andreas Gruenbacher (5):
      drbd: Use better variable names
      drbd: Use consistent names for all the bi_end_io callbacks
      drbd: Avoid inconsistent locking warning
      drbd: Get rid of the __no_warn and __cond_lock macros
      drbd: Get rid of the WORK_PENDING macro

Arianna Avanzini (1):
      xen, blkfront: factor out flush-related checks from do_blkif_request()

Jens Axboe (1):
      Merge branch 'stable/for-jens-3.18' of git://git.kernel.org/.../konrad/xen into for-3.18/drivers

Lai Jiangshan (2):
      drbd: compute the end before rb_insert_augmented()
      drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks

Lars Ellenberg (2):
      drbd: Improve asender performance
      drbd: reduce lock contention in drbd_worker

Michael Opdenacker (2):
      block: hd: remove deprecated IRQF_DISABLED
      rsxx: Remove deprecated IRQF_DISABLED

Mike Snitzer (1):
      block: disable entropy contributions for nonrot devices

Philipp Marek (1):
      drbd: Remove superfluous newline from "resync_extents" debugfs entry.

Philipp Reisner (1):
      drbd: Add missing newline in resync progress display in /proc/drbd

Roger Pau Monné (1):
      xen-blkback: fix leak on grant map error path

Roger Pau Monné (1):

Vitaly Kuznetsov (1):
      xen/blkback: unmap all persistent grants when frontend gets disconnected

 drivers/block/drbd/drbd_actlog.c    | 4 +--
 drivers/block/drbd/drbd_bitmap.c    | 6 ++---
 drivers/block/drbd/drbd_debugfs.c   | 2 +-
 drivers/block/drbd/drbd_int.h       | 19 +++---
 drivers/block/drbd/drbd_interval.c  | 40 +---
 drivers/block/drbd/drbd_main.c      | 28 ++--
 drivers/block/drbd/drbd_proc.c      | 4 ++-
 drivers/block/drbd/drbd_receiver.c  | 52 +
 drivers/block/drbd/drbd_req.c       | 2 +-
 drivers/block/drbd/drbd_state.c     | 18 ++---
 drivers/block/drbd/drbd_worker.c    | 51 +++
 drivers/block/hd.c                  | 12 +
 drivers/block/mtip32xx/mtip32xx.c   | 1 +
 drivers/block/nbd.c                 | 1 +
 drivers/block/null_blk.c            | 1 +
 drivers/block/nvme-core.c           | 1 +
 drivers/block/rsxx/core.c           | 2 +-
 drivers/block/rsxx/dev.c            | 1 +
 drivers/block/skd_main.c            | 1 +
 drivers/block/xen-blkback/blkback.c | 1 +
 drivers/block/xen-blkback/xenbus.c  | 6 ++---
 drivers/block/xen-blkfront.c        | 12 ++---
 drivers/block/zram/zram_drv.c       | 1 +
 drivers/ide/ide-disk.c              | 4 ++-
 drivers/md/bcache/super.c           | 1 +
 drivers/mmc/card/queue.c            | 1 +
 drivers/mtd/mtd_blkdevs.c           | 1 +
 drivers/s390/block/scm_blk.c        | 1 +
 drivers/s390/block/xpram.c          | 1 +
 drivers/scsi/sd.c                   | 4 ++-
 30 files changed, 132 insertions(+), 147 deletions(-)

-- 
Jens Axboe
Re: [PATCH v5 09/12] Driver core: Unified interface for firmware node properties
On Fri, 17 Oct 2014 14:14:53 +0200, "Rafael J. Wysocki" wrote:
> From: Rafael J. Wysocki
>
> Add new generic routines are provided for retrieving properties from
> device description objects in the platform firmware in case there are
> no struct device objects for them (either those objects have not been
> created yet or they do not exist at all).
>
> The following functions are provided:
>
> fwnode_property_present()
> fwnode_property_read_u8()
> fwnode_property_read_u16()
> fwnode_property_read_u32()
> fwnode_property_read_u64()
> fwnode_property_read_string()
> fwnode_property_read_u8_array()
> fwnode_property_read_u16_array()
> fwnode_property_read_u32_array()
> fwnode_property_read_u64_array()
> fwnode_property_read_string_array()
>
> in analogy with the corresponding functions for struct device added
> previously. For all of them, the first argument is a pointer to struct
> fwnode_handle (new type) that allows a device description object
> (depending on what platform firmware interface is in use) to be
> obtained.
>
> Add a new macro device_for_each_child_node() for iterating over the
> children of the device description object associated with a given
> device and a new function device_get_child_node_count() returning the
> number of a given device's child nodes.
>
> The interface covers both ACPI and Device Trees.

This is all *so much* better. I'm a lot happier.

I was about to make the comment that the implementation for
device_property_read_*() should merely be wrappers around
fwnode_property_read_*(), but when I actually looked at it, I saw
this:

In patch 2:

int device_property_read_u8(struct device *dev, const char *propname, u8 *val)
{
	if (IS_ENABLED(CONFIG_OF) && dev->of_node)
		return of_property_read_u8(dev->of_node, propname, val);
	return acpi_dev_prop_read(ACPI_COMPANION(dev), propname,
				  DEV_PROP_U8, val);
}

And in this patch:

int fwnode_property_read_u8(struct fwnode_handle *fwnode, const char *propname, u8 *val)
{
	if (is_of_node(fwnode))
		return of_property_read_u8(of_node(fwnode), propname, val);
	else if (is_acpi_node(fwnode))
		return acpi_dev_prop_read(acpi_node(fwnode), propname,
					  DEV_PROP_U8, val);
	return -ENXIO;
}

Making the device_property functions wrappers around fwnode_property_*
wouldn't actually be great since it would need to decode the fwnode
pointer twice. I do still think the functions above should be macro
generated, just in terms of keeping the line count down, and I would
suggest merging patches #2 and #9.

Something like:

#define define_fwnode_accessors(__type, __devprop_type) \
int device_property_read_##__type(struct device *dev, \
			const char *propname, __type *val) \
{ \
	if (IS_ENABLED(CONFIG_OF) && dev->of_node) \
		return of_property_read_##__type(dev->of_node, propname, val); \
	return acpi_dev_prop_read(ACPI_COMPANION(dev), propname, \
				  __devprop_type, val); \
} \
int fwnode_property_read_##__type(struct fwnode_handle *fwnode, \
			const char *propname, __type *val) \
{ \
	if (IS_ENABLED(CONFIG_OF) && is_of_node(fwnode)) \
		return of_property_read_##__type(of_node(fwnode), propname, val); \
	else if (IS_ENABLED(CONFIG_ACPI) && is_acpi_node(fwnode)) \
		return acpi_dev_prop_read(acpi_node(fwnode), propname, \
					  __devprop_type, val); \
	return -ENXIO; \
}

define_fwnode_accessors(u8, DEV_PROP_U8);
define_fwnode_accessors(u16, DEV_PROP_U16);
define_fwnode_accessors(u32, DEV_PROP_U32);
define_fwnode_accessors(u64, DEV_PROP_U64);

That significantly reduces the code size for these things.

Also, can the non-array versions be implemented as a wrapper around the
array versions? That also will reduce the sheer number of lines of code
a lot. Maybe this:

#define define_fwnode_accessors(__type, __devprop_type) \
int device_property_read_##__type##_array(struct device *dev, \
				const char *propname, __type *val, \
				size_t nval) \
{ \
	if (IS_ENABLED(CONFIG_OF) && dev->of_node) \
		return of_property_read_##__type##_array(dev->of_node, \
						propname, val, nval); \
	return acpi_dev_prop_read_array(ACPI_COMPANION(dev), propname, \
				  __devprop_type, val, nval); \
} \
static inline int device_property_read_##__type(struct device *dev, \
			const char *propname, __type *val) \
{ \
	return device_property_read_##__type##_array(dev, propname, val, 1); \
} \
int fwnode_property_read_##__type##_array(struct fwnode_handle *fwnode, \
			const char *propname, __type *val, \
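The two suggestions in the review above (generate the typed accessors from one macro, and make the scalar reader a wrapper around the array reader with a count of 1) can be tried out in a self-contained userspace sketch. This is an illustration under assumptions, not the proposed kernel code: `struct demo_node`, `struct demo_prop`, `demo_read_raw()` and `DEMO_DEFINE_ACCESSORS()` are hypothetical, and the toy property store replaces the real DT and ACPI back ends.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical property: a name plus an array of fixed-size elements. */
struct demo_prop {
	const char *name;
	const void *data;
	size_t elem_size;
	size_t nelems;
};

struct demo_node {
	const struct demo_prop *props;
	size_t nprops;
};

/* Single untyped lookup shared by every generated accessor. */
int demo_read_raw(const struct demo_node *node, const char *name,
		  size_t elem_size, void *val, size_t nval)
{
	for (size_t i = 0; i < node->nprops; i++) {
		const struct demo_prop *p = &node->props[i];

		if (strcmp(p->name, name) != 0)
			continue;
		if (p->elem_size != elem_size || p->nelems < nval)
			return -1;	/* wrong width or too few elements */
		memcpy(val, p->data, elem_size * nval);
		return 0;
	}
	return -2;			/* property not found */
}

/* Generate an array reader and a scalar reader that wraps it. */
#define DEMO_DEFINE_ACCESSORS(suffix, type)				\
int demo_read_##suffix##_array(const struct demo_node *node,		\
			       const char *propname, type *val,		\
			       size_t nval)				\
{									\
	return demo_read_raw(node, propname, sizeof(type), val, nval);	\
}									\
int demo_read_##suffix(const struct demo_node *node,			\
		       const char *propname, type *val)			\
{									\
	return demo_read_##suffix##_array(node, propname, val, 1);	\
}

DEMO_DEFINE_ACCESSORS(u8, uint8_t)
DEMO_DEFINE_ACCESSORS(u32, uint32_t)
```

Each extra width costs one macro invocation, and the scalar path cannot drift from the array path because it contains no logic of its own.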
Re: [PATCH] scsi: Resolve some missing-field-initializers warnings
On Fri, Oct 17, 2014 at 10:44:36PM +, Rustad, Mark D wrote:
> The warning appears in W=2 builds. I had another way to silence it by using
> diagnostic control macros, but those macros were not accepted. Using a single
> designated initialization also silences it.

Oh well. I think the earlier version was slightly cleaner, but if it
helps people to catch real bugs better I'm happy to apply it.
Re: [PATCH] um: Remove unused bp stack-frame pointer
On 2014-10-18 16:42, Richard Weinberger wrote:
> On 18.10.2014 at 16:23, Manfred Schlaegl wrote:
>> The pointer to bp stack-frame is no longer used. Removed it.
>
> Good catch!

Thank you.

>
>> This also removes a corresponding compiler-warning.
>
> Which warning exactly?

On "normal" (defconfig) builds the warning does not show up because
CONFIG_FRAME_POINTER is set.
I've found the unused bp because CONFIG_FRAME_POINTER was not set in my
configuration.

  CC      arch/um/kernel/sysrq.o
arch/um/kernel/sysrq.c: In function ‘show_stack’:
arch/um/kernel/sysrq.c:32:29: warning: unused variable ‘bp’ [-Wunused-variable]

It's reproducible on linus-tree (0429fbc0b) by

make mrproper
make tinyconfig ARCH=um SUBARCH=i386
make ARCH=um SUBARCH=i386

>
> Thanks,
> //richard
>

best regards,
manfred
Re: [PATCH 2/5] block: add function to issue compare and write
On Fri, Oct 17, 2014 at 07:38:37PM -0400, Martin K. Petersen wrote:
> The problem with this is that, as it stands, a bio has no type. And it
> would suck if we couldn't keep bio rw and request flags in sync.
>
> I wonder if it would make more sense to move the remaining rq types to
> cmd_flags after I'm done with the 64-bit conversion?

I'd prefer adding a cmd_type to the bio as well and avoid the 64-bit
flag conversion. While we'll probably grow more types of I/Os (e.g. copy
offload) I hope we can actually reduce the number of real flags, and
it's easier to read for sure if we can switch on the command type in the
driver.
Re: [PATCH] um: Remove unused bp stack-frame pointer
On 18.10.2014 at 17:12, Manfred Schlaegl wrote:
> On 2014-10-18 16:42, Richard Weinberger wrote:
>> On 18.10.2014 at 16:23, Manfred Schlaegl wrote:
>>> The pointer to bp stack-frame is no longer used. Removed it.
>>
>> Good catch!
> Thank you.
>
>>
>>> This also removes a corresponding compiler-warning.
>>
>> Which warning exactly?
>
> On "normal" (defconfig) builds the warning does not show up because
> CONFIG_FRAME_POINTER is set.
> I've found the unused bp because CONFIG_FRAME_POINTER was not set in my
> configuration.
>
> CC arch/um/kernel/sysrq.o
> arch/um/kernel/sysrq.c: In function ‘show_stack’:
> arch/um/kernel/sysrq.c:32:29: warning: unused variable ‘bp’
> [-Wunused-variable]

Looks like my gcc needs an upgrade. :D

Thanks,
//richard
[PATCH 3/4] ecryptfs: add fadvise/set_flags callbacks
CC: tyhi...@canonical.com
CC: ecryp...@vger.kernel.org
Signed-off-by: Dmitry Monakhov
---
 fs/ecryptfs/file.c |   62
 1 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index 4ffa35e..c84df35 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "ecryptfs_kernel.h"
 
@@ -333,6 +334,63 @@ ecryptfs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 }
 #endif
 
+static int
+ecryptfs_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
+{
+	struct file *lower_file = NULL;
+	long rc = 0;
+
+	if (ecryptfs_file_to_private(file))
+		lower_file = ecryptfs_file_to_lower(file);
+
+	if (!lower_file || !lower_file->f_op)
+		return rc;
+
+	if (lower_file->f_op && lower_file->f_op->fadvise)
+		rc = lower_file->f_op->fadvise(lower_file, offset, len, advice);
+	else
+		rc = generic_fadvise(lower_file, offset, len, advice);
+	if (!rc)
+		generic_fadvise(file, offset, len, advice);
+
+	return rc;
+}
+
+#define ECRYPTFS_FL_MASK (O_NONBLOCK | O_NDELAY | O_NOATIME)
+static int ecryptfs_set_flags(struct file *file, unsigned flags)
+{
+	struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
+	struct dentry *ecryptfs_dentry = file->f_path.dentry;
+	struct file *lower_file = NULL;
+	int rc = 0;
+
+	mount_crypt_stat = &ecryptfs_superblock_to_private(
+		ecryptfs_dentry->d_sb)->mount_crypt_stat;
+	if ((mount_crypt_stat->flags & ECRYPTFS_ENCRYPTED_VIEW_ENABLED)
+	    && (flags & O_APPEND)) {
+		printk(KERN_WARNING "Mount has encrypted view enabled; "
+		       "files may only be read\n");
+		rc = -EPERM;
+		goto out;
+	}
+
+	if (ecryptfs_file_to_private(file))
+		lower_file = ecryptfs_file_to_lower(file);
+	if (!lower_file)
+		goto out;
+
+	if (lower_file->f_op && lower_file->f_op->set_flags) {
+		rc = lower_file->f_op->set_flags(lower_file,
+						 flags & ECRYPTFS_FL_MASK);
+		if (rc)
+			return rc;
+	} else
+		generic_file_set_flags(file, flags & ECRYPTFS_FL_MASK);
+
+	generic_file_set_flags(file, flags & ECRYPTFS_FL_MASK);
+out:
+	return rc;
+}
 const struct file_operations ecryptfs_dir_fops = {
 	.iterate = ecryptfs_readdir,
 	.read = generic_read_dir,
@@ -347,6 +405,8 @@ const struct file_operations ecryptfs_dir_fops = {
 	.fasync = ecryptfs_fasync,
 	.splice_read = generic_file_splice_read,
 	.llseek = default_llseek,
+	.set_flags = ecryptfs_set_flags,
+	.fadvise = ecryptfs_fadvise,
 };
 
 const struct file_operations ecryptfs_main_fops = {
@@ -367,4 +427,6 @@ const struct file_operations ecryptfs_main_fops = {
 	.fsync = ecryptfs_fsync,
 	.fasync = ecryptfs_fasync,
 	.splice_read = generic_file_splice_read,
+	.set_flags = ecryptfs_set_flags,
+	.fadvise = ecryptfs_fadvise,
 };
-- 
1.7.1
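The shape of ecryptfs_fadvise() in the patch above (use the lower file's own hook when it has one, otherwise the generic routine, then mirror the advice on the upper file) can be modelled in userspace. The sketch below is an assumption-laden illustration: every `demo_*` name is hypothetical, `last_advice` merely records what the generic path was told, and unlike the patch it also returns the result of the upper-file call.

```c
#include <stddef.h>

struct demo_file;

struct demo_fops {
	/* Optional hook; NULL means "use the generic implementation". */
	int (*fadvise)(struct demo_file *file, long off, long len, int advice);
};

struct demo_file {
	const struct demo_fops *f_op;
	struct demo_file *lower;	/* backing file of a stacked fs, or NULL */
	int last_advice;		/* recorded by the generic path */
};

/* Stand-in for the generic behaviour: just remember the advice. */
int demo_generic_fadvise(struct demo_file *file, long off, long len, int advice)
{
	(void)off;
	(void)len;
	file->last_advice = advice;
	return 0;
}

/* Dispatch: the file's own hook if present, else the generic code. */
int demo_vfs_fadvise(struct demo_file *file, long off, long len, int advice)
{
	if (file->f_op && file->f_op->fadvise)
		return file->f_op->fadvise(file, off, len, advice);
	return demo_generic_fadvise(file, off, len, advice);
}

/* Stacked hook: advise the lower file first, mirror it on success. */
int demo_stacked_fadvise(struct demo_file *file, long off, long len, int advice)
{
	int rc;

	if (!file->lower)
		return 0;
	rc = demo_vfs_fadvise(file->lower, off, len, advice);
	if (!rc)
		rc = demo_generic_fadvise(file, off, len, advice);
	return rc;
}
```

Routing the lower call through the same dispatcher means a stack of several layers works without any layer knowing how deep it sits.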
[PATCH 1/4] fs: fcntl add set_flags wrapper -v2
fcntl(F_SETFL) manipulates f_flags directly, which may not be suitable
for some filesystems (mostly stacked filesystems such as ecryptfs, unionfs, etc.).
For example, toggling O_DIRECT may require some extra actions (a page cache flush).
Let's introduce a new ->set_flags() callback for that purpose. This callback
is responsible for checking the flags, so ->check_flags() is no longer needed.

changes from v1:
- add generic_file_set_flags helper

Signed-off-by: Dmitry Monakhov
---
 fs/bad_inode.c     |    4 ++--
 fs/fcntl.c         |   14 --
 fs/nfs/dir.c       |    5 ++---
 fs/nfs/file.c      |   12
 fs/nfs/internal.h  |    2 +-
 fs/nfs/nfs4file.c  |    2 +-
 include/linux/fs.h |   10 +-
 7 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index afd2b44..1977f10 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -121,7 +121,7 @@ static unsigned long bad_file_get_unmapped_area(struct file *file,
 	return -EIO;
 }
 
-static int bad_file_check_flags(int flags)
+static int bad_file_set_flags(struct file *file, int flags)
 {
 	return -EIO;
 }
@@ -166,7 +166,7 @@ static const struct file_operations bad_file_ops =
 	.lock = bad_file_lock,
 	.sendpage = bad_file_sendpage,
 	.get_unmapped_area = bad_file_get_unmapped_area,
-	.check_flags = bad_file_check_flags,
+	.set_flags = bad_file_set_flags,
 	.flock = bad_file_flock,
 	.splice_write = bad_file_splice_write,
 	.splice_read = bad_file_splice_read,
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 22d1c3d..2c1b3d7 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -27,8 +27,6 @@
 #include
 #include
 
-#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME)
-
 static int setfl(int fd, struct file * filp, unsigned long arg)
 {
 	struct inode * inode = file_inode(filp);
@@ -57,11 +55,6 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
 		return -EINVAL;
 	}
 
-	if (filp->f_op->check_flags)
-		error = filp->f_op->check_flags(arg);
-	if (error)
-		return error;
-
 	/*
 	 * ->fasync() is responsible for setting the FASYNC bit.
 	 */
@@ -72,10 +65,11 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
 		if (error > 0)
 			error = 0;
 	}
-	spin_lock(&filp->f_lock);
-	filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK);
-	spin_unlock(&filp->f_lock);
+	if (filp->f_op && filp->f_op->set_flags)
+		error = filp->f_op->set_flags(filp, arg);
+	else
+		generic_file_set_flags(filp, arg);
 
 out:
 	return error;
 }
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 06e8cfc..8b5d9dc 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1480,9 +1480,8 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
 	dfprintk(VFS, "NFS: atomic_open(%s/%lu), %pd\n",
 			dir->i_sb->s_id, dir->i_ino, dentry);
 
-	err = nfs_check_flags(open_flags);
-	if (err)
-		return err;
+	if ((open_flags & (O_APPEND | O_DIRECT)) == (O_APPEND | O_DIRECT))
+		return -EINVAL;
 
 	/* NFS only supports OPEN on regular files */
 	if ((open_flags & O_DIRECTORY)) {
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 524dd80..b68d272 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -48,14 +48,18 @@ static const struct vm_operations_struct nfs_file_vm_ops;
 # define IS_SWAPFILE(inode)	(0)
 #endif
 
-int nfs_check_flags(int flags)
+#define NFS_FL_MASK (O_NONBLOCK | O_NDELAY | O_NOATIME)
+int nfs_set_flags(struct file *filp, int flags)
 {
 	if ((flags & (O_APPEND | O_DIRECT)) == (O_APPEND | O_DIRECT))
 		return -EINVAL;
 
+	spin_lock(&filp->f_lock);
+	filp->f_flags = (flags & NFS_FL_MASK) | (filp->f_flags & ~NFS_FL_MASK);
+	spin_unlock(&filp->f_lock);
 	return 0;
 }
-EXPORT_SYMBOL_GPL(nfs_check_flags);
+EXPORT_SYMBOL_GPL(nfs_set_flags);
 
 /*
  * Open file
@@ -68,7 +72,7 @@ nfs_file_open(struct inode *inode, struct file *filp)
 	dprintk("NFS: open file(%pD2)\n", filp);
 
 	nfs_inc_stats(inode, NFSIOS_VFSOPEN);
-	res = nfs_check_flags(filp->f_flags);
+	res = nfs_set_flags(filp, filp->f_flags);
 	if (res)
 		return res;
 
@@ -917,7 +921,7 @@ const struct file_operations nfs_file_operations = {
 	.flock = nfs_flock,
 	.splice_read = nfs_file_splice_read,
 	.splice_write = iter_file_splice_write,
-	.check_flags = nfs_check_flags,
+	.set_flags = nfs_set_flags,
 	.setlease = nfs_setlease,
 };
 EXPORT_SYMBOL_GPL(nfs_file_operations);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 9056622..00cf588 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -345,7 +345,7 @@ ssize_t nfs_file_w
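The dispatch introduced by the patch above can be pictured with a small userspace sketch. This is not the kernel code: the struct layouts, the flag values and every `demo_`/`DEMO_` name are assumptions made for illustration. Only the idea comes from the patch: call ->set_flags() when the filesystem provides it, and otherwise fall back to a generic helper that replaces just the bits F_SETFL is allowed to change.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bits; the real O_* values are architecture specific. */
#define DEMO_O_APPEND   0x01u
#define DEMO_O_NONBLOCK 0x02u
#define DEMO_O_DIRECT   0x04u
#define DEMO_SETFL_MASK (DEMO_O_APPEND | DEMO_O_NONBLOCK | DEMO_O_DIRECT)

struct demo_file;

struct demo_file_operations {
	/* Optional callback: validates and applies the new flags. */
	int (*set_flags)(struct demo_file *file, unsigned int flags);
};

struct demo_file {
	unsigned int f_flags;
	const struct demo_file_operations *f_op;
};

/* Generic fallback: replace only the bits that F_SETFL may change. */
static void demo_generic_set_flags(struct demo_file *file, unsigned int flags)
{
	file->f_flags = (flags & DEMO_SETFL_MASK) |
			(file->f_flags & ~DEMO_SETFL_MASK);
}

/* A filesystem that refuses O_APPEND combined with O_DIRECT. */
static int demo_strict_set_flags(struct demo_file *file, unsigned int flags)
{
	const unsigned int both = DEMO_O_APPEND | DEMO_O_DIRECT;

	if ((flags & both) == both)
		return -EINVAL;
	demo_generic_set_flags(file, flags);
	return 0;
}

/* Dispatch: prefer the filesystem callback, else use the generic helper. */
static int demo_setfl(struct demo_file *file, unsigned int flags)
{
	assert(file != NULL);
	if (file->f_op && file->f_op->set_flags)
		return file->f_op->set_flags(file, flags);
	demo_generic_set_flags(file, flags);
	return 0;
}
```

The sketch leaves out locking; the patch takes f_lock around the f_flags update, and a real callback would have to do the same.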
[PATCH 2/4] fs: add fadvise file_operation
sys_fadvise results in direct f_mode modification, which may not be suitable
for some unusual filesystems where the file mode invariant is more complex.
In order to support such filesystems we have to delegate the fadvise logic
to the filesystem layer.

Signed-off-by: Dmitry Monakhov
---
 include/linux/fs.h |    4 ++
 mm/fadvise.c       |   81 ---
 2 files changed, 55 insertions(+), 30 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4ce1414..0fe06f5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1518,6 +1518,7 @@ struct file_operations {
 	long (*fallocate)(struct file *file, int mode, loff_t offset,
 			  loff_t len);
 	int (*show_fdinfo)(struct seq_file *m, struct file *f);
+	int (*fadvise)(struct file *file, loff_t off, loff_t len, int advice);
 };
 
 struct inode_operations {
@@ -2081,6 +2082,9 @@ extern int finish_open(struct file *file, struct dentry *dentry,
 			int *opened);
 extern int finish_no_open(struct file *file, struct dentry *dentry);
 
+/* fs/fadvise.c */
+extern int generic_fadvise(struct file *file, loff_t off, loff_t len, int adv);
+
 /* fs/ioctl.c */
 
 extern int ioctl_preallocate(struct file *filp, void __user *argp);
diff --git a/mm/fadvise.c b/mm/fadvise.c
index 3bcfd81..a568ba6 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -7,6 +7,7 @@
  * Initial version.
  */
 
+#include
 #include
 #include
 #include
@@ -25,10 +26,9 @@
  * POSIX_FADV_WILLNEED could set PG_Referenced, and POSIX_FADV_NOREUSE could
  * deactivate the pages and clear PG_Referenced.
  */
-SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
+int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 {
-	struct fd f = fdget(fd);
-	struct address_space *mapping;
+	struct address_space *mapping = file->f_mapping;
 	struct backing_dev_info *bdi;
 	loff_t endbyte;			/* inclusive */
 	pgoff_t start_index;
@@ -36,20 +36,6 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
 	unsigned long nrpages;
 	int ret = 0;
 
-	if (!f.file)
-		return -EBADF;
-
-	if (S_ISFIFO(file_inode(f.file)->i_mode)) {
-		ret = -ESPIPE;
-		goto out;
-	}
-
-	mapping = f.file->f_mapping;
-	if (!mapping || len < 0) {
-		ret = -EINVAL;
-		goto out;
-	}
-
 	if (mapping->a_ops->get_xip_mem) {
 		switch (advice) {
 		case POSIX_FADV_NORMAL:
@@ -77,21 +63,21 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
 
 	switch (advice) {
 	case POSIX_FADV_NORMAL:
-		f.file->f_ra.ra_pages = bdi->ra_pages;
-		spin_lock(&f.file->f_lock);
-		f.file->f_mode &= ~FMODE_RANDOM;
-		spin_unlock(&f.file->f_lock);
+		file->f_ra.ra_pages = bdi->ra_pages;
+		spin_lock(&file->f_lock);
+		file->f_mode &= ~FMODE_RANDOM;
+		spin_unlock(&file->f_lock);
 		break;
 	case POSIX_FADV_RANDOM:
-		spin_lock(&f.file->f_lock);
-		f.file->f_mode |= FMODE_RANDOM;
-		spin_unlock(&f.file->f_lock);
+		spin_lock(&file->f_lock);
+		file->f_mode |= FMODE_RANDOM;
+		spin_unlock(&file->f_lock);
 		break;
 	case POSIX_FADV_SEQUENTIAL:
-		f.file->f_ra.ra_pages = bdi->ra_pages * 2;
-		spin_lock(&f.file->f_lock);
-		f.file->f_mode &= ~FMODE_RANDOM;
-		spin_unlock(&f.file->f_lock);
+		file->f_ra.ra_pages = bdi->ra_pages * 2;
+		spin_lock(&file->f_lock);
+		file->f_mode &= ~FMODE_RANDOM;
+		spin_unlock(&file->f_lock);
 		break;
 	case POSIX_FADV_WILLNEED:
 		/* First and last PARTIAL page! */
@@ -107,7 +93,7 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
 		 * Ignore return value because fadvise() shall return
 		 * success even if filesystem can't retrieve a hint,
 		 */
-		force_page_cache_readahead(mapping, f.file, start_index,
+		force_page_cache_readahead(mapping, file, start_index,
 					   nrpages);
 		break;
 	case POSIX_FADV_NOREUSE:
@@ -142,15 +128,50 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
 		ret = -EINVAL;
 	}
 out:
+	return ret;
+}
+EXPORT_SYMBOL(generic_fadvise);
+
+static int do_fadvise(int fd, loff_t offset, loff_t len, int advice)
+{
+	struct fd f = fdget(fd);
+	int (*fadvise)(struct file *, loff_t, loff_t, int) = generic_fadvise;
+	int
[PATCH 4/4] cifs: add set_flag callback
Add a set_flags callback, which is called from fcntl(F_SETFL).
Share the common logic between cifs_open and cifs_set_flags.

I'm not a cifs expert, but it looks like toggling O_DIRECT on a file is
an unsafe operation, so disable it temporarily.

Signed-off-by: Dmitry Monakhov
---
 fs/cifs/cifsfs.c |    6 ++
 fs/cifs/cifsfs.h |    1 +
 fs/cifs/file.c   |   37 ++---
 3 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 889b984..18ad412 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -918,6 +918,7 @@ const struct file_operations cifs_file_ops = {
 	.mmap = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
 	.llseek = cifs_llseek,
+	.set_flags = cifs_set_flags,
 #ifdef CONFIG_CIFS_POSIX
 	.unlocked_ioctl = cifs_ioctl,
 #endif /* CONFIG_CIFS_POSIX */
@@ -938,6 +939,7 @@ const struct file_operations cifs_file_strict_ops = {
 	.mmap = cifs_file_strict_mmap,
 	.splice_read = generic_file_splice_read,
 	.llseek = cifs_llseek,
+	.set_flags = cifs_set_flags,
 #ifdef CONFIG_CIFS_POSIX
 	.unlocked_ioctl = cifs_ioctl,
 #endif /* CONFIG_CIFS_POSIX */
@@ -958,6 +960,7 @@ const struct file_operations cifs_file_direct_ops = {
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
+	.set_flags = cifs_set_flags,
 #ifdef CONFIG_CIFS_POSIX
 	.unlocked_ioctl = cifs_ioctl,
 #endif /* CONFIG_CIFS_POSIX */
@@ -978,6 +981,7 @@ const struct file_operations cifs_file_nobrl_ops = {
 	.mmap = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
 	.llseek = cifs_llseek,
+	.set_flags = cifs_set_flags,
 #ifdef CONFIG_CIFS_POSIX
 	.unlocked_ioctl = cifs_ioctl,
 #endif /* CONFIG_CIFS_POSIX */
@@ -997,6 +1001,7 @@ const struct file_operations cifs_file_strict_nobrl_ops = {
 	.mmap = cifs_file_strict_mmap,
 	.splice_read = generic_file_splice_read,
 	.llseek = cifs_llseek,
+	.set_flags = cifs_set_flags,
 #ifdef CONFIG_CIFS_POSIX
 	.unlocked_ioctl = cifs_ioctl,
 #endif /* CONFIG_CIFS_POSIX */
@@ -1016,6 +1021,7 @@ const struct file_operations cifs_file_direct_nobrl_ops = {
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
+	.set_flags = cifs_set_flags,
 #ifdef CONFIG_CIFS_POSIX
 	.unlocked_ioctl = cifs_ioctl,
 #endif /* CONFIG_CIFS_POSIX */
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 002e0c1..353190d 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -92,6 +92,7 @@ extern const struct file_operations cifs_file_strict_ops; /* if strictio mnt */
 extern const struct file_operations cifs_file_nobrl_ops; /* no brlocks */
 extern const struct file_operations cifs_file_direct_nobrl_ops;
 extern const struct file_operations cifs_file_strict_nobrl_ops;
+extern int cifs_set_flags(struct file *filp, unsigned arg);
 extern int cifs_open(struct inode *inode, struct file *file);
 extern int cifs_close(struct inode *inode, struct file *file);
 extern int cifs_closedir(struct inode *inode, struct file *file);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index dc3c7e6..5167ecb 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -431,6 +431,33 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
 	kfree(cifs_file);
 }
 
+int cifs_set_flags(struct file *file, unsigned arg)
+{
+	struct cifs_sb_info *cifs_sb = CIFS_SB(file->f_path.dentry->d_sb);
+	int rc = 0;
+
+	spin_lock(&file->f_lock);
+	if ((file->f_flags ^ arg) & O_DIRECT) {
+		/*
+		 * TODO: Toggling O_DIRECT is not obvious task,
+		 * temproraly disabled for safity reason
+		 */
+		rc = -EINVAL;
+		goto out;
+	}
+	file->f_flags = (arg & SETFL_MASK) | (file->f_flags & ~SETFL_MASK);
+	if (file->f_flags & O_DIRECT &&
+	    cifs_sb->mnt_cifs_flags & CIFS_MOUNT_STRICT_IO) {
+		if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_BRL)
+			file->f_op = &cifs_file_direct_nobrl_ops;
+		else
+			file->f_op = &cifs_file_direct_ops;
+	}
+out:
+	spin_unlock(&file->f_lock);
+	return rc;
+}
+
 int cifs_open(struct inode *inode, struct file *file)
 
 {
@@ -467,13 +494,9 @@ int cifs_open(struct inode *inode, struct file *file)
 	cifs_dbg(FYI, "inode = 0x%p file flags are 0x%x for %s\n",
 		 inode, file->f_flags, full_path);
 
-	if (file->f_flags & O_DIRECT &&
-	    cifs_sb->mnt_cifs_flags & CIFS_MOUNT_STRICT_IO) {
-		if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_BRL)
-			file->f_op = &cifs_file_direct_nobrl_ops;
-		else
-			file->f_op = &cifs_file_direct_ops;
-	}
+	rc = cifs_set_flags(file, file->f_flags);
+	if (rc)
+		goto out
[PATCH 0/4] fs: fcntl/fadvice fixes v2
fcntl(F_SETFL) and fadvise manipulate a file's internals directly, without
notifying the fs layer. This behavior may not be suitable for some
filesystems (mostly stacked filesystems such as ecryptfs, unionfs, etc.).
Let's introduce a new ->set_flags() callback for that purpose. This callback
is responsible for checking the flags, so ->check_flags() is no longer required.

TOC:
  fs: fcntl add set_flags wrapper -v2
  fs: add fadvise file_operation
  ecryptfs: add fadvise/set_flags callbacks
  cifs: add set_flag callback

*OPEN ISSUE REMAINS*
This series does not fix all the issues related to set_flags.
A race between fcntl() toggling O_DIRECT and write() is still possible.
Usually O_DIRECT is checked twice along the call chain:
  ->xxx_file_write_iter
  --->__generic_file_write_iter
so we may end up with two different values.
Some filesystems (btrfs/xfs) avoid this issue by copy-pasting
__generic_file_write_iter.

One possible way to fix this issue is to save the flags in kiocb->ki_flags,
as we already do with ->ki_pos, and to fix up all the places accordingly.
I've counted the direct accesses to ->f_flags: there are close to 150, and
half of them are in ->open() methods, so the patch would not be gigantic.

And finally, here is my question to Al Viro, Christoph and the other VFS people:
*Do you agree with that approach?* Please have your say.
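The proposed fix for the O_DIRECT race can be sketched in userspace as follows. This is only an illustration of the "snapshot once, check the snapshot" idea; the struct layouts, the flag value and all `demo_` names are assumptions, and a real implementation would also need a properly ordered read of f_flags under concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_O_DIRECT 0x04u  /* hypothetical bit value */

struct demo_file {
	unsigned int f_flags;   /* may be changed at any time by F_SETFL */
};

/* Per-request state, loosely modelled on a kiocb. */
struct demo_kiocb {
	struct demo_file *ki_filp;
	unsigned int ki_flags;  /* snapshot of f_flags taken once per request */
};

/* Take the snapshot when the request is set up. */
static void demo_init_kiocb(struct demo_kiocb *iocb, struct demo_file *file)
{
	assert(iocb != NULL && file != NULL);
	iocb->ki_filp = file;
	iocb->ki_flags = file->f_flags;
}

/* Every later check reads the snapshot, never file->f_flags. */
static bool demo_is_direct(const struct demo_kiocb *iocb)
{
	return (iocb->ki_flags & DEMO_O_DIRECT) != 0;
}
```

With this shape, both levels of the write call chain see the same answer even if fcntl() toggles O_DIRECT between the two checks.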
Re: [PATCH] um: Remove unused bp stack-frame pointer
On 2014-10-18 17:19, Richard Weinberger wrote:
> On 18.10.2014 at 17:12, Manfred Schlaegl wrote:
>> On 2014-10-18 16:42, Richard Weinberger wrote:
>>> On 18.10.2014 at 16:23, Manfred Schlaegl wrote:
>>>> The pointer to bp stack-frame is no longer used. Removed it.
>>>
>>> Good catch!
>> Thank you.
>>
>>>> This also removes a corresponding compiler-warning.
>>>
>>> Which warning exactly?
>>
>> On "normal" (defconfig) builds the warning does not show up because
>> CONFIG_FRAME_POINTER is set.
>> I've found the unused bp because CONFIG_FRAME_POINTER was not set in my
>> configuration.
>>
>> CC arch/um/kernel/sysrq.o
>> arch/um/kernel/sysrq.c: In function ‘show_stack’:
>> arch/um/kernel/sysrq.c:32:29: warning: unused variable ‘bp’
>> [-Wunused-variable]
>
> Looks like my gcc needs an upgrade. :D
>
> Thanks,
> //richard
>

I'm using gcc version 4.7.2 (Debian 4.7.2-5). -> not THAT new ;-)

best regards
manfred
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
On Fri, Oct 17, 2014 at 11:54 PM, Mike Galbraith wrote:
>
> The barrier fixing up my problematic box smells a lot like evidence.

Is this a "tested-by"? Did you actually verify that the patch ends up
fixing the problem you saw?

Linus
Re: [PATCH 0/5] fuse: handle release synchronously (v4)
On Fri, Oct 17, 2014 at 1:55 AM, Miklos Szeredi wrote: > > The problem with those "count elevated by other things" is that they are > actually bugs. No they aren't. You think they are, and then you find one case, and ignore all the others. Look around for AIO. Look around for the loop driver. Look around for a number of things that do "fget()" and that you completely ignored. So no, your patch doesn't change the fundamental issue in any way, shape, or form. I asked you to stop emailing me with these broken "let's fix one special case, because I can't be bothered to understand the big picture" patches. This was another one of those hacky sh*t patches that doesn't actually change the deeper issue. Stop it. Seriously. This idiotic thread has been going on for too long. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/5] fuse: handle release synchronously (v4)
On Sat, Oct 18, 2014 at 8:35 AM, Linus Torvalds wrote: > > Look around for AIO. Look around for the loop driver. Look around for > a number of things that do "fget()" and that you completely ignored. .. actually, there are more instances of "get_file()" than of "fget()", the aio one just happened to be the latter form. Lots and lots of ways to get ahold of a file descriptor that keeps it open past the "last close". Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
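The point about extra references can be made concrete with a tiny reference-counting sketch. It is a userspace model only; `demo_file`, `demo_get_file()`, `demo_fput()` and `demo_close()` are hypothetical stand-ins chosen for this illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_file {
	int refcount;   /* number of holders, including the fd table */
	bool released;  /* set once the release callback would have run */
};

/* Any subsystem (aio, a loop device, ...) can take an extra reference. */
static void demo_get_file(struct demo_file *f)
{
	assert(f->refcount > 0);
	f->refcount++;
}

/* Release happens when the last reference goes away, whoever holds it. */
static void demo_fput(struct demo_file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->released = true;
}

/* close() only drops the descriptor table's reference. */
static void demo_close(struct demo_file *f)
{
	demo_fput(f);
}
```

If some holder still has a reference when close() runs, the release is deferred to that holder's final put, which is why the "last close" does not necessarily coincide with the release.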
Re: [PATCH] um: Remove unused bp stack-frame pointer
On 18.10.2014 at 17:24, Manfred Schlaegl wrote:
> On 2014-10-18 17:19, Richard Weinberger wrote:
>> On 18.10.2014 at 17:12, Manfred Schlaegl wrote:
>>> On 2014-10-18 16:42, Richard Weinberger wrote:
>>>> On 18.10.2014 at 16:23, Manfred Schlaegl wrote:
>>>>> The pointer to bp stack-frame is no longer used. Removed it.
>>>>
>>>> Good catch!
>>> Thank you.
>>>
>>>>> This also removes a corresponding compiler-warning.
>>>>
>>>> Which warning exactly?
>>>
>>> On "normal" (defconfig) builds the warning does not show up because
>>> CONFIG_FRAME_POINTER is set.
>>> I've found the unused bp because CONFIG_FRAME_POINTER was not set in my
>>> configuration.
>>>
>>> CC arch/um/kernel/sysrq.o
>>> arch/um/kernel/sysrq.c: In function ‘show_stack’:
>>> arch/um/kernel/sysrq.c:32:29: warning: unused variable ‘bp’
>>> [-Wunused-variable]
>>
>> Looks like my gcc needs an upgrade. :D
>>
>> Thanks,
>> //richard
>>
>
> I'm using gcc version 4.7.2 (Debian 4.7.2-5). -> not THAT new ;-)

With a cup of coffee applied I managed to read your mail correctly.
The warning triggers only with CONFIG_FRAME_POINTER=n.
Now it makes sense.

Thanks,
//richard
Re: new GPG key
On 18.10.2014, Paolo Bonzini wrote:
> 5) Get a smartcard or a Yubikey NEO and put the subkeys on it; replace
> subkeys with stubs on your usual working machines, especially laptops. It
> gives you two factor authentication for free, and can also be used for
> SSH if you add a third subkey.

AFAICS, a lot of the lkml people use the mutt MUA, which does not have any
password encryption natively. In this case, the smartcard has another
advantage: you can keep your email password encrypted and use it without
having to enter a long and complicated passphrase. In case your laptop gets
stolen while travelling, the password to your email is protected.

Here's what I did:

1. Generate a password file and assign the password to a variable.

   touch .my-pw
   echo "set my_pw_imap = \"your-long-and-random-password\"" > .my-pw

2. Encrypt this file to your own public key and shred the unencrypted
   text file.

3. Source the password file into .muttrc and set the imap password
   variable by writing something like this into your .muttrc:

   source "gpg2 -dq $HOME/.my-pw.asc |"
   set imap_pass=$my_pw_imap

Now, if you start mutt and it connects to your IMAP server, you'll be
prompted for your smartcard's PIN, and that's it.

In case your laptop gets stolen while you're travelling and you don't have
access to the net (because all the other things in your bag, like your
mobile, also got stolen), it will spare you the situation where the thief
has already logged into your email and changed your password by the time
you finally manage to connect to the net again.

Sorry for being OT, but I have encountered such a situation before and it
got me into serious trouble, so I dared to share this with you.
Discussion on Linus Trovalds claim of userspace cluelessness, spiralling into other analysis and choice of women. Also systemd.
There was a discussion recently about Mr Trovalds claiming he didn't know anything about user-space at debconf '14, a stunning claim. Now since the discussion is likely to be deleted or edited at the source, well here it is: I guess people really can't handle the obvious truth being stated. --- (In response to another post) With Trovalds there is also the recent "I don't know anything about userspace whatsoever" claim (debconf '14) we saw*, when not so long ago he was railing on various userspace projects (KDE etc). Who in the world would believe that the man who started linux... does nothing in userspace and has no clue about what goes on there. It's just a lie. *Which he used to "back up" his tacit endorsement of systemd. Then yesterday there was him apologizing for his language etc, a few days after he said he didn't care what poettering thought, (and awhile after defending his style in debconf '14) Guess he got a call from the companies behind the linux foundation which pays him. Also, maybe this has nothing to do with anything but it bothers me for some reason: He took his life and what he did with it was... He married a rather large karate champ woman who was not pretty (all this when he married her, it wasn't a bait and switch kind of thing that is so common). I do not get that either. Guess he lies to himself aswell (or maybe his eyes do?). Really makes everyone in software look bad. Like: don't admit publicly you have anything to do with linux, use it, program for it, etc. Usually men want pretty girls. Usually thin girls if the man is white/european. Usually don't really want someone who "can beat them up", that's not a plus. --- Response: The former post is typical of your posting style. You started with good arguments to criticise, or at least cast suspicion on, Torvalds with critical observation, then switched to criticising him for his choice of wife, because she is not pretty (in your eyes). 
Then finished with a bunch of crap about what kinds of women men of different ethnicities prefer, and using that nonsense as evidence that there is something wrong with Torvalds. That kind of posting is why no one takes you seriously any longer. --- I program games, record music, and work on models and levels etc. I'm not a serious guy. I don't program interfaces or deal with hardware not following the specs it claims. Trovald's wife* is not pretty. My cuzin's black friend (west african decent, not east african like farrell) likes big asses: he was salivating while while sitting in a vehicle with me at the rolly pollys with extra everything. I don't. *(wife means woman. Can you even call her "his woman", it's not like he controls her, she can divorce him and take all he has at the drop of the hat, oh and it seems to have been made a point that she's a karate champion, so does she also imagine she could beat his ass if she so chose?). I think trovalds has poor tastes in women. Even RMS has come out in support of marrying (or some analogue there of) the young girls; I suspect because he likes pretty things. Not liking pretty things is strange in a man, atleast in a european-decendant man. Trovalds seems to be willing to sacrifice it all for a woman who is (at the time of their marraige): *Fat. *Not good looking in the face. *Somehow also a karate champion. To throw away a chance at happiness for what? Fitting in? Being respectable? Very progressive of him. He doesn't look that terrible. His daughters look worse than him because of his decision of mate. I hope the other sons of his father have chosen better. If he's the only one than that's the end of the line for that family. They're in the gutter forever now. He could have chosen a pretty woman atleast, and kicked things up a notch with the next generation of his family. Instead he chose to sqaunder his family's lineage up to the point where it gave life to him. 
What you'll do with the most important aspect of your life says something about you. Trovalds chose to throw it in the trash and tell his father and mother and his grandfather and grandmother and his ancestors generations past to go fk themselves basically. I suspect that says something about him, along with the other things. Calls for censorship ensue. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Discussion on Linus Trovalds claim of userspace cluelessness, spiralling into other analysis and choice of women. Also systemd.
--- The truth stands alone. Some can't handle it and need it hidden. As I said. Software people, western men in general, are weak and proud to be so. (Except when it comes to bombing cultures that keep the old ways and old ideas: western men are "strong" at that: You show the same tendency, but you don't have bombs, only the censors mark) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
On Sat, 2014-10-18 at 08:28 -0700, Linus Torvalds wrote:
> On Fri, Oct 17, 2014 at 11:54 PM, Mike Galbraith wrote:
> >
> > The barrier fixing up my problematic box smells a lot like evidence.
>
> Is this a "tested-by"? Did you actually verify that the patch ends up
> fixing the problem you saw?

Yup, it definitely fixed it up. Weird that it didn't show at all on the
2 socket 20 core box the problem I was looking into was reported on, but
was 100% busted on the similar 2 socket 28 core box I borrowed to have a
peek, and only that box. My (crippled/slow) 64 core DL980 was perfectly
happy, as were my 3 home boxen, not a peep from anywhere but that one
28 core box. Hohum..

Tested-by: Mike Galbraith

-Mike
Re: [PATCH 2/4] mm: introduce new VM_NOZEROPAGE flag
On 10/18/2014 07:49 AM, Dominik Dingel wrote:
> On Fri, 17 Oct 2014 15:04:21 -0700
> Dave Hansen wrote:
>> Is there ever a time where the VMAs under an mm have mixed VM_NOZEROPAGE
>> status? Reading the patches, it _looks_ like it might be an all or
>> nothing thing.
>
> Currently it is an all or nothing thing, but for a future change we might
> want to just
> tag the guest memory instead of the complete user address space.

I think it's a bad idea to reserve a flag for potential future use. If
you _need_ it in the future, let's have the discussion then. For now, I
think it should probably just be stored in the mm somewhere.

>> Full disclosure: I've got an x86-specific feature I want to steal a flag
>> for. Maybe we should just define another VM_ARCH bit.
>>
>
> So you think of something like:
>
> #if defined(CONFIG_S390)
> # define VM_NOZEROPAGE VM_ARCH_1
> #endif
>
> #ifndef VM_NOZEROPAGE
> # define VM_NOZEROPAGE VM_NONE
> #endif
>
> right?

Yeah, something like that.
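The per-architecture alias discussed above can be tried out in a standalone sketch. The numeric values and the `DEMO_S390` build switch are assumptions made for this illustration (the thread only names `CONFIG_S390`, `VM_ARCH_1` and `VM_NONE`), and `demo_forbids_zero_page()` is a hypothetical helper.

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel's flag definitions. */
#define VM_NONE   0x00000000UL
#define VM_ARCH_1 0x01000000UL

/* The architecture bit must be a single bit for the alias to make sense. */
static_assert((VM_ARCH_1 & (VM_ARCH_1 - 1)) == 0, "VM_ARCH_1 must be one bit");

/* Build with -DDEMO_S390 to model an architecture that uses the flag. */
#if defined(DEMO_S390)
# define VM_NOZEROPAGE VM_ARCH_1
#endif

/* Everywhere else the flag collapses to "no bits set". */
#ifndef VM_NOZEROPAGE
# define VM_NOZEROPAGE VM_NONE
#endif

/* With VM_NOZEROPAGE == VM_NONE this test is always false. */
static int demo_forbids_zero_page(unsigned long vm_flags)
{
	return (vm_flags & VM_NOZEROPAGE) != 0;
}
```

On builds where the macro is `VM_NONE`, the check can never fire and the compiler is free to drop it, so other architectures pay nothing for the flag.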
Re: [PATCH] net: m_can: add CONFIG_HAS_IOMEM dependence
On Wed, Oct 15, 2014 at 02:41:50PM -0700, David Cohen wrote: > m_can uses io memory which makes it not compilable on architectures > without HAS_IOMEM such as UML: > > drivers/built-in.o: In function `m_can_plat_probe': > m_can.c:(.text+0x218cc5): undefined reference to `devm_ioremap_resource' > m_can.c:(.text+0x218df9): undefined reference to `devm_ioremap' > > Signed-off-by: David Cohen Applied to the can tree. Thanks, Marc -- Pengutronix e.K. | | Industrial Linux Solutions | http://www.pengutronix.de/ | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[GIT PULL] NTB bug fixes for v3.18
Hi Linus, Below are a few NTB changes for v3.18. Please consider pulling them. Thanks, Jon The following changes since commit bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9: Linux 3.17 (2014-10-05 12:23:04 -0700) are available in the git repository at: git://github.com/jonmason/ntb tags/ntb-3.18 for you to fetch changes up to ab760a0c5667519b375ea9c5ab3a23501c4817ef: ntb: Adding split BAR support for Haswell platforms (2014-10-17 07:08:51 -0400) Add support for Haswell NTB split BARs, a debugfs entry for basic debugging info, and some code clean-ups. Dave Jiang (4): ntb: move platform detection to separate function ntb: conslidate reading of PPD to move platform detection earlier ntb: use errata flag set via DID to implement workaround ntb: Adding split BAR support for Haswell platforms Jon Mason (1): NTB: debugfs device entry drivers/ntb/ntb_hw.c | 567 +++-- drivers/ntb/ntb_hw.h | 19 +- drivers/ntb/ntb_regs.h | 38 +++- 3 files changed, 501 insertions(+), 123 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] net: can: esd_usb2: fix memory leak on disconnect
On Sat, Oct 11, 2014 at 12:31:07AM +0400, Alexey Khoroshilov wrote: > It seems struct esd_usb2 dev is not deallocated on disconnect. > > The patch adds the deallocation. > > Found by Linux Driver Verification project (linuxtesting.org). > > Signed-off-by: Alexey Khoroshilov Applied to the can tree. Thanks, Marc -- Pengutronix e.K. | | Industrial Linux Solutions | http://www.pengutronix.de/ | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] staging: lustre: lnet: lnet: do not initialise 0
Hello.

On 10/18/2014 8:03 AM, Balavasu wrote:

> This patch fixes the checkpatch.pl issue
> Error:do not initialise statics to 0 or NULL

> Signed-off-by: Balavasu
> ---
>  drivers/staging/lustre/lnet/lnet/router.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

> diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
> index b5b8fb5..b6edf1f 100644
> --- a/drivers/staging/lustre/lnet/lnet/router.c
> +++ b/drivers/staging/lustre/lnet/lnet/router.c
[...]
> @@ -80,7 +80,7 @@ lnet_peer_buffer_credits(lnet_ni_t *ni)
>
>  #endif
>
> -static int check_routers_before_use = 0;
> +static int check_routers_before_use = {0};

   Eh? I thought {} is only for arrays/structures...

[...]

WBR, Sergei
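A short standalone example may help with the question raised above. Standard C allows a scalar initializer to be wrapped in braces, so `= {0}` compiles, but it still initializes the static explicitly. Objects with static storage duration already start out as zero, so dropping the initializer altogether is the usual way to address that checkpatch message. The `demo_` helper and `braced_scalar` are names invented for this sketch.

```c
#include <assert.h>

/* No initializer: static storage is zero-initialized by the language. */
static int check_routers_before_use;

/* Legal C, but unusual for a scalar: the braces add nothing here. */
static int braced_scalar = {0};

/* Confirms that both forms end up with the same starting value. */
static void demo_check_defaults(void)
{
	assert(check_routers_before_use == 0);
	assert(braced_scalar == 0);
}
```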
Re: [PATCH] iio: inkern: Add of_xlate function to struct iio_dev
On Sat, 2014-10-18 at 12:42 +0100, Jonathan Cameron wrote: > On 02/10/14 13:32, Ivan T. Ivanov wrote: > > When #iio-cells is greater than '0', the driver could provide > > a custom of_xlate function that reads the *args* and returns > > the appropriate index in registered IIO channels array. > > > > Add simple translation function, suitable for the most 1:1 > > mapped channels in IIO chips, and use it when driver did not > > provide custom implementation. > > > > Signed-off-by: Ivan T. Ivanov > Any more comments on this? Been sat a while and the discussions seems > to have died out. > > As Ivan has pointed out, very similar approaches are used > elsewhere (gpio for example). Looks useful to me. Thanks, Srinivas > > > --- > > drivers/iio/inkern.c| 32 +++- > > include/linux/iio/iio.h | 8 > > 2 files changed, 35 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c > > index f084610..6c3e478 100644 > > --- a/drivers/iio/inkern.c > > +++ b/drivers/iio/inkern.c > > @@ -100,6 +100,28 @@ static int iio_dev_node_match(struct device *dev, void > > *data) > > return dev->of_node == data && dev->type == &iio_device_type; > > } > > > > +/** > > + * __of_iio_simple_xlate - translate iiospec to the IIO channel index > > + * @indio_dev: pointer to the iio_dev structure > > + * @iiospec: IIO specifier as found in the device tree > > + * > > + * This is simple translation function, suitable for the most 1:1 mapped > > + * channels in IIO chips. This function performs only one sanity check: > > + * whether IIO index is less than num_channels (that is specified in the > > + * iio_dev). 
> > + */ > > +static int __of_iio_simple_xlate(struct iio_dev *indio_dev, > > + const struct of_phandle_args *iiospec) > > +{ > > + if (!iiospec->args_count) > > + return 0; > > + > > + if (iiospec->args[0] >= indio_dev->num_channels) > > + return -EINVAL; > > + > > + return iiospec->args[0]; > > +} > > + > > static int __of_iio_channel_get(struct iio_channel *channel, > > struct device_node *np, int index) > > { > > @@ -122,18 +144,18 @@ static int __of_iio_channel_get(struct iio_channel > > *channel, > > > > indio_dev = dev_to_iio_dev(idev); > > channel->indio_dev = indio_dev; > > - index = iiospec.args_count ? iiospec.args[0] : 0; > > - if (index >= indio_dev->num_channels) { > > - err = -EINVAL; > > + if (!indio_dev->of_xlate) > > + indio_dev->of_xlate = __of_iio_simple_xlate; > > + index = indio_dev->of_xlate(indio_dev, &iiospec); > > + if (index < 0) > > goto err_put; > > - } > > channel->channel = &indio_dev->channels[index]; > > > > return 0; > > > > err_put: > > iio_device_put(indio_dev); > > - return err; > > + return index; > > } > > > > static struct iio_channel *of_iio_channel_get(struct device_node *np, int > > index) > > diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h > > index 15dc6bc..d5bb219 100644 > > --- a/include/linux/iio/iio.h > > +++ b/include/linux/iio/iio.h > > @@ -13,6 +13,7 @@ > > #include > > #include > > #include > > +#include > > /* IIO TODO LIST */ > > /* > > * Provide means of adjusting timer accuracy. > > @@ -413,6 +414,11 @@ struct iio_buffer_setup_ops { > > * @currentmode: [DRIVER] current operating mode > > * @dev: [DRIVER] device structure, should be assigned a parent > > * and owner > > + * @of_xlate: [DRIVER] function pointer to obtain channel > > specifier > > + * index. When #iio-cells is greater than '0', the driver > > + * could provide a custom of_xlate function that reads > > + * the *args* and returns the appropriate index in > > + * registered IIO channels array. 
> > * @event_interface: [INTERN] event chrdevs associated with > > interrupt lines > > * @buffer:[DRIVER] any buffer present > > * @buffer_list: [INTERN] list of all buffers currently attached > > @@ -451,6 +457,8 @@ struct iio_dev { > > int currentmode; > > struct device dev; > > > > + int (*of_xlate)(struct iio_dev *indio_dev, > > + const struct of_phandle_args *iiospec); > > struct iio_event_interface *event_interface; > > > > struct iio_buffer *buffer; > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
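Editor's sketch: the patch description says a driver "could provide a custom of_xlate function that reads the *args*". The following userspace model shows what such a callback might look like next to the default 1:1 mapping. The mock_* structures, the channels_per_bank field, and the two-cell <bank offset> layout are hypothetical and not part of the patch. Only simple_xlate() and the fallback in lookup_channel() mirror the logic quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_MAX_ARGS 4

/* Hypothetical stand-ins for struct of_phandle_args and struct iio_dev. */
struct mock_phandle_args {
	int args_count;
	unsigned int args[MOCK_MAX_ARGS];
};

struct mock_iio_dev {
	int num_channels;
	int channels_per_bank;	/* assumption: only used by the custom xlate */
	int (*of_xlate)(struct mock_iio_dev *dev,
			const struct mock_phandle_args *spec);
};

/* Default 1:1 mapping, same checks as __of_iio_simple_xlate in the patch. */
static int simple_xlate(struct mock_iio_dev *dev,
			const struct mock_phandle_args *spec)
{
	if (!spec->args_count)
		return 0;
	if (spec->args[0] >= (unsigned int)dev->num_channels)
		return -EINVAL;
	return (int)spec->args[0];
}

/* Hypothetical custom mapping for a two-cell specifier: <bank offset>. */
static int two_cell_xlate(struct mock_iio_dev *dev,
			  const struct mock_phandle_args *spec)
{
	unsigned int index;

	if (spec->args_count != 2)
		return -EINVAL;
	/* Reject out-of-range cells before multiplying. */
	if (spec->args[0] >= (unsigned int)dev->num_channels ||
	    spec->args[1] >= (unsigned int)dev->channels_per_bank)
		return -EINVAL;
	index = spec->args[0] * (unsigned int)dev->channels_per_bank
		+ spec->args[1];
	if (index >= (unsigned int)dev->num_channels)
		return -EINVAL;
	return (int)index;
}

/* Mirrors the lookup in __of_iio_channel_get: fall back to the default. */
static int lookup_channel(struct mock_iio_dev *dev,
			  const struct mock_phandle_args *spec)
{
	assert(dev != NULL && spec != NULL);
	if (!dev->of_xlate)
		dev->of_xlate = simple_xlate;
	return dev->of_xlate(dev, spec);
}
```

A driver in this model would set of_xlate to two_cell_xlate before any consumer lookup. Leaving it NULL selects the default, as in the patch.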
Re: unaligned accesses in SLAB etc.
From: Meelis Roos Date: Fri, 17 Oct 2014 14:12:09 +0300 (EEST) > However, on top of mainline HEAD 3.17.0-09670-g0429fbc it explodes with > scheduler BUG - just reported to LKML + sched maintainers. task_stack_end_corrupted() cannot work properly on sparc64. It stores the magic value at "task_thread_info(p) + 1", but on sparc64 that's where we store the nested array of FPU register saves. In fact this facility could be corrupting FPU register state in certain circumstances. The current sparc64 design is intentional: the CPU stack grows down toward the thread_info, and the FPU stack saving area grows up from the end of thread_info. I don't want to define the array size of the fpregs save area explicitly and thereby place an artificial limit there.
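Editor's sketch: the overlap described in this message can be modelled in userspace. The structure below is a hypothetical, heavily reduced stand-in for sparc64's thread_info (only two ordinary members are kept), and mock_end_of_stack() is an assumed name for "the first word after thread_info". It relies on the GCC/Clang aligned attribute, as the kernel header does.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced model of sparc64's struct thread_info. */
struct mock_thread_info {
	void *kern_una_regs;
	unsigned int kern_una_insn;
	/* Open-ended save area that starts right where the struct ends. */
	unsigned long fpregs[] __attribute__((aligned(64)));
};

/* Generic "end of stack" slot: the first word after thread_info. */
static unsigned long *mock_end_of_stack(struct mock_thread_info *ti)
{
	return (unsigned long *)(ti + 1);
}

/* Returns 1 when the magic slot and fpregs[0] share one address. */
static int magic_aliases_fpregs(struct mock_thread_info *ti)
{
	return mock_end_of_stack(ti) == &ti->fpregs[0];
}
```

With GCC or Clang the two addresses coincide in this model, so a magic value written to the "end of stack" slot would land in the first FPU save slot.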
Re: [PATCH 0/5] fuse: handle release synchronously (v4)
On Sat, Oct 18, 2014 at 5:40 PM, Linus Torvalds wrote: > On Sat, Oct 18, 2014 at 8:35 AM, Linus Torvalds > wrote: >> >> Look around for AIO. Look around for the loop driver. Look around for >> a number of things that do "fget()" and that you completely ignored. > > .. actually, there are more instances of "get_file()" than of > "fget()", the aio one just happened to be the latter form. Lots and > lots of ways to get ahold of a file descriptor that keeps it open past > the "last close". And what you don't get is that there's a deep difference between those and the /proc file access case. And the difference is that one is done because of an explicit action by the holder of the open file. And the other is done by some random process doing non-invasive examination of the holder of the open-file. So basically: we simply don't care if last close does not happen to release the file *iff* it was because of some explicit action that obviously has or could have such a side effect. Is that so hard to understand? In other words, we care about doing that last release synchronously if it provably is the last release of that file and happens to be done from close() (or munmap()). And then all your examples of loop driver and aio are pointless, because we *know* they will be holding onto that descriptor, the same as we know, that after dup(), close() will not release the file and the (non-IDIOTIX) locks together with the file. Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH resend] [media] rc-core: fix protocol_change regression in ir_raw_event_register
On Sat, 18 Oct 2014 13:10:01 +0300 Tomas Melin wrote: > On Thu, Oct 16, 2014 at 11:49 PM, David Härdeman wrote: > > I think this is already addressed in this thread: > > http://www.spinics.net/lists/linux-media/msg79865.html > The patch in that thread would have broken things since the > store_protocol function is not changed at the same time. The patch I > sent also takes that into account. > > My concern is still that user space behaviour changes. > In my case, lirc simply does not work anymore. Yeah, lirc should be enabled by default. > More generically, > anyone now using e.g. nuvoton-cir with anything other than RC6_MCE > will not get their devices working without first explicitly enabling > the correct protocol from sysfs or with ir-keytable. The right behavior here is to enable the protocol as soon as the new keycode table is written by userspace. Having anything enabled other than LIRC and the protocol of the current table is not a good idea because: 1) it can misread codes from some other IR; 2) it just spends power needlessly, running several tasks (one for each IR type) for no reason, as the keytable won't match the codes for other IRs (and if it is currently matching, then this is bad behavior). > Correct me if I'm wrong but the change_protocol function in struct > rc_dev is meant for changing hardware decoder protocols which means > only a few drivers actually use it. Actually, most drivers are for hardware decoders. > So the added empty function > change_protocol into rc-ir-raw.c doesn't really make sense in the first > place. > > Tomas -- Cheers, Mauro
Re: [PATCH 0/5] fuse: handle release synchronously (v4)
On Sat, Oct 18, 2014 at 08:40:05AM -0700, Linus Torvalds wrote: > On Sat, Oct 18, 2014 at 8:35 AM, Linus Torvalds > wrote: > > > > Look around for AIO. Look around for the loop driver. Look around for > > a number of things that do "fget()" and that you completely ignored. > > .. actually, there are more instances of "get_file()" than of > "fget()", the aio one just happened to be the latter form. Lots and > lots of ways to get ahold of a file descriptor that keeps it open past > the "last close". FWIW, procfs patch touches a very annoying issue: ->show_fdinfo() being blocking. I would really like to get rid of that particular get_file() and even more so - of get_files_struct() in there. I certainly agree that anyone who expects that close() means the end of IO is completely misguided. Mappings don't disappear on close(), neither does a descriptor returned by dup(), or one that child got over fork(), or something sent over in SCM_RIGHTS datagram, or, as you suggested, made backing store for /dev/loop, etc. What's more, in the example given upthread, somebody might've spotted that file in /proc//fd/* and *opened* it. At which point umount would have to fail with EBUSY. And the same lsof(8) might've done just that. It's not a matter of correctness or security, especially since somebody who could do that, could've stopped your process, PTRACE_POKEd a fairly short series of syscalls that would connect to AF_UNIX socket, send the file over to them and clean after itself, then single-stepped through all of that, restored the original state and resumed your process. It is a QoI matter, though. And get_files_struct() in there is a lot more annoying than get_file()/fput(). Suppose you catch the process during exit(). All of a sudden, read from /proc//fdinfo/ ends up doing shitloads of filp_close(). It would be nice to avoid that. Folks, how much pain would it be to make ->show_fdinfo() non-blocking? 
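Editor's sketch: one of the cases listed in this message, a descriptor returned by dup() outliving close() of the original, is easy to check from userspace. The helper name dup_survives_close() is made up for illustration, and a pipe is used instead of a regular file only to keep the example self-contained.

```c
#include <unistd.h>

/*
 * Returns 1 if a duplicate descriptor is still usable after the
 * original has been closed, 0 if not, -1 on setup failure.
 */
static int dup_survives_close(void)
{
	int fds[2];
	int copy;
	char c = 'x', got = 0;
	int ok;

	if (pipe(fds) < 0)
		return -1;

	copy = dup(fds[1]);
	if (copy < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	/* Closing the original does not release the open file. */
	close(fds[1]);

	/* The write end is still alive through the duplicate. */
	ok = write(copy, &c, 1) == 1 &&
	     read(fds[0], &got, 1) == 1 &&
	     got == 'x';

	close(copy);
	close(fds[0]);
	return ok ? 1 : 0;
}
```

The same reasoning applies to the other holders named in the message (fork, SCM_RIGHTS, mappings), which this sketch does not exercise.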
Re: unaligned accesses in SLAB etc.
From: David Miller
Date: Sat, 18 Oct 2014 13:59:07 -0400 (EDT)

> I don't want to define the array size of the fpregs save area
> explicitly and thereby placing an artificial limit there.

Nevermind, it seems we have a hard limit of 7 FPU save areas anyways.

Meelis, please try this patch:

diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
index f85dc85..cc6275c 100644
--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -63,7 +63,8 @@ struct thread_info {
 	struct pt_regs		*kern_una_regs;
 	unsigned int		kern_una_insn;
 
-	unsigned long		fpregs[0] __attribute__ ((aligned(64)));
+	unsigned long		fpregs[(7 * 256) / sizeof(unsigned long)]
+		__attribute__ ((aligned(64)));
 };
 
 #endif /* !(__ASSEMBLY__) */
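Editor's sketch: the arithmetic behind the new array bound, reading the patch's "7 * 256" as seven save areas of 256 bytes each (that reading is an assumption, as are the macro and structure names). Once the array has a real size, sizeof() covers it, so the word after the structure no longer overlaps the save area.

```c
#include <stddef.h>

#define FPU_SAVE_AREAS 7	/* hard limit mentioned in the message */
#define FPU_AREA_BYTES 256	/* assumed per-area size, from "7 * 256" */

/* Hypothetical, reduced model of thread_info after the patch. */
struct mock_thread_info_fixed {
	void *kern_una_regs;
	unsigned int kern_una_insn;
	unsigned long fpregs[(FPU_SAVE_AREAS * FPU_AREA_BYTES) /
			     sizeof(unsigned long)]
		__attribute__((aligned(64)));
};

/* Total size of the save area in bytes: 7 * 256 = 1792. */
static size_t fpregs_bytes(void)
{
	return sizeof(((struct mock_thread_info_fixed *)0)->fpregs);
}

/* Returns 1 when the word after the struct lies beyond the save area. */
static int magic_past_fpregs(struct mock_thread_info_fixed *ti)
{
	unsigned long *end = (unsigned long *)(ti + 1);
	unsigned long *fp_end = ti->fpregs +
		sizeof(ti->fpregs) / sizeof(ti->fpregs[0]);

	return end >= fp_end;
}
```

On a 64-bit build this gives 224 array elements. The byte count is 1792 regardless of the width of unsigned long.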
Re: [PATCH 0/5] fuse: handle release synchronously (v4)
On Sat, Oct 18, 2014 at 08:01:13PM +0200, Miklos Szeredi wrote: > And what you don't get is that there's a deep difference between those > and the /proc file access case. > > And the difference is that one is done because of an explicit action > by the holder of the open file. And the other is done by some random > process doing non-invasive examination of the holder of the open-file. Such as ls -l /proc/*/fd/*? That will give you the same transient EBUSY from umount, if it hits the right moment...
Re: [PATCH 0/5] fuse: handle release synchronously (v4)
On Sat, Oct 18, 2014 at 11:01 AM, Miklos Szeredi wrote: > > And what you don't get is that there's a deep difference between those > and the /proc file access case. No there isn't. Your "action by the holder" argument is pure and utter garbage, for a very simple and core reason: the *filesystem* doesn't know or care. So from a fuse standpoint, the difference is totally and entirely irrelevant. Your "synchronous fput" fails. It fails totally regardless of that "who incremented the file user count" issue. Face it, your patch is broken. And it's *fundamentally* broken, which is why I'm so tired of your stupid ad-hoc hacks that cannot possibly work. Your umount EBUSY case is somewhat relevant to the "who holds the file open", but quite frankly, that's not a filesystem issue, it's more of a system management issue. So umount might get EBUSY. That's not something the filesystem should/could care about, that's a MIS issue that is entirely irrelevant, and the answer is "if somebody does lsof, that might keep the filesystem busy". Tough. Linus
Re: [PATCH 0/5] fuse: handle release synchronously (v4)
On Sat, Oct 18, 2014 at 07:24:54PM +0100, Al Viro wrote: > On Sat, Oct 18, 2014 at 08:01:13PM +0200, Miklos Szeredi wrote: > > > And what you don't get is that there's a deep difference between those > > and the /proc file access case. > > > > And the difference is that one is done because of an explicit action > > by the holder of the open file. And the other is done by some random > > process doing non-invasive examination of the holder of the open-file. > > Such as ls -l /proc/*/fd/*? That will give you the same transient EBUSY > from umount, if it hits the right moment... stat -L /proc/*/fd/*, that is...
Re: [alsa-devel] [PATCH v2 5/6] sound/usb: pcm changes to use media token api
Em Thu, 16 Oct 2014 08:59:29 -0600 Shuah Khan escreveu: > On 10/16/2014 08:48 AM, Takashi Iwai wrote: > > At Thu, 16 Oct 2014 08:39:14 -0600, > > Shuah Khan wrote: > >> > >> On 10/16/2014 08:16 AM, Takashi Iwai wrote: > >>> At Thu, 16 Oct 2014 08:10:52 -0600, > >>> Shuah Khan wrote: > > On 10/16/2014 08:01 AM, Takashi Iwai wrote: > > At Thu, 16 Oct 2014 07:10:37 -0600, > > Shuah Khan wrote: > >> > >> On 10/16/2014 06:00 AM, Lars-Peter Clausen wrote: > >>> On 10/14/2014 04:58 PM, Shuah Khan wrote: > >>> [...] > switch (cmd) { > case SNDRV_PCM_TRIGGER_START: > +err = media_get_audio_tkn(&subs->dev->dev); > +if (err == -EBUSY) { > +dev_info(&subs->dev->dev, "%s device is busy\n", > +__func__); > >>> > >>> In my opinion this should not dev_info() as this is out of band error > >>> signaling and also as the potential to spam the log. The userspace > >>> application is already properly notified by the return code. > >>> > >> > >> Yes it has the potential to flood the dmesg especially with alsa, > >> I will remove the dev_info(). > > > > Yes. And, I think doing this in the trigger isn't the best. > > Why not in open & close? > > My first cut of this change was in open and close. I saw pulseaudio > application go into this loop trying to open the device. To avoid > such problems, I went with trigger stat and stop. That made all the > pulseaudio continues attempts to open problems go away. > >>> > >>> But now starting the stream gives the error, and PA would loop it > >>> again, no? > >>> > >>> > > Also, is this token restriction needed only for PCM? No mixer or > > other controls? > > snd_pcm_ops are the only ones media drivers implement and use. So > I don't think mixer is needed. If it is needed, it is not to hard > to add for that case. > >>> > >>> Well, then I wonder what resource does actually conflict with > >>> usb-audio and media drivers at all...? > >>> > >> > >> audio for dvb/v4l apps gets disrupted when audio app starts. 
For > >> example, dvb or v4l app tuned to a channel, and when an audio app > >> starts. audio path needs protected to avoid conflicts between > >> digital and analog applications as well. > > > > OK, then concentrating on only PCM is fine. > > > > But, I'm still not convinced about doing the token management in the > > trigger. The reason -EBUSY doesn't work is that it's the very same > > error code when a PCM device is blocked by other processes. And > > -EAGAIN is interpreted by PCM core to -EBUSY for historical reasons. > > ah. ok your recommendation is to go with open and close. > Mauro has some reservations with holding at open when I discussed > my observations with pulseaudio when I was holding token in open > instead of trigger start. Maybe he can chime with his concerns. > I think his concern was breaking applications if token is held in > open(). Yes. My concern is that PA has weird behaviors, and it tries to open and keep opened all audio devices. Is there a way for avoiding it to keep doing it for V4L devices? > > Based on what you are seeing trigger could be worse. > > > > > How applications (e.g. PA) should behave if the token get fails? > > Shouldn't it retry or totally give up? > > > > It would be up to the application I would think. I see that arecord > quits right away when it finds the device busy. pluseaudio on the other > hand appears to retry. I downloaded pulseaudio sources to understand > what it is doing, however I didn't get too far. The way it does audio > handling is complex for me to follow without spending a lot of time. > > thanks, > -- Shuah > -- Cheers, Mauro -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
On 10/17/14 11:38, Catalin Marinas wrote: > Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's > nothing to wake up) changes the futex code to avoid taking a lock when > there are no waiters. This code has been subsequently fixed in commit > 11d4616bd07f (futex: revert back to the explicit waiter counting code). > Both the original commit and the fix-up rely on get_futex_key_refs() to > always imply a barrier. > > However, for private futexes, none of the cases in the switch statement > of get_futex_key_refs() would be hit and the function completes without > a memory barrier as required before checking the "waiters" in > futex_wake() -> hb_waiters_pending(). The consequence is a race with a > thread waiting on a futex on another CPU, allowing the waker thread to > read "waiters == 0" while the waiter thread to have read "futex_val == > locked" (in kernel). Verified that this is: a) how it is documented to work b) not how it actually works currently Nice catch indeed. ... > diff --git a/kernel/futex.c b/kernel/futex.c > index 815d7af2ffe8..f3a3a071283c 100644 > --- a/kernel/futex.c > +++ b/kernel/futex.c > @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key) > case FUT_OFF_MMSHARED: > futex_get_mm(key); /* implies MB (B) */ > break; > + default: A comment here indicating this covers the PROCESS_PRIVATE futex case would be welcome, given the complexity involved. > + smp_mb(); /* explicit MB (B) */ Also, the "Basic" futex operation and ordering guarantees documentation currently reads: * Where (A) orders the waiters increment and the futex value read through * atomic operations (see hb_waiters_inc) and where (B) orders the write * to futex and the waiters read -- this is done by the barriers in * get_futex_key_refs(), through either ihold or atomic_inc, depending on the * futex type. Which is now incomplete (lacking the explicit smp_mb() added by this patch).
Perhaps the MB implementation of get_futex_key_refs() need not be explicitly enumerated here? -- Darren Hart Intel Open Source Technology Center
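Editor's sketch: the ordering requirement discussed in this message, barriers (A) and (B) from the quoted futex.c comment, can be written down with C11 atomics. This is a userspace model, not the kernel code: the variable and function names are made up, and seq_cst fences stand in for smp_mb(). The intent is that with both fences in place, a waiter that saw the futex as locked and a waker that saw zero waiters cannot both happen.

```c
#include <stdatomic.h>

static atomic_int futex_word;	/* Y in the kernel comment: 1 = locked */
static atomic_int waiters;	/* X in the kernel comment */

/* Test helper: put both variables into a known state. */
static void reset_state(int locked)
{
	atomic_store(&futex_word, locked);
	atomic_store(&waiters, 0);
}

/* Waiter side: returns 1 if it would go to sleep. */
static int waiter_would_sleep(void)
{
	atomic_fetch_add_explicit(&waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* barrier (A) */
	return atomic_load_explicit(&futex_word, memory_order_relaxed) == 1;
}

/* Waker side: returns 1 if it sees a waiter and would issue a wakeup. */
static int waker_sees_waiter(void)
{
	atomic_store_explicit(&futex_word, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* barrier (B) */
	return atomic_load_explicit(&waiters, memory_order_relaxed) > 0;
}
```

The bug described in the thread corresponds to dropping the fence marked (B) on the private-futex path, which would allow the waker's read of waiters to be ordered before its own store.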
Re: [PATCH v2] mtd: orion_nand: fix error code path in probe
On Thu, Oct 16, 2014 at 06:58:35AM +0200, Michael Opdenacker wrote: > This replaces kzalloc() and ioremap() calls by devm_ functions > in the probe() routine, which automatically release the corresponding > resources when probe() fails or when the device is removed. > > This simplifies the error management code, and brings > the below improvements or changes: > > A. Fixing a bug reported by "make coccicheck": > > If "board = devm_kzalloc()" fails, the probe() function jumps > incorrectly to label "no_res" and therefore returns without > running iounmap(). > > B. Requesting the memory region > > Using devm_ioremap_resource() makes the probe() function request > the corresponding memory region before running ioremap(), as > it is supposed to do. > > C. Standardizing the error codes: > > The use of devm_ioremap_resource() changes the return value: > * -ENOMEM instead of -EIO in case of ioremap() failure, > * -EINVAL instead of -ENODEV in case of platform_get_resource() > failure. > > Signed-off-by: Michael Opdenacker Acked-by: Andrew Lunn I wanted to test it, but I don't have easy access to a device using nand. All mine are SPI based :-( Andrew
Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
On Sat, Oct 18, 2014 at 12:33:27PM +0400, Kirill Tkhai wrote: > How about this? > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index b78280c..d46427e 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -1165,7 +1165,21 @@ static void task_numa_compare(struct task_numa_env > *env, > > rcu_read_lock(); > cur = ACCESS_ONCE(dst_rq->curr); > - if (cur->pid == 0) /* idle */ > + /* > + * No need to move the exiting task, and this ensures that ->curr > + * wasn't reaped and thus get_task_struct() in task_numa_assign() > + * is safe; note that rcu_read_lock() can't protect from the final > + * put_task_struct() after the last schedule(). > + */ > + if (is_idle_task(cur) || (cur->flags & PF_EXITING)) > + cur = NULL; > + /* > + * Check once again to be sure curr is still on dst_rq. Even if > + * it points on a new task, which is using the memory of freed > + * cur, it's OK, because we've locked RCU before > + * delayed_put_task_struct() callback is called to put its struct. > + */ > + if (cur != ACCESS_ONCE(dst_rq->curr)) > cur = NULL; > > /* So you worry about the refcount doing 0->1 ? In which case the above is still wrong and we should be using atomic_inc_not_zero() in order to acquire the reference count.
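Editor's sketch: the "increment unless zero" idea suggested in this reply, modelled with C11 atomics. The mock_task structure and the function name are hypothetical. The kernel's atomic_inc_not_zero() is the real primitive, and this userspace loop only illustrates the semantics (never resurrect a count that has already reached zero).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object with a reference count. */
struct mock_task {
	atomic_int usage;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool mock_get_task_unless_zero(struct mock_task *t)
{
	int old = atomic_load(&t->usage);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&t->usage, &old, old + 1))
			return true;
	}
	return false;	/* the object is already on its way to being freed */
}
```

A plain increment would turn 0 into 1 on an object whose final put has already happened, which is the 0->1 transition the reply is worried about.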
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
On Sat, 2014-10-18 at 00:33 -0700, Davidlohr Bueso wrote: > On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote: > > Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's > > nothing to wake up) changes the futex code to avoid taking a lock when > > there are no waiters. This code has been subsequently fixed in commit > > 11d4616bd07f (futex: revert back to the explicit waiter counting code). > > Both the original commit and the fix-up rely on get_futex_key_refs() to > > always imply a barrier. > > > > However, for private futexes, none of the cases in the switch statement > > of get_futex_key_refs() would be hit and the function completes without > > a memory barrier as required before checking the "waiters" in > > futex_wake() -> hb_waiters_pending(). > > Good catch, glad I ran into this thread (my email recently changed). > Private process futex (PTHREAD_PROCESS_PRIVATE) have no reference on an > inode or mm so it would need the explicit barrier in those cases. And [get/put]_futex_keys() shouldn't even be called for private futexes. The following patch had some very minor testing on a 60 core box last night, but passes both Darren's and perf's tests. So I *think* this is right, but lack of sleep and I overall just don't trust them futexes! 8<-- From: Davidlohr Bueso Date: Sat, 18 Oct 2014 12:30:37 -0700 Subject: [PATCH 2/1] futex: No key referencing for private futexes Because private futexes do not hold references on either an inode or mm, they should not be calling key referencing operations (even though they are practically a nop). However, we need to call the get part only because we need the barrier in order to maintain correct ordering guarantees for the lockless waiter checking. In addition, we can avoid calling the put part for private futexes altogether, as it serves no purpose in the ordering. 
This patch 1) documents the situation, 2) explicitly avoids calling drop_futex_key_refs() when calling put_futex_keys() for private futexes and 3) changes the interface of the function to pass the 'fshared' variable, similarly to get_futex_key_refs(). In theory this should apply to all drop_futex_key_refs() callers, but just keep it simple and apply it as the get/put alternatives when calling futex(2). Signed-off-by: Davidlohr Bueso --- kernel/futex.c | 99 -- 1 file changed, 55 insertions(+), 44 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index 815d7af..21f7e41 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -415,6 +415,11 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw) if (!fshared) { key->private.mm = mm; key->private.address = address; + /* +* Private futexes do not hold reference on an inode or +* mm, therefore the only purpose of calling get_futex_key_refs +* is because we need the barrier for the lockless waiter check. +*/ get_futex_key_refs(key); /* implies MB (B) */ return 0; } @@ -530,9 +535,14 @@ out: return err; } -static inline void put_futex_key(union futex_key *key) +static inline void put_futex_key(int fshared, union futex_key *key) { - drop_futex_key_refs(key); + /* +* See comment in get_futex_key() about key +* referencing when dealing with private futexes. 
+*/ + if (fshared) + drop_futex_key_refs(key); } /** @@ -1202,12 +1212,12 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset) struct futex_hash_bucket *hb; struct futex_q *this, *next; union futex_key key = FUTEX_KEY_INIT; - int ret; + int ret, fshared = flags & FLAGS_SHARED; if (!bitset) return -EINVAL; - ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_READ); + ret = get_futex_key(uaddr, fshared, &key, VERIFY_READ); if (unlikely(ret != 0)) goto out; @@ -1238,7 +1248,7 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset) spin_unlock(&hb->lock); out_put_key: - put_futex_key(&key); + put_futex_key(fshared, &key); out: return ret; } @@ -1254,13 +1264,13 @@ futex_wake_op(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2, union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT; struct futex_hash_bucket *hb1, *hb2; struct futex_q *this, *next; - int ret, op_ret; + int ret, op_ret, fshared = flags & FLAGS_SHARED; retry: - ret = get_futex_key(uaddr1, flags & FLAGS_SHARED, &key1, VERIFY_READ); + ret = get_futex_key(uaddr1, fshared, &key1, VERIFY_READ); if (unlikely(ret != 0)) goto out; - ret = get_futex_key(uaddr2, flags & FLAGS_SHARED
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
On Sat, 2014-10-18 at 14:32 -0500, Darren Hart wrote:
> Which is now incomplete (lacking the explicit smp_mb() added by this
> patch). Perhaps the MB implementation of get_futex_key_refs() need not be
> explicitly enumerated here?

Agreed, how about this:

diff --git a/kernel/futex.c b/kernel/futex.c
index 21f7e41..7a0805a 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -143,9 +143,8 @@
  *
  * Where (A) orders the waiters increment and the futex value read through
  * atomic operations (see hb_waiters_inc) and where (B) orders the write
- * to futex and the waiters read -- this is done by the barriers in
- * get_futex_key_refs(), through either ihold or atomic_inc, depending on the
- * futex type.
+ * to futex and the waiters read -- this is done by the barriers for both
+ * shared and private futexes in get_futex_key_refs().
  *
  * This yields the following case (where X:=waiters, Y:=futex):
  *
Re: [PATCH v11 04/21] mm: Allow page fault handlers to perform the COW
On Fri, Oct 17, 2014 at 03:35:01PM +, Mathieu Desnoyers wrote: > > > The page fault handler being very much performance sensitive, I'm > > > wondering if it would not be better to move cow_page near the end of > > > struct vm_fault, so that the "page" field can stay on the first > > > cache line. > > Although it's pretty much always true that recent architectures L2 cache > lines are 64 bytes, I was more thinking about L1 cache lines, which are, > at least on moderately old Intel Pentium HW, 32 bytes in size (AFAIK > Pentium II and III). > > It remains to be seen whether we care about performance that much on this > kind of HW though. Oh, I just remembered ... this data structure is on the stack, so if it's not cache-hot, something has gone horribly wrong.
Re: [PATCH v11 19/21] dax: Add dax_zero_page_range
On Fri, Oct 17, 2014 at 03:49:39PM +, Mathieu Desnoyers wrote: > > I kind of wonder if we shouldn't just declare the function. It's called > > like this: > > > > if (IS_DAX(inode)) > > return dax_zero_page_range(inode, from, length, > > ext4_get_block); > > return __ext4_block_zero_page_range(handle, mapping, from, length); > > > > and if CONFIG_DAX is not set, IS_DAX evaluates to 0 at compile time, so > > the compiler will optimise out the call to dax_zero_page_range() anyway. > > I strongly prefer to implement "unimplemented stub" as static inlines > rather than defining to 0, because the compiler can check that the types > passed to the function are valid, even in the #else configuration which > uses the stubs. I think my explanation was unclear. This is what I meant: +++ b/include/linux/fs.h @@ -2473,7 +2473,6 @@ extern loff_t fixed_size_llseek(struct file *file, loff_t offset, extern int generic_file_open(struct inode * inode, struct file * filp); extern int nonseekable_open(struct inode * inode, struct file * filp); -#ifdef CONFIG_FS_DAX int dax_clear_blocks(struct inode *, sector_t block, long size); int dax_zero_page_range(struct inode *, loff_t from, unsigned len, get_block_t) ; int dax_truncate_page(struct inode *, loff_t from, get_block_t); #define dax_mkwrite(vma, vmf, gb) dax_fault(vma, vmf, gb) -#else -static inline int dax_clear_blocks(struct inode *i, sector_t blk, long sz) -{ - return 0; -} - -static inline int dax_truncate_page(struct inode *i, loff_t frm, get_block_t gb) -{ - return 0; -} - -static inline int dax_zero_page_range(struct inode *i, loff_t frm, - unsigned len, get_block_t gb) -{ - return 0; -} - -static inline ssize_t dax_do_io(int rw, struct kiocb *iocb, - struct inode *inode, struct iov_iter *iter, loff_t pos, - get_block_t get_block, dio_iodone_t end_io, int flags) -{ - return -ENOTTY; -} -#endif #ifdef CONFIG_BLOCK typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode, So after the preprocessor has run, the 
compiler will see: if (0) return dax_zero_page_range(inode, from, length, ext4_get_block); and it will still do type checking on the call, even though it will eliminate the call. I think what you're really complaining about is that the argument to IS_DAX() is not checked for being an inode. We could solve that this way: #ifdef CONFIG_FS_DAX #define S_DAX 8192 #else #define S_DAX 0 #endif ... #define IS_DAX(inode) ((inode)->i_flags & S_DAX) After preprocessing, the compiler then sees: if (((inode)->i_flags & 0)) return dax_zero_page_range(inode, from, length, ext4_get_block); and successfully deduces that the condition evaluates to 0, and still elides the reference to dax_zero_page_range (checked with 'nm').
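Editor's sketch: a userspace model of the S_DAX-as-zero idea from this message. All mock_* names and the return values 1 and 2 are hypothetical. Unlike the kernel case, where only a declaration of dax_zero_page_range() would be visible when DAX is off, a definition is provided here so that the sketch links at any optimisation level.

```c
/* Assumption: model the CONFIG_FS_DAX=n case, so the flag bit is 0. */
#define MOCK_S_DAX 0
#define MOCK_IS_DAX(inode) ((inode)->i_flags & MOCK_S_DAX)

struct mock_inode {
	unsigned int i_flags;
};

/* Stand-in for dax_zero_page_range(); returns 1 to mark the DAX path. */
static int mock_dax_zero_page_range(struct mock_inode *inode, long from,
				    unsigned int len)
{
	(void)inode;
	(void)from;
	(void)len;
	return 1;
}

/* Stand-in for the non-DAX fallback; returns 2 to mark that path. */
static int mock_block_zero_page_range(struct mock_inode *inode, long from,
				      unsigned int len)
{
	(void)inode;
	(void)from;
	(void)len;
	return 2;
}

static int mock_zero_range(struct mock_inode *inode, long from,
			   unsigned int len)
{
	/*
	 * The macro dereferences its argument, so passing something that
	 * is not an inode pointer fails to compile, and the call below is
	 * still type-checked even though the branch can never be taken.
	 */
	if (MOCK_IS_DAX(inode))
		return mock_dax_zero_page_range(inode, from, len);
	return mock_block_zero_page_range(inode, from, len);
}
```

Because the mask is the constant 0, the fallback is taken even when the flag bit happens to be set in i_flags.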
Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
On Sat, Oct 18, 2014 at 12:58 PM, Davidlohr Bueso wrote: > > And [get/put]_futex_keys() shouldn't even be called for private futexes. > The following patch had some very minor testing on a 60 core box last > night, but passes both Darren's and perf's tests. So I *think* this is > right, but lack of sleep and I overall just don't trust them futexes! Hmm. I don't see the advantage of making the code more complex in order to avoid the functions that are no-ops for the !fshared case? IOW, as far as I can tell, this patch doesn't actually really *change* anything. Am I missing something? Linus
[PATCH RFC] platform: hp_accel: add a i8042 filter to remove accelerometer data
Hello,

this patch fixes bug #84941 from the kernel bugzilla. Basically, it seems that the accelerometer sends some signals as button presses through the keyboard bus. The keys in the report are 0xa5-0xa8, but in the filter function they are reported as 0x25-0x28. This patch adds an i8042 filter that removes these scancodes from the keyboard stream, in a similar fashion to how idealpad_sidebar.c does this.

I've sent this as an RFC because I'm not sure whether there is a more portable way to do this, or whether these codes are the same on all machines. So could someone who uses this driver please respond and tell me which invalid keypresses (if any) atkbd reports in your `dmesg`? Also, I'm not sure whether there is publicly available documentation for HP 3D DriveGuard (I couldn't find any). That would definitely make it clear whether this patch is correct or not.

I'm adding a Signed-off-by line in case you find this patch good and want to apply it.

Signed-off-by: Giedrius Statkevičius
---
 drivers/platform/x86/hp_accel.c | 42 -
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/hp_accel.c b/drivers/platform/x86/hp_accel.c
index 13e14ec..b1bea8c 100644
--- a/drivers/platform/x86/hp_accel.c
+++ b/drivers/platform/x86/hp_accel.c
@@ -38,6 +38,8 @@
 #include
 #include
 #include "../../misc/lis3lv02d/lis3lv02d.h"
+#include
+#include

 #define DRIVER_NAME "hp_accel"
 #define ACPI_MDPS_CLASS "accelerometer"
@@ -73,6 +75,13 @@ static inline void delayed_sysfs_set(struct led_classdev *led_cdev,

 /* HP-specific accelerometer driver */

+/* Codes of signals that the accelerometer sends
+ * through the keyboard bus */
+#define ACCEL_1 0x25
+#define ACCEL_2 0x26
+#define ACCEL_3 0x27
+#define ACCEL_4 0x28
+
 /* For automatic insertion of the module */
 static const struct acpi_device_id lis3lv02d_device_ids[] = {
 	{"HPQ0004", 0}, /* HP Mobile Data Protection System PNP */
@@ -82,7 +91,6 @@ static const struct acpi_device_id lis3lv02d_device_ids[] = {
 };
 MODULE_DEVICE_TABLE(acpi, lis3lv02d_device_ids);

-
 /**
  * lis3lv02d_acpi_init - ACPI _INI method: initialize the device.
  * @lis3: pointer to the device struct
@@ -294,6 +302,32 @@ static void lis3lv02d_enum_resources(struct acpi_device *device)
 		printk(KERN_DEBUG DRIVER_NAME ": Error getting resources\n");
 }

+static bool hp_accel_i8042_filter(unsigned char data, unsigned char str,
+				  struct serio *port)
+{
+	static bool extended;
+
+	if (str & I8042_STR_AUXDATA)
+		return false;
+
+	if (data == 0xe0) {
+		extended = true;
+		return true;
+	}
+
+	if (!extended)
+		return false;
+
+	extended = false;
+	if (likely(data != ACCEL_1) && likely(data != ACCEL_2) &&
+	    likely(data != ACCEL_3) && likely(data != ACCEL_4)) {
+		serio_interrupt(port, 0xe0, 0);
+		return false;
+	}
+
+	return true;
+}
+
 static int lis3lv02d_add(struct acpi_device *device)
 {
 	int ret;
@@ -326,6 +360,11 @@ static int lis3lv02d_add(struct acpi_device *device)
 	if (ret)
 		return ret;

+	/* filter to remove accelerometer data from keyboard bus stream */
+	ret = i8042_install_filter(hp_accel_i8042_filter);
+	if (ret)
+		i8042_remove_filter(hp_accel_i8042_filter);
+
 	INIT_WORK(&hpled_led.work, delayed_set_status_worker);
 	ret = led_classdev_register(NULL, &hpled_led.led_classdev);
 	if (ret) {
@@ -343,6 +382,7 @@ static int lis3lv02d_remove(struct acpi_device *device)
 	if (!device)
 		return -EINVAL;

+	i8042_remove_filter(hp_accel_i8042_filter);
 	lis3lv02d_joystick_disable(&lis3_dev);
 	lis3lv02d_poweroff(&lis3_dev);

--
2.1.2
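For readers who want to experiment with the filtering logic outside the kernel, here is a minimal userspace sketch of the same state machine. It is only a model under stated assumptions: `filter_stream()` and the `EXT_PREFIX`, `ACCEL_FIRST` and `ACCEL_LAST` names are hypothetical and do not appear in the patch, the AUX-data check is left out, and re-emitting the held 0xe0 into the output buffer stands in for the `serio_interrupt()` call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXT_PREFIX  0xe0	/* extended-scancode prefix, as in the patch */
#define ACCEL_FIRST 0x25	/* ACCEL_1 in the patch */
#define ACCEL_LAST  0x28	/* ACCEL_4 in the patch */

/*
 * Hypothetical helper (not part of the patch): copy `in` to `out`, dropping
 * every 0xe0-prefixed accelerometer code. `out` must have room for `n`
 * bytes. Returns the number of bytes kept.
 */
static size_t filter_stream(const unsigned char *in, size_t n,
			    unsigned char *out)
{
	bool extended = false;
	size_t kept = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		unsigned char data = in[i];

		if (data == EXT_PREFIX) {
			/* Hold the prefix back until the next byte is seen. */
			extended = true;
			continue;
		}
		if (!extended) {
			out[kept++] = data;
			continue;
		}
		extended = false;
		if (data < ACCEL_FIRST || data > ACCEL_LAST) {
			/* Ordinary extended key: put the held prefix back. */
			out[kept++] = EXT_PREFIX;
			out[kept++] = data;
		}
		/* Otherwise prefix and code are both dropped. */
	}
	return kept;
}
```

Feeding it the bytes `1e e0 25 e0 48 9e e0 28` leaves `1e e0 48 9e`: both accelerometer sequences disappear, while the ordinary extended key `e0 48` survives intact.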
Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
On 10/18, Kirill Tkhai wrote: > > 18.10.2014, 01:40, "Oleg Nesterov" : > > ... > > The > > task_struct itself can't go away, > > ... > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -1158,7 +1158,13 @@ static void task_numa_compare(struct task_numa_env > > *env, > > > > rcu_read_lock(); > > cur = ACCESS_ONCE(dst_rq->curr); > > - if (cur->pid == 0) /* idle */ > > + /* > > + * No need to move the exiting task, and this ensures that ->curr > > + * wasn't reaped and thus get_task_struct() in task_numa_assign() > > + * is safe; note that rcu_read_lock() can't protect from the final > > + * put_task_struct() after the last schedule(). > > + */ > > + if (is_idle_task(cur) || (cur->flags & PF_EXITING)) > > cur = NULL; > > > > /* > > Oleg, I've looked once again, and now it's not good for me. Ah. Thanks a lot Kirill for correcting me! I was looking at this rcu_read_lock() and I didn't even try to think what it can actually protect. Nothing. > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -1165,7 +1165,21 @@ static void task_numa_compare(struct task_numa_env > *env, >^^ > rcu_read_lock(); > cur = ACCESS_ONCE(dst_rq->curr); > - if (cur->pid == 0) /* idle */ > + /* > + * No need to move the exiting task, and this ensures that ->curr > + * wasn't reaped and thus get_task_struct() in task_numa_assign() > + * is safe; note that rcu_read_lock() can't protect from the final > + * put_task_struct() after the last schedule(). > + */ > + if (is_idle_task(cur) || (cur->flags & PF_EXITING)) > + cur = NULL; > + /* > + * Check once again to be sure curr is still on dst_rq. Even if > + * it points on a new task, which is using the memory of freed > + * cur, it's OK, because we've locked RCU before > + * delayed_put_task_struct() callback is called to put its struct. > + */ > + if (cur != ACCESS_ONCE(dst_rq->curr)) No, I don't think this can work. 
Let's look at the current code:

	rcu_read_lock();
	cur = ACCESS_ONCE(dst_rq->curr);
	if (cur->pid == 0) /* idle */

Any dereference here, even reading ->pid, is not safe. This memory can be freed, unmapped, reused, etc.

It looks like task_numa_compare() needs to take dst_rq->lock and get the reference first.

Or, perhaps, we need to change the rules to ensure that any "task_struct *" pointer is RCU-safe. Perhaps we have more similar problems... I'd like to avoid this if possible.

Hmm. I'll try to think more.

Thanks!

Oleg.
Re: [PATCH v2] mmc: dw_mmc: Reset DMA before enabling IDMAC
On Fri, Oct 17, 2014 at 1:26 AM, Jaehoon Chung wrote:
> Hi, Sonny.
>
> On 10/17/2014 01:58 AM, Sonny Rao wrote:
>> We've already got a reset of DMA after it's done. Add one before we
>> start DMA too. This fixes a data corruption on Rockchip SoCs which
>> will get bad data when doing a DMA transfer after doing a PIO transfer.
>>
>> We tested this on an Exynos 5800 with HS200 and didn't notice any
>> difference in sequential read throughput.
>
> Didn't affect the write throughput?

Write is usually much slower than read, but I went ahead and re-tested and saw no difference on writes.

> I tested this on exynos3/4 with DDR50 and HS200.
>
> Acked-by: Jaehoon Chung
> Tested-by: Jaehoon Chung
>
>>
>> Signed-off-by: Sonny Rao
>> Signed-off-by: Doug Anderson
>> Tested-by: Doug Anderson
>> ---
>> drivers/mmc/host/dw_mmc.c | 5 +
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
>> index 69f0cc6..ca67f69 100644
>> --- a/drivers/mmc/host/dw_mmc.c
>> +++ b/drivers/mmc/host/dw_mmc.c
>> @@ -83,6 +83,7 @@ struct idmac_desc {
>> #endif /* CONFIG_MMC_DW_IDMAC */
>>
>> static bool dw_mci_reset(struct dw_mci *host);
>> +static bool dw_mci_ctrl_reset(struct dw_mci *host, u32 reset);
>>
>> #if defined(CONFIG_DEBUG_FS)
>> static int dw_mci_req_show(struct seq_file *s, void *v)
>> @@ -448,6 +449,10 @@ static void dw_mci_idmac_start_dma(struct dw_mci *host, unsigned int sg_len)
>>
>> 	dw_mci_translate_sglist(host, host->data, sg_len);
>>
>> +	/* Make sure to reset DMA in case we did PIO before this */
>> +	dw_mci_ctrl_reset(host, SDMMC_CTRL_DMA_RESET);
>> +	dw_mci_idmac_reset(host);
>> +
>> 	/* Select IDMAC interface */
>> 	temp = mci_readl(host, CTRL);
>> 	temp |= SDMMC_CTRL_USE_IDMAC;
>>
>
Re: [PATCH v11 19/21] dax: Add dax_zero_page_range
- Original Message - > From: "Matthew Wilcox" > To: "Mathieu Desnoyers" > Cc: "Matthew Wilcox" , "Matthew Wilcox" > , > linux-fsde...@vger.kernel.org, linux...@kvack.org, > linux-kernel@vger.kernel.org, "Ross Zwisler" > > Sent: Saturday, October 18, 2014 7:41:00 PM > Subject: Re: [PATCH v11 19/21] dax: Add dax_zero_page_range > > On Fri, Oct 17, 2014 at 03:49:39PM +, Mathieu Desnoyers wrote: > > > I kind of wonder if we shouldn't just declare the function. It's called > > > like this: > > > > > > if (IS_DAX(inode)) > > > return dax_zero_page_range(inode, from, length, > > > ext4_get_block); > > > return __ext4_block_zero_page_range(handle, mapping, from, > > > length); > > > > > > and if CONFIG_DAX is not set, IS_DAX evaluates to 0 at compile time, so > > > the compiler will optimise out the call to dax_zero_page_range() anyway. > > > > I strongly prefer to implement "unimplemented stub" as static inlines > > rather than defining to 0, because the compiler can check that the types > > passed to the function are valid, even in the #else configuration which > > uses the stubs. > > I think my explanation was unclear. 
This is what I meant: > > +++ b/include/linux/fs.h > @@ -2473,7 +2473,6 @@ extern loff_t fixed_size_llseek(struct file *file, > loff_t > offset, > extern int generic_file_open(struct inode * inode, struct file * filp); > extern int nonseekable_open(struct inode * inode, struct file * filp); > > -#ifdef CONFIG_FS_DAX > int dax_clear_blocks(struct inode *, sector_t block, long size); > int dax_zero_page_range(struct inode *, loff_t from, unsigned len, > get_block_t) > ; > int dax_truncate_page(struct inode *, loff_t from, get_block_t); > #define dax_mkwrite(vma, vmf, gb) dax_fault(vma, vmf, gb) > -#else > -static inline int dax_clear_blocks(struct inode *i, sector_t blk, long sz) > -{ > - return 0; > -} > - > -static inline int dax_truncate_page(struct inode *i, loff_t frm, get_block_t > gb) > -{ > - return 0; > -} > - > -static inline int dax_zero_page_range(struct inode *i, loff_t frm, > - unsigned len, get_block_t gb) > -{ > - return 0; > -} > - > -static inline ssize_t dax_do_io(int rw, struct kiocb *iocb, > - struct inode *inode, struct iov_iter *iter, loff_t pos, > - get_block_t get_block, dio_iodone_t end_io, int flags) > -{ > - return -ENOTTY; > -} > -#endif > > #ifdef CONFIG_BLOCK > typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode, > > > So after the preprocessor has run, the compiler will see: > > if (0) > return dax_zero_page_range(inode, from, length, ext4_get_block); > > and it will still do type checking on the call, even though it will eliminate > the call. > Indeed, since Linux is always compiled in O2 or Os, it will work. > I think what you're really complaining about is that the argument to > IS_DAX() is not checked for being an inode. > > We could solve that this way: > > #ifdef CONFIG_FS_DAX > #define S_DAX 8192 > #else > #define S_DAX 0 > #endif > ... 
> #define IS_DAX(inode) ((inode)->i_flags & S_DAX)
>
> After preprocessing, the compiler then sees:
>
> 	if (((inode)->i_flags & 0))
> 		return dax_zero_page_range(inode, from, length, ext4_get_block);
>
> and successfully deduces that the condition evaluates to 0, and still
> elides the reference to dax_zero_page_range (checked with 'nm').

Sounds good,

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
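The approach agreed on above can be tried in a small userspace sketch. Everything below is a stand-in rather than kernel code: `struct inode` is cut down to one field, the `get_block_t` argument is omitted, and `zero_page_range()`, `block_zero_page_range()`, `PATH_DAX`, `PATH_BLOCK` and the `dax_calls` counter are hypothetical names added for illustration. Unlike the kernel case, `dax_zero_page_range()` is defined here so the example links at any optimization level; the point it shows is that the call is still type-checked while the branch can never be taken when `S_DAX` is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: models CONFIG_FS_DAX being unset, so the flag is 0. */
#define S_DAX 0

/* Hypothetical, cut-down inode; the real struct inode has many more fields. */
struct inode {
	unsigned int i_flags;
};

#define IS_DAX(inode) ((inode)->i_flags & S_DAX)

/* Hypothetical return codes that tell the caller which path ran. */
enum { PATH_DAX = 1, PATH_BLOCK = 2 };

static int dax_calls;	/* counts calls into the DAX path */

static int dax_zero_page_range(struct inode *inode, long from, unsigned len)
{
	assert(inode != NULL);
	(void)from;
	(void)len;
	dax_calls++;
	return PATH_DAX;
}

static int block_zero_page_range(struct inode *inode, long from, unsigned len)
{
	assert(inode != NULL);
	(void)from;
	(void)len;
	return PATH_BLOCK;
}

/* Same shape as the ext4 call site quoted in the thread. */
static int zero_page_range(struct inode *inode, long from, unsigned len)
{
	if (IS_DAX(inode))	/* expands to ((inode)->i_flags & 0) */
		return dax_zero_page_range(inode, from, len);
	return block_zero_page_range(inode, from, len);
}
```

Passing something that is not a `struct inode *` to `IS_DAX()` or to `dax_zero_page_range()` now fails to compile even in this "disabled" configuration, which is the type checking Mathieu asked for.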
Re: [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
On 10/18, Peter Zijlstra wrote:
>
> So you worry about the refcount doing 0->1 ? In which case the above is
> still wrong and we should be using atomic_inc_not_zero() in order to
> acquire the reference count.

It is actually worse, please see my reply to Kirill. We simply can't dereference foreign_rq->curr locklessly.

Again, task_struct is only protected by RCU if it was found on an RCU-protected list. rq->curr is not protected by RCU. Perhaps we have to change this... but this will be a bit unfortunate.

Oleg.
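For reference, the "increment unless zero" idea Peter mentions can be sketched in userspace with C11 atomics. This is only an illustration: `struct obj` and `get_unless_zero()` are hypothetical names, not kernel API. It also shows why the idea is not enough in the case discussed here, because the helper has to read the counter first, and that read is already unsafe if the memory behind the pointer may have been freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object, standing in for a task's usage count. */
struct obj {
	atomic_int usage;
};

/*
 * Take a reference only if the count has not already dropped to zero.
 * Precondition (the part that does not hold for rq->curr): the memory
 * behind `o` must still be valid while this runs.
 */
static bool get_unless_zero(struct obj *o)
{
	assert(o != NULL);

	int old = atomic_load(&o->usage);

	while (old != 0) {
		/* On failure, `old` is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->usage, &old, old + 1))
			return true;
	}
	return false;
}
```

A live object (count 1) ends up with count 2 and the call returns true; an object whose count already reached 0 is left untouched and the call returns false.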
[PATCH] Arch: ia64: kernel: acpi: fixed a brace coding style issue.
Fixed a coding style issue.

Signed-off-by: Joseph Wessner
---
 arch/ia64/kernel/acpi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index 615ef81..12c032e 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -273,9 +273,8 @@ acpi_parse_plat_int_src(struct acpi_subtable_header * header,
 						IOSAPIC_EDGE : IOSAPIC_LEVEL);

 	platform_intr_list[plintsrc->type] = vector;
-	if (acpi_madt_rev > 1) {
+	if (acpi_madt_rev > 1)
 		acpi_cpei_override = plintsrc->flags & ACPI_MADT_CPEI_OVERRIDE;
-	}

 	/*
 	 * Save the physical id, so we can check when its being removed
--
2.1.2
Re: Machine crashes right *after* ~successful resume
On Thu, Oct 16, 2014 at 2:08 PM, Wilmer van der Gaast wrote: > Did that on this run, no difference either. For full completeness, I > reproduced this problem with no modules loaded (done from initramfs) at all, > with a kernel with your workaround included, logs are here: > http://gaast.net/~wilmer/.lkml/bad3.17-patched-debug-initramfs.txt Yes, those output are good. Please apply attached debug patch on top of v3.17 and boot with "debug ignore_loglevel initcall_debug no_console_suspend". Hope we can find out which nb notifier cause problem. Thanks Yinghai --- kernel/notifier.c |9 + kernel/power/main.c |4 +++- 2 files changed, 12 insertions(+), 1 deletion(-) Index: linux-2.6/kernel/power/main.c === --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -24,16 +24,18 @@ DEFINE_MUTEX(pm_mutex); /* Routines for PM-transition notifications */ -static BLOCKING_NOTIFIER_HEAD(pm_chain_head); +BLOCKING_NOTIFIER_HEAD(pm_chain_head); int register_pm_notifier(struct notifier_block *nb) { + pr_info("PM: registering nb %pF\n", nb->notifier_call); return blocking_notifier_chain_register(&pm_chain_head, nb); } EXPORT_SYMBOL_GPL(register_pm_notifier); int unregister_pm_notifier(struct notifier_block *nb) { + pr_info("PM: unregistering nb %pF\n", nb->notifier_call); return blocking_notifier_chain_unregister(&pm_chain_head, nb); } EXPORT_SYMBOL_GPL(unregister_pm_notifier); Index: linux-2.6/kernel/notifier.c === --- linux-2.6.orig/kernel/notifier.c +++ linux-2.6/kernel/notifier.c @@ -59,6 +59,9 @@ static int notifier_chain_unregister(str return -ENOENT; } +extern struct blocking_notifier_head pm_chain_head; +#define PM_POST_SUSPEND 0x0004 /* Suspend finished */ + /** * notifier_call_chain - Informs the registered notifiers about an event. 
* @nl: Pointer to head of the blocking notifier chain @@ -90,8 +93,14 @@ static int notifier_call_chain(struct no continue; } #endif + if (nl == &pm_chain_head.head && val == PM_POST_SUSPEND) + pr_info("PM: calling nb %pF\n", nb->notifier_call); + ret = nb->notifier_call(nb, val, v); + if (nl == &pm_chain_head.head && val == PM_POST_SUSPEND) + pr_info("PM: ... nb %pF done\n", nb->notifier_call); + if (nr_calls) (*nr_calls)++;
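The idea behind the debug patch appears to be: log before and after each PM_POST_SUSPEND callback, so that if the machine dies inside one of them, the last "calling" line without a matching "done" line points at the culprit. A rough userspace model of that tracing follows. It is a sketch under assumptions: the `name` field is a hypothetical substitute for the kernel's `%pF` symbol lookup, `call_chain()` and `count_events()` are made-up names, and the real `notifier_call_chain()` additionally honours `NOTIFY_STOP_MASK` and `nr_to_call`, which this model ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PM_POST_SUSPEND 0x0004 /* Suspend finished (value from the patch) */

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb, unsigned long val,
			     void *v);
	const char *name;	/* hypothetical: replaces the %pF lookup */
	struct notifier_block *next;
};

/* Walk the chain, tracing each PM_POST_SUSPEND callback; returns the
 * number of callbacks that were invoked. */
static int call_chain(struct notifier_block *head, unsigned long val, void *v)
{
	int nr_calls = 0;

	for (struct notifier_block *nb = head; nb != NULL; nb = nb->next) {
		assert(nb->notifier_call != NULL);

		if (val == PM_POST_SUSPEND)
			printf("PM: calling nb %s\n", nb->name);

		nb->notifier_call(nb, val, v);

		if (val == PM_POST_SUSPEND)
			printf("PM: ... nb %s done\n", nb->name);

		nr_calls++;
	}
	return nr_calls;
}

/* Example callback: bumps the int that `v` points to. */
static int count_events(struct notifier_block *nb, unsigned long val, void *v)
{
	(void)nb;
	(void)val;
	++*(int *)v;
	return 0;
}
```

With two blocks chained together and `PM_POST_SUSPEND` as the event, each callback runs once and is bracketed by a "calling" and a "done" line.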
Re: [PATCH 3/7] wait.[ch]: Introduce the simple waitqueue (swait) implementation
On Fri, Oct 17, 2014 at 08:22:58PM -0400, Paul Gortmaker wrote:
> @@ -75,6 +123,32 @@ static void __cwake_up_common(struct cwait_head *q, unsigned int mode,
> 	}
> }
>
> +static void __swake_up_common(struct swait_head *q, unsigned int mode,
> +			      int nr_exclusive)
> +{
> +	struct swait *curr, *next;
> +	int woken = 0;
> +
> +	list_for_each_entry_safe(curr, next, &q->task_list, node) {
> +		if (wake_up_state(curr->task, mode)) { /* <-- calls ttwu() */
> +			__remove_swait(q, curr);
> +			curr->task = NULL;
> +			/*
> +			 * The waiting task can free the waiter as
> +			 * soon as curr->task = NULL is written,
> +			 * without taking any locks. A memory barrier
> +			 * is required here to prevent the following
> +			 * store to curr->task from getting ahead of
> +			 * the dequeue operation.
> +			 */
> +			smp_wmb();
> +			if (++woken == nr_exclusive)
> +				break;
> +		}
> +
> +	}
> +}
> +
> /**
>  * __cwake_up - wake up threads blocked on a waitqueue.
>  * @q: the complex waitqueue
> @@ -96,6 +170,19 @@ void __cwake_up(struct cwait_head *q, unsigned int mode, int nr_exclusive,
> }
> EXPORT_SYMBOL(__cwake_up);
>
> +void __swake_up(struct swait_head *q, unsigned int mode, int nr_exclusive)
> +{
> +	unsigned long flags;
> +
> +	if (!swait_active(q))
> +		return;
> +
> +	raw_spin_lock_irqsave(&q->lock, flags);
> +	__swake_up_common(q, mode, nr_exclusive);
> +	raw_spin_unlock_irqrestore(&q->lock, flags);
> +}
> +EXPORT_SYMBOL(__swake_up);

Same comment as before, that is an unbounded loop in a non preemptible section and therefore violates RT design principles.

We actually did talk about ways of fixing that.

Also, I'm not entirely sure we want to do the cwait thing, it looks painful.
Re: [PATCH] staging: android: binder: move to the "real" part of the kernel
> Do we really need someone to do more work that has been done on it in
> the past as an official "maintainer"? I'll be glad to do it, as I doubt
> it will require any time at all.

Well, every time in the past that Al Viro looked in its direction he broke it, so probably. Someone is going to have to clean up or fix the fact that it pokes around in the depths of the low-level fd I/O code and calls stuff like __fd_install and __alloc_fd directly, or mend it if it breaks.

I'm curious what Al Viro thinks of it.

> > Currently in the android space no one but libbinder should use the
> > kernel interface.
>
> That is correct. If you do that, you deserve all of the pain and
> suffering and rooted machines you will get.

So what is the Android-side model for its security? That should probably also be described, so nobody goes off and uses it for something like systemd because "it looked neat".

> But all of the changes will be in new code. Be it kdbus, or something
> else if that doesn't work out. This existing binder.c file will not be
> changing at all. This existing ABI, and codebase, is something that we
> have to maintain forever for those millions of devices out there in the
> real world today.

95% of those devices are locked down, and most of them have non-replaceable batteries that will be dead and irreplaceable (sanely, anyway) in 3-5 years. "Forever" in the phone world is mercifully rather short.

Alan
[PATCH] FS: aio: Fixed some coding style issues.
Fixed some coding style issues in aio.c Signed-off-by: Joseph Wessner --- fs/aio.c | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 84a7510..eab03a6 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -171,6 +171,7 @@ static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) struct file *file; struct path path; struct inode *inode = alloc_anon_inode(aio_mnt->mnt_sb); + if (IS_ERR(inode)) return ERR_CAST(inode); @@ -221,7 +222,7 @@ static int __init aio_setup(void) panic("Failed to create aio fs mount."); kiocb_cachep = KMEM_CACHE(kiocb, SLAB_HWCACHE_ALIGN|SLAB_PANIC); - kioctx_cachep = KMEM_CACHE(kioctx,SLAB_HWCACHE_ALIGN|SLAB_PANIC); + kioctx_cachep = KMEM_CACHE(kioctx, SLAB_HWCACHE_ALIGN|SLAB_PANIC); pr_debug("sizeof(struct page) = %zu\n", sizeof(struct page)); @@ -232,6 +233,7 @@ __initcall(aio_setup); static void put_aio_ring_file(struct kioctx *ctx) { struct file *aio_ring_file = ctx->aio_ring_file; + if (aio_ring_file) { truncate_setsize(aio_ring_file->f_inode, 0); @@ -256,6 +258,7 @@ static void aio_free_ring(struct kioctx *ctx) for (i = 0; i < ctx->nr_pages; i++) { struct page *page; + pr_debug("pid(%d) [%d] page->count=%d\n", current->pid, i, page_count(ctx->ring_pages[i])); page = ctx->ring_pages[i]; @@ -405,6 +408,7 @@ static int aio_setup_ring(struct kioctx *ctx) for (i = 0; i < nr_pages; i++) { struct page *page; + page = find_or_create_page(file->f_inode->i_mapping, i, GFP_HIGHUSER | __GFP_ZERO); if (!page) @@ -871,7 +875,7 @@ out: * called holding ctx->completion_lock. */ static void refill_reqs_available(struct kioctx *ctx, unsigned head, - unsigned tail) + unsigned tail) { unsigned events_in_ring, completed; @@ -1234,10 +1238,10 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, * Create an aio_context capable of receiving at least nr_events. * ctxp must not point to an aio_context that already exists, and * must be initialized to 0 prior to the call. 
On successful - * creation of the aio_context, *ctxp is filled in with the resulting + * creation of the aio_context, *ctxp is filled in with the resulting * handle. May fail with -EINVAL if *ctxp is not initialized, - * if the specified nr_events exceeds internal limits. May fail - * with -EAGAIN if the specified nr_events exceeds the user's limit + * if the specified nr_events exceeds internal limits. May fail + * with -EAGAIN if the specified nr_events exceeds the user's limit * of available events. May fail with -ENOMEM if insufficient kernel * resources are available. May fail with -EFAULT if an invalid * pointer is passed for ctxp. Will fail with -ENOSYS if not @@ -1256,7 +1260,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) ret = -EINVAL; if (unlikely(ctx || nr_events == 0)) { pr_debug("EINVAL: io_setup: ctx %lu nr_events %u\n", -ctx, nr_events); +ctx, nr_events); goto out; } @@ -1274,7 +1278,7 @@ out: } /* sys_io_destroy: - * Destroy the aio_context specified. May cancel any outstanding + * Destroy the aio_context specified. May cancel any outstanding * AIOs and block on completion. Will fail with -ENOSYS if not * implemented. May fail with -EINVAL if the context pointed to * is invalid. @@ -1282,6 +1286,7 @@ out: SYSCALL_DEFINE1(io_destroy, aio_context_t, ctx) { struct kioctx *ioctx = lookup_ioctx(ctx); + if (likely(NULL != ioctx)) { struct completion requests_done = COMPLETION_INITIALIZER_ONSTACK(requests_done); @@ -1568,7 +1573,7 @@ long do_io_submit(aio_context_t ctx_id, long nr, * AKPM: should this return a partial result if some of the IOs were * successfully submitted? 
*/ - for (i=0; iactive_reqs) { struct kiocb *kiocb = list_kiocb(pos); + if (kiocb->ki_obj.user == iocb) return kiocb; } -- 2.1.2