Re: RFC: WMI Enhancements
On Mon, Apr 17, 2017 at 03:03:29PM -0700, Andy Lutomirski wrote:
> On Fri, Apr 14, 2017 at 4:05 PM, Darren Hart wrote:
> > On Sat, Apr 15, 2017 at 12:45:30AM +0200, Rafael Wysocki wrote:
> >> On Wednesday, April 12, 2017 04:08:54 PM Darren Hart wrote:
> >> > Hi All,
> >> >
> >> > There are a few parallel efforts involving the Windows Management
> >> > Instrumentation (WMI)[1] and dependent/related drivers. I'd like to
> >> > have a round of discussion among those of you that have been involved
> >> > in this space before we decide on a direction.
> >> >
> >> > The WMI support in the kernel today fairly narrowly supports a
> >> > handful of systems. Andy L. has a work-in-progress series [2] which
> >> > converts wmi into a platform device and a proper bus, providing
> >> > devices for dependent drivers to bind to, and a mechanism for sibling
> >> > devices to communicate with each other. I've reviewed the series and
> >> > feel like the approach is sound. I plan to carry this series forward
> >> > and merge it (with Andy L's permission).
> >> >
> >> > Are there any objections to this?
> >> >
> >> > In Windows, applications interact with WMI more or less directly. We
> >> > don't do this in Linux currently, although it has been discussed in
> >> > the past [3]. Some vendors will work around this by performing
> >> > SMI/SMM, which is inefficient at best. Exposing WMI methods to
> >> > userspace would bring parity to WMI for Linux and Windows.
> >> >
> >> > There are two principal concerns I'd appreciate your thoughts on:
> >> >
> >> > a) As an undiscoverable interface (you need to know the method
> >> > signatures ahead of time), universally exposing every WMI "device" to
> >> > userspace seems like "a bad idea" from a security and stability
> >> > perspective. While access would certainly be privileged, it seems
> >> > more prudent to make this exposure opt-in. We also handle some of
> >> > this with kernel drivers, and exposing those "devices" to userspace
> >> > would enable userspace and the kernel to fight over control. So - if
> >> > we expose WMI devices to userspace, I believe this should be done on
> >> > a case-by-case basis, opting in, and not by default as part of the
> >> > WMI driver (although it can provide the mechanism for a sub-driver to
> >> > use), and possibly a devmode to do so by default.
> >>
> >> A couple of loose thoughts here.
> >>
> >> In principle there could be a "generic default WMI driver" or similar
> >> that would "claim" all WMI "devices" that have not been "claimed" by
> >> anyone else and would simply expose them to user space somehow (e.g.
> >> using a chardev interface).
> >>
> >> Then, depending on how that thing is implemented, opt-in etc. should
> >> be possible too.
> >>
> >
> > I think we agree this would be an ideal approach.
> >
> > As we look into this more, it is becoming clear that the necessary
> > functionality is not nicely divided into GUIDs for what is necessary in
> > userspace and what is handled in the kernel. A single WMI method GUID
> > may be needed by userspace for certain functionality, while the kernel
> > drivers may use it for something else.
> >
> > :-(
> >
> > The input to a WMI method is just a buffer, so it is very free-form.
> > One approach Mario has mentioned was to audit the userspace WMI method
> > calls in the kernel platform drivers and reject those calls with
> > arguments matching those issued by the kernel driver. This is likely to
> > be complex and error-prone in my opinion. However, I have not yet
> > thought of another means to meet the requirement of having disjoint
> > feature sets for userspace and kernel space via a mechanism that was
> > effectively designed to be used solely from user space with
> > vendor-defined method signatures.
> >
> > Next step is to look at just how complex it would be to audit the
> > method calls the kernel currently uses.
>
> I'm wondering whether it's really worth it. We already allow doing
> darned near anything using dcdbas. Maybe the world won't end if we
> expose a complete-ish ioctl interface to WMI.
>
> Also, dcdbas is, to put it mildly, a bit ridiculous. It seems to be a
> seriously awkward sysfs interface that allows you to, drumroll please,
> issue outb and inb instructions. It doesn't even check that it's
> running on a Dell system. It might be nice to deprecate it some day
> in favor of a real interface. I'd consider a low-level WMI ioctl
> interface to be a vast improvement.
>

I've been reluctantly arriving here as well. Given that every WMI
interface will be vendor-specific and non-discoverable, it seems unlikely
developers will eagerly duplicate kernel functionality in user-space. And
if they do, it will affect very few platforms. I still think it makes
sense to only expose a WMI
[PATCH 2/3] perf hists browser: add callchain-specific annotation
By pressing 'a' in the hists browser, the user gets an annotated view of the
code of the selected symbols. This patch adds the case 'A': if this key is
pressed (and if the call-graph option was enabled at record time), perf will
build the callchain from the top frame down to the selected symbol and select
the corresponding annotation histogram. Thus, the user will get an annotated
view of the symbol code specific to the callchain displayed in the hists
browser.

Signed-off-by: Alexis Berlemont
---
 tools/perf/ui/browsers/hists.c | 188 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 185 insertions(+), 3 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index da24072bb76e..9ed7fdc4dc51 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2543,17 +2543,157 @@ struct popup_action {
 	struct thread *thread;
 	struct map_symbol ms;
 	int socket;
+	bool context_annotate;

 	int (*fn)(struct hist_browser *browser, struct popup_action *act);
 };

+static int copy_call_list_entry(struct list_head *callchain,
+				struct callchain_list *call)
+{
+	struct call_list_entry *entry = zalloc(sizeof(*entry));
+
+	if (!entry) {
+		perror("not enough memory to scan callchains");
+		return -1;
+	}
+
+	entry->ip = entry->sym_start = call->ip;
+	if (call->ms.sym != NULL)
+		entry->sym_start = call->ms.sym->start;
+
+	list_add_tail(&entry->list, callchain);
+
+	return 0;
+}
+
+static int __hist_browser_build_callchain(struct list_head *callchain,
+					  struct rb_root *root,
+					  struct callchain_list *target)
+{
+	struct callchain_node *tmp_node;
+	struct rb_node *node;
+	struct callchain_list *call;
+	struct call_list_entry *new_call, *tmp;
+
+	node = rb_first(root);
+
+	while (node) {
+		char folded_sign = ' ';
+		size_t added_count = 0;
+
+		tmp_node = rb_entry(node, struct callchain_node, rb_node);
+
+		/*
+		 * If the callchain display mode is "flat", the list
+		 * "parent_val" may contain the entries in common.
+		 */
+		list_for_each_entry(call, &tmp_node->parent_val, list) {
+
+			/*
+			 * If we have not found the highlighted callchain
+			 * entry...
+			 */
+			if (target == call)
+				return 0;
+
+			/*
+			 * ...we need to keep the current element: the next
+			 * one could be the right one and we need to build a
+			 * callchain.
+			 */
+			if (copy_call_list_entry(callchain, call) < 0)
+				return -1;
+
+			added_count++;
+		}
+
+		/*
+		 * If the callchain display mode is "graph", "fractal" or even
+		 * "flat", the callchain entries (the last ones for "flat")
+		 * are in the list "val".
+		 */
+		list_for_each_entry(call, &tmp_node->val, list) {
+
+			/*
+			 * If we have not found the highlighted callchain
+			 * entry...
+			 */
+			if (target == call)
+				return 0;
+
+			/*
+			 * ...we need to keep the current element: the next
+			 * one could be the right one and we need to build a
+			 * callchain.
+			 */
+			if (copy_call_list_entry(callchain, call) < 0)
+				return -1;
+
+			added_count++;
+
+			/*
+			 * If we meet the folded sign '+' (and if the current
+			 * element does not match), there is no need to go
+			 * further, the callchain elements below cannot be the
+			 * ones we are looking for.
+			 */
+			folded_sign = callchain_list__folded(call);
+			if (folded_sign == '+')
+				break;
+		}
+
+		/*
+		 * If the last scanned entry is unfolded, the callchain
+		 * element we are looking for may be behind; so, let's scan
+		 * its tree of callchain nodes.
+		 */
+		if (folded_sign == '-' &&
+		    __hist_browser_build_callchain(callchain,
[PATCH 3/3] perf report: fill per-callchain symbol annotation histograms
The per-callchain histograms must be fed at some point with the profiling
samples. A solution is to fill them right after having filled the per-symbol
ones: in the callback hist_iter__report_callback.

Signed-off-by: Alexis Berlemont
---
 tools/perf/builtin-report.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c18158b83eb1..d825a599d4b4 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -136,6 +136,14 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
 		if (single)
 			err = hist_entry__inc_addr_samples(he, evsel->idx,
 							   al->addr);
+
+		if (err == 0) {
+			struct callchain_cursor *cursor = &callchain_cursor;
+
+			err = hist_entry_cxt__inc_addr_samples(he,
+							       evsel->idx,
+							       al->addr,
+							       cursor);
+		}
 	} else {
 		err = hist_entry__inc_addr_samples(he, evsel->idx, al->addr);
 	}
--
2.12.2
[PATCH 1/3] perf annotate: implement per-callchain annotation histogram
A symbol can be called from various points and, according to its calling
context, it can provide different results. This patch creates one histogram
for every different callchain recorded and accumulates profiling samples into
the appropriate one. Samples recorded with the same callchain are not mixed
with other ones, so the noise should be reduced.

Signed-off-by: Alexis Berlemont
---
 tools/perf/util/annotate.c | 308 ++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/annotate.h |  19 +++
 2 files changed, 324 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 30498a2d4a6f..fa7691a0f205 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -18,6 +18,7 @@
 #include "annotate.h"
 #include "evsel.h"
 #include "block-range.h"
+#include "callchain.h"
 #include "arch/common.h"
 #include
 #include
@@ -554,7 +555,7 @@ int symbol__alloc_hist(struct symbol *sym)
 {
 	struct annotation *notes = symbol__annotation(sym);
 	const size_t size = symbol__size(sym);
-	size_t sizeof_sym_hist;
+	size_t sizeof_sym_hist, nr_sym_hist;

 	/* Check for overflow when calculating sizeof_sym_hist */
 	if (size > (SIZE_MAX - sizeof(struct sym_hist)) / sizeof(u64))
@@ -567,12 +568,17 @@ int symbol__alloc_hist(struct symbol *sym)
 	    / symbol_conf.nr_events)
 		return -1;

-	notes->src = zalloc(sizeof(*notes->src) + symbol_conf.nr_events * sizeof_sym_hist);
+	/* Allocate 1 histogram per event and 1 more for context annotation */
+	nr_sym_hist = symbol_conf.nr_events + 1;
+
+	notes->src = zalloc(sizeof(*notes->src) +
+			    nr_sym_hist * sizeof_sym_hist);
 	if (notes->src == NULL)
 		return -1;
 	notes->src->sizeof_sym_hist = sizeof_sym_hist;
-	notes->src->nr_histograms = symbol_conf.nr_events;
+	notes->src->nr_histograms = nr_sym_hist;
 	INIT_LIST_HEAD(&notes->src->source);
+	INIT_LIST_HEAD(&notes->src->context_hists);
 	return 0;
 }
@@ -591,11 +597,18 @@ static int symbol__alloc_hist_cycles(struct symbol *sym)
 void symbol__annotate_zero_histograms(struct symbol *sym)
 {
 	struct annotation *notes = symbol__annotation(sym);
+	struct cxt_hist_entry *hist_entry, *n;

 	pthread_mutex_lock(&notes->lock);
 	if (notes->src != NULL) {
 		memset(notes->src->histograms, 0,
 		       notes->src->nr_histograms * notes->src->sizeof_sym_hist);
+
+		list_for_each_entry_safe(hist_entry, n,
+					 &notes->src->context_hists, list) {
+			list_del(&hist_entry->list);
+			free(hist_entry);
+		}
 		if (notes->src->cycles_hist)
 			memset(notes->src->cycles_hist, 0,
 			       symbol__size(sym) * sizeof(struct cyc_hist));
@@ -681,6 +694,7 @@ static struct annotation *symbol__get_annotation(struct symbol *sym, bool cycles
 		if (symbol__alloc_hist_cycles(sym) < 0)
 			return NULL;
 	}
+
 	return notes;
 }
@@ -697,6 +711,281 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 	return __symbol__inc_addr_samples(sym, map, notes, evidx, addr);
 }

+static bool __symbol_cxt__cmp_call_cursor(struct cxt_hist_entry *hist_entry,
+					  struct callchain_cursor *cursor)
+{
+	struct callchain_cursor_node *node;
+	struct call_list_entry *entry;
+	u64 match_count = 0;
+
+	callchain_cursor_commit(cursor);
+
+	/*
+	 * One element of the callchain cursor cannot be matched: the last
+	 * element (or first according to the order) which corresponds to the
+	 * real sampled address; so, let's skip it (if need be) and...
+	 */
+	if (callchain_param.order == ORDER_CALLEE)
+		callchain_cursor_advance(cursor);
+
+	/* ...consider it has matched */
+	match_count++;
+
+	list_for_each_entry(entry, &hist_entry->callchain, list) {
+
+		node = callchain_cursor_current(cursor);
+		if (node == NULL)
+			break;
+
+		if (entry->ip == node->ip)
+			match_count++;
+
+		callchain_cursor_advance(cursor);
+	}
+
+	return match_count == cursor->nr;
+}
+
+static int __symbol_cxt__copy_call_cursor(struct cxt_hist_entry *hist_entry,
+					  struct callchain_cursor *cursor)
+{
+	struct callchain_cursor_node *node;
+	struct call_list_entry *entry, *n;
+
+	callchain_cursor_commit(cursor);
+	node = callchain_cursor_current(cursor);
+
+	/*
+	 * For each entry in the callchain cursor, we need to copy 2 fields:
+	 * the text address and the symbol in which the address is located...
+	 */
+	while
[PATCH 0/3] perf: callchain-specific annotation
Hi,

These patches are a proposal to fulfill the following point of perf's TODO
list (https://perf.wiki.kernel.org/index.php/Todo):

* What I want is that if I am on bar*(), it annotates bar*(), no samples just
  the call site (obtained from the callchain) disassembly. This is useful
  because in many cases there may be multiple call sites within a function
  and there may be inlines in between. Hard to track down if you cannot
  figure out the surrounding addresses of the call site. (Request made by
  Stephane Eranian).

These patches are still at an early stage:

* Per-callchain annotation is only available in perf-report;
* Tests were not performed on real-world applications but on small basic
  ones.

Alexis.

Alexis Berlemont (3):
  perf annotate: implement per-callchain annotation histogram
  perf hists browser: add callchain-specific annotation
  perf report: fill per-callchain symbol annotation histograms

 tools/perf/builtin-report.c    |   8 ++
 tools/perf/ui/browsers/hists.c | 188 ++++++++++++++-
 tools/perf/util/annotate.c     | 308 ++++++++++++++++++++-
 tools/perf/util/annotate.h     |  19 +++
 4 files changed, 517 insertions(+), 6 deletions(-)

--
2.12.2
[PATCH] nl80211: Fix enum type of variable in nl80211_put_sta_rate()
rate_flg is of type 'enum nl80211_attrs', however it is assigned with
'enum nl80211_rate_info' values. Change the type of rate_flg accordingly.

Signed-off-by: Matthias Kaehlcke
---
 net/wireless/nl80211.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 2312dc2ffdb9..9af21a21ea6b 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -4151,7 +4151,7 @@ static bool nl80211_put_sta_rate(struct sk_buff *msg, struct rate_info *info,
 	struct nlattr *rate;
 	u32 bitrate;
 	u16 bitrate_compat;
-	enum nl80211_attrs rate_flg;
+	enum nl80211_rate_info rate_flg;

 	rate = nla_nest_start(msg, attr);
 	if (!rate)
--
2.12.2.762.g0e3151a226-goog
Re: [PATCH v2 08/17] fs: retrofit old error reporting API onto new infrastructure
On Wed, Apr 12 2017, Jeff Layton wrote:

> On Thu, 2017-04-13 at 08:14 +1000, NeilBrown wrote:
>>
>> I suspect that the filemap_check_wb_error() will need to be moved
>> into some parent of the current call site, which is essentially what
>> you suggest below. It would be nice if we could do that first, rather
>> than having the current rather odd code. But maybe this way is an
>> easier transition. It isn't obviously wrong, it just isn't obviously
>> right either.
>>
>
> Yeah. It's just such a daunting task to have to change so much of the
> existing code. I'm looking for ways to make this simpler.
>
> I think it probably is reasonable for filemap_write_and_wait* to just
> sample it as early as possible in those functions. filemap_fdatawait is
> the real questionable one, as you may have already had some writebacks
> complete with errors.
>
> In any case, my thinking was that the old code is not obviously correct
> either, so while this shortens the "error capture window" on these
> calls, it seems like a reasonable place to start improving things.

I agree. It wouldn't hurt to add a note to this effect in the patch
comment so that people understand that the code isn't seen to be
"correct" but only "no worse", with clear direction on what sort of
improvement might be appropriate.

Thanks,
NeilBrown
Re: [PATCH] ACPICA: Export mutex functions
On Tue, Apr 18, 2017 at 12:32 AM, Guenter Roeck wrote: > On Mon, Apr 17, 2017 at 11:29:38PM +0200, Rafael J. Wysocki wrote: >> On Mon, Apr 17, 2017 at 11:03 PM, Guenter Roeck wrote: >> > On Mon, Apr 17, 2017 at 08:40:38PM +, Moore, Robert wrote: >> >> [cut] >> >> [Moore, Robert] >> >> >> >> I'm not an expert on the _DLM method, but I would point you to the >> >> description section in the ACPI spec, 5.7.5 _DLM (DeviceLock Mutex). >> >> >> > >> > I did. However, not being an ACPI expert, that doesn't tell me anything. >> >> Basically, if the kernel and AML need to access a device concurrently, >> there should be a _DLM object under that device in the ACPI tables. >> In that case it is expected to return a list of (AML) mutexes that can >> be acquired by the kernel in order to synchronize device access with >> respect to AML (and for each mutex it may also return a description of >> the specific resources to be protected by it). >> >> Bottom line: without _DLM, the kernel cannot synchronize things with >> respect to AML properly, because it has no information how to do that >> then. > > That is all quite interesting. I do see the mutex in question used on various > motherboards from various vendors (I checked boards from Gigabyte, MSI, and > Intel). Interestingly, the naming seems to be consistent - it is always named > "MUT0". For the most part, it seems to be available on more recent > motherboards; older motherboards tend to use the resource without locking. > However, I don't see any mention of "_DLM" in any of the DSDTs. > > At the same time, access to ports 0x2e/0x2f is widely used in the kernel. > As mentioned before, it is used in watchdog, hardware monitoring, and gpio > drivers, but also in parallel port and infrared driver code. Effectively > that means that all this code is inherently unsafe on systems with ACPI > support. If AML accesses those ports via operation regions, then it is unsafe. 
Note, however, that the mere existence of operation regions covering them doesn't automatically mean that they are accessed by any AML on the given platform. > I had thought about implementing a set of utility functions to make the kernel > code safer to use if the mutex is found to exist. Right now I wonder, though, > if such code would have a chance to be accepted. Any thoughts on that ? You can argue that there should be _DLM objects in the ACPI tables on those platforms, but they are missing, so platform quirks (or something else to that effect) are needed to ensure mutual exclusion between AML accesses and direct accesses in drivers etc. Unlike in the DT case, we have no influence on what the vendors put into the ACPI tables on their systems, so we need to live with what is in there and possibly fix up things as needed. Thanks, Rafael
Re: [PATCH v2 07/17] fs: new infrastructure for writeback error handling and reporting
On Wed, Apr 12 2017, Jeff Layton wrote: > On Thu, 2017-04-13 at 07:55 +1000, NeilBrown wrote: >> On Wed, Apr 12 2017, Jeff Layton wrote: >> >> >> > +void __filemap_set_wb_error(struct address_space *mapping, int err) >> >> I was really hoping that this would be >> >> void __set_wb_error(wb_err_t *wb_err, int err) >> >> so >> >> Then nfs_context_set_write_error could become >> >> static void nfs_context_set_write_error(struct nfs_open_context *ctx, int >> error) >> { >> __set_wb_error(&ctx->wb_err, error); >> } >> >> and filemap_set_wb_error() would be: >> >> static inline void filemap_set_wb_error(struct address_space *mapping, int >> err) >> { >> /* Optimize for the common case of no error */ >> if (unlikely(err)) >> __set_wb_error(&mapping->f_wb_err, err); >> } >> >> Similarly we would have >> wb_err_t sample_wb_error(wb_err_t *wb_err) >> { >>... >> } >> >> and >> >> wb_err_t filemap_sample_wb_error(struct address_space *mapping) >> { >> return sample_wb_error(&mapping->f_wb_err); >> } >> >> so nfs_file_fsync_commit() could have >> ret = sample_wb_error(&ctx->wb_err); >> in place of >> ret = xchg(&ctx->error, 0); >> >> int filemap_report_wb_error(struct file *file) >> >> would become >> >> int filemap_report_wb_error(struct file *file, wb_err_t *err) >> >> or something. >> >> The address space is just one (obvious) place where the wb error can be >> stored. The filesystem might have a different place with finer >> granularity (nfs already does). >> >> > > I think it'd be much simpler to adapt NFS over to use the new > infrastructure (I have a draft patch for that already). You'd lose the > ability to track a different error for each nfs_open_context, but I'm > not sure how valuable that is anyway. I'll need to think about that > one... From a technical perspective, it might be "simpler" but I contest "much simpler". I think it would be easy to put one wb_err_t per nfs_open_context, if the former were designed well (which itself would be easy). 
From a political perspective, I doubt it would be simple. NFS is the way it is for a reason, and convincing an author that their reason is not valid tends to be harder than most technical issues. (looking to history... the 'error' field was added to the nfs_open_context in Commit: 6caf69feb23a ("NFSv2/v3/v4: Place NFS nfs_page shared data into a single structure that hangs off filp->private_data. As a side effect, this also cleans up the NFSv4 private file state info.") in 2.6.12. Prior to that file->f_error was used. Prior to commit 9ffb8c3a1955 ("Import 2.2.3pre1") (which has no comment) errors were ... interesting. Look for nfs_check_error in commit d9c0ffee4db7 ("Import 2.1.128") and notice the use of current->pid!! All commits from the history.git tree. ) It is quite possible for an NFS server to return different errors to different users. It might be odd, but it is possible. Should an error that affects one user pollute all other users? Thanks, NeilBrown
Re: [PATCH v8] nvme: improve performance for virtual NVMe devices
On Fri, Apr 14, 2017 at 03:10:30PM -0300, Helen Koike wrote: > + Add missing maintainers from scripts/get_maintainer.pl in the email thread > > Hi, > > I would like to know if it would be possible to get this patch for kernel > 4.12. > Should I send a pull request? Or do you usually get the patch from the email > thread? Hi Helen, This looks good to me. I pulled this into a branch for the next merge window here: http://git.infradead.org/nvme.git/shortlog/refs/heads/for-next There's only one other commit at the moment in addition to yours, so I'll just need to make sure we've got everything we want for 4.12 before submitting our next pull request.
[cpu/hotplug] 6362ef376a: [ INFO: possible circular locking dependency detected ]
Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.hotplug

commit 6362ef376aacbc309022ad12f08693d201cdda08
Author: Thomas Gleixner
AuthorDate: Thu Apr 13 10:50:55 2017 +0200
Commit: Thomas Gleixner
CommitDate: Mon Apr 17 16:12:27 2017 +0200

    cpu/hotplug: Convert hotplug locking to percpu rwsem

    There are no more (known) nested calls to get_online_cpus() so it's
    possible to remove the nested call magic and convert the mutex to a
    percpu-rwsem, which speeds up get/put_online_cpus() significantly for
    the uncontended case. The contended case (write locked for hotplug
    operations) is slow anyway, so the slightly more expensive down_write
    of the percpu rwsem does not matter.

    Signed-off-by: Thomas Gleixner

00ebb15e90 perf/core: Remove redundant get_online_cpus()
6362ef376a cpu/hotplug: Convert hotplug locking to percpu rwsem
6362ef376a cpu/hotplug: Convert hotplug locking to percpu rwsem
d36d99770e Merge branch 'timers/core'

+----------------------------------------------------+------------+------------+------------+------------+
|                                                    | 00ebb15e90 | 6362ef376a | 6362ef376a | d36d99770e |
+----------------------------------------------------+------------+------------+------------+------------+
| boot_successes                                     | 35         | 0          | 0          | 0          |
| boot_failures                                      | 0          | 13         | 13         | 11         |
| INFO:possible_circular_locking_dependency_detected | 0          | 13         | 13         | 5          |
| INFO:possible_recursive_locking_detected           | 0          | 0          | 0          | 6          |
+----------------------------------------------------+------------+------------+------------+------------+

[init] Using pid_max = 32768
[init] Started watchdog process, PID is 8909
[main] Main thread is alive.
[ 23.762049]
[ 23.762311] ==
[ 23.763199] [ INFO: possible circular locking dependency detected ]
[ 23.764101] 4.11.0-rc6-00237-g6362ef3 #1 Not tainted
[ 23.764847] ---
[ 23.765822] trinity-main/352 is trying to acquire lock:
[ 23.766631] (jump_label_mutex){+.+...}, at: [] jump_label_lock+0x12/0x14
[ 23.767937]
[ 23.767937] but task is already holding lock:
[ 23.768778] (perf_sched_mutex){+.+...}, at: [] perf_event_alloc+0x563/0x65e
[ 23.770002]
[ 23.770002] which lock already depends on the new lock. 
[ 23.770002]
[ 23.771156]
[ 23.771156] the existing dependency chain (in reverse order) is:
[ 23.772309]
[ 23.772309] -> #3 (perf_sched_mutex){+.+...}:

# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 98c5b9ad245d1735d95d66e463a44b449122581c 4f7d029b9bf009fbee76bb10c0c4351a1870d2f3 --
git bisect bad 9364dcd917be48868feab7d699b90ec8457e85b4 # 03:47 B 0 11 22 0 Merge 'net/master' into devel-catchup-201704180142
git bisect good c5de08f31b482b1ba1cf637d41921e41efe519bd # 03:56 G 11 00 0 Merge 'linux-review/Aneesh-Kumar-K-V/HugeTLB-migration-support-for-PPC64/20170418-011540' into devel-catchup-201704180142
git bisect good 7c90451c9c633b4c7fefc29272ba15f18e56b87c # 04:10 G 11 00 0 Merge 'net-next/master' into devel-catchup-201704180142
git bisect bad b5cea3442f45b5c98f45f8225978e0aa05f18da8 # 04:19 B 0 1 12 0 Merge 'tip/WIP.hotplug' into devel-catchup-201704180142
git bisect good a5cbdf693a60d5b86d4d21dfedd90f17754eb273 # 04:34 G 10 00 0 ACPI/processor: Fix error handling in __acpi_processor_start()
git bisect good 92b636ebde0776e579cd16d47783b8b755a4c9ea # 04:42 G 11 00 0 KVM/PPC/Book3S HV: Use cpuhp_setup_state_nocalls_locked()
git bisect good d21819c3b480fa0d24419f09abef05c3214ce72e # 04:59 G 11 00 0 kernel/hotplug: Use stop_machine_locked() in takedown_cpu()
git bisect good 862a8fda91ef0575bdd559158f6db5b6e9a70fe7 # 05:26 G 11 00 0 PCI: Replace the racy recursion prevention
git bisect good 00ebb15e900ebda8ee9036dddc63e3214cfad22b # 05:36 G 11 00 0 perf/core: Remove redundant get_online_cpus()
git bisect bad 6362ef376aacbc309022ad12f08693d201cdda08 # 05:53 B 0 2 13 0 cpu/hotplug: Convert hotplug locking to percpu rwsem
# first bad commit: [6362ef376aacbc309022ad12f08693d201cdda08] cpu/hotplug: Convert hotplug locking to percpu rwsem
git bisect good 00ebb15e900ebda8ee9036dddc63e3214cfad22b # 06:00 G 31 00 0 perf/core: Remove redundant get_online_cpus()
# extra tests with CONFIG_DEBUG_INFO_REDUCED
git bisect bad 6362ef376aacbc309022ad12f08693d201cdda08 # 06:09 B 0
[PATCH] sparc64: Fix hugepage page table free
Make sure the start address is aligned to PMD_SIZE boundary when freeing page table backing a hugepage region. The issue was causing segfaults when a region backed by 64K pages was unmapped since such a region is in general not PMD_SIZE aligned.

Signed-off-by: Nitin Gupta
---
 arch/sparc/mm/hugetlbpage.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index ee5273a..7c29d38 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -461,6 +461,22 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 	pgd_t *pgd;
 	unsigned long next;
 
+	addr &= PMD_MASK;
+	if (addr < floor) {
+		addr += PMD_SIZE;
+		if (!addr)
+			return;
+	}
+	if (ceiling) {
+		ceiling &= PMD_MASK;
+		if (!ceiling)
+			return;
+	}
+	if (end - 1 > ceiling - 1)
+		end -= PMD_SIZE;
+	if (addr > end - 1)
+		return;
+
 	pgd = pgd_offset(tlb->mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);
-- 
2.9.2
[PATCH v5 3/3] vfio/type1: Reduce repetitive calls in vfio_pin_pages_remote()
vfio_pin_pages_remote() is typically called to iterate over a range of memory. Testing CAP_IPC_LOCK is relatively expensive, so it makes sense to push it up to the caller, which can then repeatedly call vfio_pin_pages_remote() using that value. This can show nearly a 20% improvement on the worst case path through VFIO_IOMMU_MAP_DMA with contiguous page mapping disabled. Testing RLIMIT_MEMLOCK is much more lightweight, but we bring it along on the same principle and it does seem to show a marginal improvement. Signed-off-by: Alex Williamson --- drivers/vfio/vfio_iommu_type1.c | 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 372e4f626138..8549cb111627 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -380,10 +380,10 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, * first page and all consecutive pages with the same locking. */ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr, - long npage, unsigned long *pfn_base) + long npage, unsigned long *pfn_base, + bool lock_cap, unsigned long limit) { - unsigned long pfn = 0, limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; - bool lock_cap = capable(CAP_IPC_LOCK); + unsigned long pfn = 0; long ret, pinned = 0, lock_acct = 0; bool rsvd; dma_addr_t iova = vaddr - dma->vaddr + dma->iova; @@ -924,13 +924,15 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma, unsigned long vaddr = dma->vaddr; size_t size = map_size; long npage; - unsigned long pfn; + unsigned long pfn, limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; + bool lock_cap = capable(CAP_IPC_LOCK); int ret = 0; while (size) { /* Pin a contiguous chunk of memory */ npage = vfio_pin_pages_remote(dma, vaddr + dma->size, - size >> PAGE_SHIFT, &pfn); + size >> PAGE_SHIFT, &pfn, + lock_cap, limit); if (npage <= 0) { WARN_ON(!npage); ret = (int)npage; @@ -1040,6 +1042,8 @@ static int 
vfio_iommu_replay(struct vfio_iommu *iommu, { struct vfio_domain *d; struct rb_node *n; + unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; + bool lock_cap = capable(CAP_IPC_LOCK); int ret; /* Arbitrarily pick the first domain in the list for lookups */ @@ -1086,7 +1090,8 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu, npage = vfio_pin_pages_remote(dma, vaddr, n >> PAGE_SHIFT, - &pfn); + &pfn, lock_cap, + limit); if (npage <= 0) { WARN_ON(!npage); ret = (int)npage;
[PATCH v5 1/3] vfio/type1: Remove locked page accounting workqueue
If the mmap_sem is contended then the vfio type1 IOMMU backend will defer locked page accounting updates to a workqueue task. This has a few problems and depending on which side the user tries to play, they might be over-penalized for unmaps that haven't yet been accounted or race the workqueue to enter more mappings than they're allowed. The original intent of this workqueue mechanism seems to be focused on reducing latency through the ioctl, but we cannot do so at the cost of correctness. Remove this workqueue mechanism and update the callers to allow for failure. We can also now recheck the limit under write lock to make sure we don't exceed it. vfio_pin_pages_remote() also now necessarily includes an unwind path which we can jump to directly if the consecutive page pinning finds that we're exceeding the user's memory limits. This avoids the current lazy approach which does accounting and mapping up to the fault, only to return an error on the next iteration to unwind the entire vfio_dma. 
Cc: sta...@vger.kernel.org Signed-off-by: Alex Williamson --- drivers/vfio/vfio_iommu_type1.c | 110 ++- 1 file changed, 51 insertions(+), 59 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 32d2633092a3..a8a079ba9477 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -246,69 +246,46 @@ static int vfio_iova_put_vfio_pfn(struct vfio_dma *dma, struct vfio_pfn *vpfn) return ret; } -struct vwork { - struct mm_struct *mm; - long npage; - struct work_struct work; -}; - -/* delayed decrement/increment for locked_vm */ -static void vfio_lock_acct_bg(struct work_struct *work) -{ - struct vwork *vwork = container_of(work, struct vwork, work); - struct mm_struct *mm; - - mm = vwork->mm; - down_write(&mm->mmap_sem); - mm->locked_vm += vwork->npage; - up_write(&mm->mmap_sem); - mmput(mm); - kfree(vwork); -} - -static void vfio_lock_acct(struct task_struct *task, long npage) +static int vfio_lock_acct(struct task_struct *task, long npage, bool *lock_cap) { - struct vwork *vwork; struct mm_struct *mm; bool is_current; + int ret; if (!npage) - return; + return 0; is_current = (task->mm == current->mm); mm = is_current ? task->mm : get_task_mm(task); if (!mm) - return; /* process exited */ + return -ESRCH; /* process exited */ - if (down_write_trylock(&mm->mmap_sem)) { - mm->locked_vm += npage; - up_write(&mm->mmap_sem); - if (!is_current) - mmput(mm); - return; - } + ret = down_write_killable(&mm->mmap_sem); + if (!ret) { + if (npage > 0) { + if (lock_cap ? !*lock_cap : + !has_capability(task, CAP_IPC_LOCK)) { + unsigned long limit; + + limit = task_rlimit(task, + RLIMIT_MEMLOCK) >> PAGE_SHIFT; + + if (mm->locked_vm + npage > limit) + ret = -ENOMEM; + } + } + + if (!ret) + mm->locked_vm += npage; - if (is_current) { - mm = get_task_mm(task); - if (!mm) - return; + up_write(&mm->mmap_sem); } - /* -* Couldn't get mmap_sem lock, so must setup to update -* mm->locked_vm later. 
If locked_vm were atomic, we -* wouldn't need this silliness -*/ - vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL); - if (WARN_ON(!vwork)) { + if (!is_current) mmput(mm); - return; - } - INIT_WORK(>work, vfio_lock_acct_bg); - vwork->mm = mm; - vwork->npage = npage; - schedule_work(>work); + + return ret; } /* @@ -405,7 +382,7 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr, long npage, unsigned long *pfn_base) { - unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; + unsigned long pfn = 0, limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; bool lock_cap = capable(CAP_IPC_LOCK); long ret, pinned = 0, lock_acct = 0; bool rsvd; @@ -442,8 +419,6 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr, /* Lock all the consecutive pages from pfn_base */ for (vaddr += PAGE_SIZE, iova += PAGE_SIZE; pinned < npage; pinned++, vaddr += PAGE_SIZE, iova += PAGE_SIZE) { -
[PATCH v5 0/3] vfio/type1: Synchronous locked page accounting
v5:
- patch 1/: Use bool* to clean up vfio_lock_acct() callers; sorry, we cannot re-test CAP_IPC_LOCK for all callers
- patch 2/: Re-add pr_warn, add Kirti's R-b
- patch 3/: NEW; analyzing the impact of vfio_lock_acct() testing CAP_IPC_LOCK for all callers revealed a low hanging optimization

Thanks for the reviews, keep 'em coming,
Alex

---

Alex Williamson (3):
      vfio/type1: Remove locked page accounting workqueue
      vfio/type1: Prune vfio_pin_page_external()
      vfio/type1: Reduce repetitive calls in vfio_pin_pages_remote()

 drivers/vfio/vfio_iommu_type1.c | 150 +++++++++++++++------------------------
 1 file changed, 64 insertions(+), 86 deletions(-)
[PATCH v5 1/3] vfio/type1: Remove locked page accounting workqueue
If the mmap_sem is contended then the vfio type1 IOMMU backend will defer locked page accounting updates to a workqueue task. This has a few problems, and depending on which side the user tries to play, they might be over-penalized for unmaps that haven't yet been accounted, or race the workqueue to enter more mappings than they're allowed. The original intent of this workqueue mechanism seems to be focused on reducing latency through the ioctl, but we cannot do so at the cost of correctness. Remove this workqueue mechanism and update the callers to allow for failure. We can also now recheck the limit under write lock to make sure we don't exceed it. vfio_pin_pages_remote() also now necessarily includes an unwind path which we can jump to directly if the consecutive page pinning finds that we're exceeding the user's memory limits. This avoids the current lazy approach which does accounting and mapping up to the fault, only to return an error on the next iteration to unwind the entire vfio_dma.
Cc: sta...@vger.kernel.org
Signed-off-by: Alex Williamson
---
 drivers/vfio/vfio_iommu_type1.c | 110 ++++++++++++++++++----------------------
 1 file changed, 51 insertions(+), 59 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 32d2633092a3..a8a079ba9477 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -246,69 +246,46 @@ static int vfio_iova_put_vfio_pfn(struct vfio_dma *dma, struct vfio_pfn *vpfn)
 	return ret;
 }
 
-struct vwork {
-	struct mm_struct	*mm;
-	long			npage;
-	struct work_struct	work;
-};
-
-/* delayed decrement/increment for locked_vm */
-static void vfio_lock_acct_bg(struct work_struct *work)
-{
-	struct vwork *vwork = container_of(work, struct vwork, work);
-	struct mm_struct *mm;
-
-	mm = vwork->mm;
-	down_write(&mm->mmap_sem);
-	mm->locked_vm += vwork->npage;
-	up_write(&mm->mmap_sem);
-	mmput(mm);
-	kfree(vwork);
-}
-
-static void vfio_lock_acct(struct task_struct *task, long npage)
+static int vfio_lock_acct(struct task_struct *task, long npage, bool *lock_cap)
 {
-	struct vwork *vwork;
 	struct mm_struct *mm;
 	bool is_current;
+	int ret;
 
 	if (!npage)
-		return;
+		return 0;
 
 	is_current = (task->mm == current->mm);
 
 	mm = is_current ? task->mm : get_task_mm(task);
 	if (!mm)
-		return; /* process exited */
+		return -ESRCH; /* process exited */
 
-	if (down_write_trylock(&mm->mmap_sem)) {
-		mm->locked_vm += npage;
-		up_write(&mm->mmap_sem);
-		if (!is_current)
-			mmput(mm);
-		return;
-	}
+	ret = down_write_killable(&mm->mmap_sem);
+	if (!ret) {
+		if (npage > 0) {
+			if (lock_cap ? !*lock_cap :
+			    !has_capability(task, CAP_IPC_LOCK)) {
+				unsigned long limit;
+
+				limit = task_rlimit(task,
						    RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+
+				if (mm->locked_vm + npage > limit)
+					ret = -ENOMEM;
+			}
+		}
+
+		if (!ret)
+			mm->locked_vm += npage;
 
-	if (is_current) {
-		mm = get_task_mm(task);
-		if (!mm)
-			return;
+		up_write(&mm->mmap_sem);
 	}
 
-	/*
-	 * Couldn't get mmap_sem lock, so must setup to update
-	 * mm->locked_vm later.  If locked_vm were atomic, we
-	 * wouldn't need this silliness
-	 */
-	vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
-	if (WARN_ON(!vwork)) {
+	if (!is_current)
 		mmput(mm);
-		return;
-	}
-	INIT_WORK(&vwork->work, vfio_lock_acct_bg);
-	vwork->mm = mm;
-	vwork->npage = npage;
-	schedule_work(&vwork->work);
+
+	return ret;
 }
 
 /*
@@ -405,7 +382,7 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 				  long npage, unsigned long *pfn_base)
 {
-	unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+	unsigned long pfn = 0, limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 	bool lock_cap = capable(CAP_IPC_LOCK);
 	long ret, pinned = 0, lock_acct = 0;
 	bool rsvd;
@@ -442,8 +419,6 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	/* Lock all the consecutive pages from pfn_base */
 	for (vaddr += PAGE_SIZE, iova += PAGE_SIZE; pinned < npage;
 	     pinned++, vaddr += PAGE_SIZE, iova += PAGE_SIZE) {
-		unsigned long pfn = 0;
-
[PATCH v5 2/3] vfio/type1: Prune vfio_pin_page_external()
With vfio_lock_acct() testing the locked memory limit under mmap_sem, it's redundant to do it here for a single page. We can also reorder our tests such that we can avoid testing for reserved pages if we're not doing accounting and let vfio_lock_acct() test the process CAP_IPC_LOCK. Finally, this function oddly returns 1 on success. Update to return zero on success, -errno on error. Since the function only pins a single page, there's no need to return the number of pages pinned.

N.B. vfio_pin_pages_remote() can pin a large contiguous range of pages before calling vfio_lock_acct(). If we were to similarly remove the extra test there, a user could temporarily pin far more pages than they're allowed.

Suggested-by: Kirti Wankhede
Suggested-by: Eric Auger
Reviewed-by: Kirti Wankhede
Signed-off-by: Alex Williamson
---
 drivers/vfio/vfio_iommu_type1.c | 35 ++++++---------------------------
 1 file changed, 8 insertions(+), 27 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a8a079ba9477..372e4f626138 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -482,43 +482,26 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 static int vfio_pin_page_external(struct vfio_dma *dma, unsigned long vaddr,
 				  unsigned long *pfn_base, bool do_accounting)
 {
-	unsigned long limit;
-	bool lock_cap = has_capability(dma->task, CAP_IPC_LOCK);
 	struct mm_struct *mm;
 	int ret;
-	bool rsvd;
 
 	mm = get_task_mm(dma->task);
 	if (!mm)
 		return -ENODEV;
 
 	ret = vaddr_get_pfn(mm, vaddr, dma->prot, pfn_base);
-	if (ret)
-		goto pin_page_exit;
-
-	rsvd = is_invalid_reserved_pfn(*pfn_base);
-	limit = task_rlimit(dma->task, RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-
-	if (!rsvd && !lock_cap && mm->locked_vm + 1 > limit) {
-		put_pfn(*pfn_base, dma->prot);
-		pr_warn("%s: Task %s (%d) RLIMIT_MEMLOCK (%ld) exceeded\n",
-			__func__, dma->task->comm, task_pid_nr(dma->task),
-			limit << PAGE_SHIFT);
-		ret = -ENOMEM;
-		goto pin_page_exit;
-	}
-
-	if (!rsvd && do_accounting) {
-		ret = vfio_lock_acct(dma->task, 1, &lock_cap);
+	if (!ret && do_accounting && !is_invalid_reserved_pfn(*pfn_base)) {
+		ret = vfio_lock_acct(dma->task, 1, NULL);
 		if (ret) {
 			put_pfn(*pfn_base, dma->prot);
-			goto pin_page_exit;
+			if (ret == -ENOMEM)
+				pr_warn("%s: Task %s (%d) RLIMIT_MEMLOCK "
+					"(%ld) exceeded\n", __func__,
+					dma->task->comm, task_pid_nr(dma->task),
+					task_rlimit(dma->task, RLIMIT_MEMLOCK));
 		}
 	}
 
-	ret = 1;
-
-pin_page_exit:
 	mmput(mm);
 	return ret;
 }
@@ -598,10 +581,8 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 
 		remote_vaddr = dma->vaddr + iova - dma->iova;
 		ret = vfio_pin_page_external(dma, remote_vaddr, &phys_pfn[i],
 					     do_accounting);
-		if (ret <= 0) {
-			WARN_ON(!ret);
+		if (ret)
 			goto pin_unwind;
-		}
 
 		ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]);
 		if (ret) {
Re: bug? dwc2: insufficient fifo memory
On Fri, Feb 24, 2017 at 2:46 PM, John Stultz wrote:
> Hey John,
> So after the USB tree landed in 4.11-rc, I've been seeing the
> following warning at bootup.
>
> It seems the fifo_mem/total_fifo_size value on hikey is 1920, but I'm
> seeing the addresses zip upward quickly as the txfsz values are all
> 2048. Not exactly sure what's wrong here. Things still seem to work
> properly.
>
> thanks
> -john
>
> [8.944987] dwc2 f72c.usb: bound driver configfs-gadget
> [8.956651] insufficient fifo memory
> [8.956703] [ cut here ]
> [8.964906] WARNING: CPU: 7 PID: 1 at drivers/usb/dwc2/gadget.c:330
> dwc2_hsotg_init_fifo+0x1a8/0x1c8

Hey John,

So I finally got a bit of time to look deeper into this, and it seems like this issue was introduced by commit 3c6aea7344c3 ("usb: dwc2: gadget: Add checking for g-tx-fifo-size parameter"), as that change added the following snippet:

	if (hsotg->params.g_tx_fifo_size[fifo] < min ||
	    hsotg->params.g_tx_fifo_size[fifo] > dptxfszn) {
		dev_warn(hsotg->dev,
			 "%s: Invalid parameter g_tx_fifo_size[%d]=%d\n",
			 __func__, fifo,
			 hsotg->params.g_tx_fifo_size[fifo]);
		hsotg->params.g_tx_fifo_size[fifo] = dptxfszn;
	}

On HiKey, we have g-tx-fifo-size = <128 128 128 128 128 128> in the dtsi, and the fifo_mem value ends up initialized at 1920. Unfortunately, in the above, it sees other entries in the g_tx_fifo_size[] array are initialized to zero, which then causes them to be set to the "default" value of dptxfszn, which is 2048. So then later in dwc2_hsotg_init_fifo() the addr value (which adds the fifo_size array value each step) quickly grows beyond the fifo_mem value, causing the warning.

Not sure what the right fix is here? Should the min value be used instead of the "default" dptxfszn value? Again, I'm not sure I see any side-effects from this warning, but wanted to try to figure out how to fix it properly.

thanks
-john
Re: [PATCH] ACPICA: Export mutex functions
On Mon, Apr 17, 2017 at 11:29:38PM +0200, Rafael J. Wysocki wrote: > On Mon, Apr 17, 2017 at 11:03 PM, Guenter Roeck wrote: > > On Mon, Apr 17, 2017 at 08:40:38PM +, Moore, Robert wrote: > >> > >> > >> > -Original Message- > >> > From: Guenter Roeck [mailto:li...@roeck-us.net] > >> > Sent: Monday, April 17, 2017 12:45 PM > >> > To: Moore, Robert > >> > Cc: Zheng, Lv ; Wysocki, Rafael J > >> > ; 'Len Brown' ; 'linux- > >> > a...@vger.kernel.org' ; 'de...@acpica.org' > >> > ; 'linux-kernel@vger.kernel.org' >> > ker...@vger.kernel.org>; Box, David E > >> > Subject: Re: [PATCH] ACPICA: Export mutex functions > >> > > >> > On Mon, Apr 17, 2017 at 07:27:37PM +, Moore, Robert wrote: > >> > > > >> > > > -Original Message- > >> > > > From: Moore, Robert > >> > > > Sent: Monday, April 17, 2017 10:13 AM > >> > > > To: Guenter Roeck ; Zheng, Lv > >> > > > > >> > > > Cc: Wysocki, Rafael J ; Len Brown > >> > > > ; linux-a...@vger.kernel.org; de...@acpica.org; > >> > > > linux- ker...@vger.kernel.org > >> > > > Subject: RE: [PATCH] ACPICA: Export mutex functions > >> > > > > >> > > > There is a model for the drivers to directly acquire an AML mutex > >> > > > object. That is why the acquire/release public interfaces were added > >> > > > to ACPICA. > >> > > > > >> > > > I forget all of the details, but the model was developed with MS and > >> > > > others during the ACPI 6.0 timeframe. > >> > > > > >> > > > > >> > > [Moore, Robert] > >> > > > >> > > > >> > > Here is the case where the OS may need to directly acquire an AML > >> > mutex: > >> > > > >> > > From the ACPI spec: > >> > > > >> > > 19.6.2 Acquire (Acquire a Mutex) > >> > > > >> > > Note: For Mutex objects referenced by a _DLM object, the host OS may > >> > also contend for ownership. 
> >> > > > >> > From the context in the dsdt, and from description of expected use cases > >> > for _DLM objects I can find, this is what the mutex is used for (to > >> > serialize access to a resource on a low pin count serial interconnect, > >> > aka LPC). > >> > > >> > What does that mean in practice ? That I am not supposed to use it > >> > because it doesn't follow standard ACPI mutex declaration rules ? > >> > > >> > Thanks, > >> > Guenter > >> > > >> > > > >> [Moore, Robert] > >> > >> I'm not an expert on the _DLM method, but I would point you to the > >> description section in the ACPI spec, 5.7.5 _DLM (DeviceLock Mutex). > >> > > > > I did. However, not being an ACPI expert, that doesn't tell me anything. > > Basically, if the kernel and AML need to access a device concurrently, > there should be a _DLM object under that device in the ACPI tables. > In that case it is expected to return a list of (AML) mutexes that can > be acquired by the kernel in order to synchronize device access with > respect to AML (and for each mutex it may also return a description of > the specific resources to be protected by it). > > Bottom line: without _DLM, the kernel cannot synchronize things with > respect to AML properly, because it has no information how to do that > then. That is all quite interesting. I do see the mutex in question used on various motherboards from various vendors (I checked boards from Gigabyte, MSI, and Intel). Interestingly, the naming seems to be consistent - it is always named "MUT0". For the most part, it seems to be available on more recent motherboards; older motherboards tend to use the resource without locking. However, I don't see any mention of "_DLM" in any of the DSDTs. At the same time, access to ports 0x2e/0x2f is widely used in the kernel. As mentioned before, it is used in watchdog, hardware monitoring, and gpio drivers, but also in parallel port and infrared driver code. 
Effectively that means that all this code is inherently unsafe on systems with ACPI support. I had thought about implementing a set of utility functions that acquire the mutex, if it is found to exist, so that the affected kernel code becomes safe to use. Right now I wonder, though, whether such code would have a chance of being accepted. Any thoughts on that? Thanks, Guenter > > Thanks, > Rafael
Re: [PATCH] slab: avoid IPIs when creating kmem caches
On Sun, 16 Apr 2017, Greg Thelen wrote:

> Each slab kmem cache has per cpu array caches. The array caches are
> created when the kmem_cache is created, either via kmem_cache_create()
> or lazily when the first object is allocated in context of a kmem
> enabled memcg. Array caches are replaced by writing to /proc/slabinfo.
>
> Array caches are protected by holding slab_mutex or disabling
> interrupts. Array cache allocation and replacement is done by
> __do_tune_cpucache() which holds slab_mutex and calls
> kick_all_cpus_sync() to interrupt all remote processors which confirms
> there are no references to the old array caches.
>
> IPIs are needed when replacing array caches. But when creating a new
> array cache, there's no need to send IPIs because there cannot be any
> references to the new cache. Outside of memcg kmem accounting these
> IPIs occur at boot time, so they're not a problem. But with memcg kmem
> accounting each container can create kmem caches, so the IPIs are
> wasteful.
>
> Avoid unnecessary IPIs when creating array caches.
>
> Test which reports the IPI count of allocating slab in 1 memcg:
>
> import os
>
> def ipi_count():
>     with open("/proc/interrupts") as f:
>         for l in f:
>             if 'Function call interrupts' in l:
>                 return int(l.split()[1])
>
> def echo(val, path):
>     with open(path, "w") as f:
>         f.write(val)
>
> n = 1
> os.chdir("/mnt/cgroup/memory")
> pid = str(os.getpid())
> a = ipi_count()
> for i in range(n):
>     os.mkdir(str(i))
>     echo("1G\n", "%d/memory.limit_in_bytes" % i)
>     echo("1G\n", "%d/memory.kmem.limit_in_bytes" % i)
>     echo(pid, "%d/cgroup.procs" % i)
>     open("/tmp/x", "w").close()
>     os.unlink("/tmp/x")
> b = ipi_count()
> print "%d loops: %d => %d (+%d ipis)" % (n, a, b, b-a)
> echo(pid, "cgroup.procs")
> for i in range(n):
>     os.rmdir(str(i))
>
> patched: 1 loops: 1069 => 1170 (+101 ipis)
> unpatched: 1 loops: 1192 => 48933 (+47741 ipis)
>
> Signed-off-by: Greg Thelen

Acked-by: David Rientjes
Re: [PATCH] hidma_mgmt_sys: Use devm_kmalloc_array() in hidma_mgmt_init_sys()
On 4/17/2017 4:55 PM, SF Markus Elfring wrote:
> From: Markus Elfring
> Date: Mon, 17 Apr 2017 22:42:40 +0200
>
> * A multiplication for the size determination of a memory allocation
>   indicated that an array data structure should be processed.
>   Thus use the corresponding function "devm_kmalloc_array".
>
>   This issue was detected by using the Coccinelle software.
>
> * Delete the local variable "required" which became unnecessary
>   with this refactoring.
>
> Signed-off-by: Markus Elfring
> ---
> drivers/dma/qcom/hidma_mgmt_sys.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma/qcom/hidma_mgmt_sys.c b/drivers/dma/qcom/hidma_mgmt_sys.c
> index d61f1068a34b..e7222a0e9ba4 100644
> --- a/drivers/dma/qcom/hidma_mgmt_sys.c
> +++ b/drivers/dma/qcom/hidma_mgmt_sys.c
> @@ -245,11 +245,12 @@ int hidma_mgmt_init_sys(struct hidma_mgmt_dev *mdev)
> {
> 	unsigned int i;
> 	int rc;
> -	int required;
> 	struct kobject *chanops;
>
> -	required = sizeof(*mdev->chroots) * mdev->dma_channels;
> -	mdev->chroots = devm_kmalloc(&mdev->pdev->dev, required, GFP_KERNEL);
> +	mdev->chroots = devm_kmalloc_array(&mdev->pdev->dev,
> +					   mdev->dma_channels,
> +					   sizeof(*mdev->chroots),
> +					   GFP_KERNEL);
> 	if (!mdev->chroots)
> 		return -ENOMEM;
>

Reviewed-by: Sinan Kaya

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH 2/2] iio: adc: Allow setting Shunt Voltage PGA gain and Bus Voltage range
On Friday, 14 April 2017 17:12:03 CEST Jonathan Cameron wrote:
> On 12/04/17 04:01, Stefan Brüns wrote:
> > Reducing shunt and bus voltage range improves the accuracy, so allow
> > altering the default settings.
> >
> > Signed-off-by: Stefan Brüns
>
> Hi Stefan,
>
> There is new userspace ABI in here, so starting point is to document that.
> That would allow the discussion of whether it is the right ABI to begin.
>
> In particular, can one of these at least be rolled into the standard
> scale attributes that are already supported by the driver?
> It looks to me like they both probably can - perhaps in conjunction with
> use of the _available callback to notify userspace the range available from
> _raw, thus allowing easy computation of the range you are providing.
>
> Keeping new ABI to a minimum makes life a lot easier for userspace tooling!
>
> I particularly love the way it's described in the datasheet as a gain
> for the shunt voltage but a range for the bus voltage - despite being the
> same PGA (at least in the symbolic representation).

Unfortunately, correct use of raw and scale is somewhat underdocumented. I would expect the raw values to reflect the value read from the device, unaltered. For the INA226, all value registers are 16-bit, while for the INA219 the voltage register is 13-bit (MSB-aligned; the lowest 3 bits of the register are masked), and the other 3 registers are 16-bit as well. The INA219 incorporates the bus range and shunt voltage gain in the register value, i.e. the shunt voltage value 0x0100 always corresponds to 256 * 10uV, irrespective of the PGA setting.

I think it's a bad idea to incorporate the gain settings into the scale attribute:

1. Raw values would no longer be raw values
2. Scale for the INA219 would be settable, but read-only for the INA226
3. If the device has a gain setting, it should be exposed as such, and names should correspond to the datasheet
4. Any user of the gain settings would have to be made aware of the possibility to change it, no matter how it is exposed.

Making it part of the scale, and thus changing the meaning of the raw values, would be breaking the existing ABI.

Kind regards,

Stefan

-- 
Stefan Brüns / Bergstraße 21 / 52062 Aachen
home: +49 241 53809034 mobile: +49 151 50412019
work: +49 2405 49936-424
Re: RFC: WMI Enhancements
On Fri, Apr 14, 2017 at 4:05 PM, Darren Hart wrote:
> On Sat, Apr 15, 2017 at 12:45:30AM +0200, Rafael Wysocki wrote:
>> On Wednesday, April 12, 2017 04:08:54 PM Darren Hart wrote:
>> > Hi All,
>> >
>> > There are a few parallel efforts involving the Windows Management
>> > Instrumentation (WMI)[1] and dependent/related drivers. I'd like to have
>> > a round of discussion among those of you that have been involved in this
>> > space before we decide on a direction.
>> >
>> > The WMI support in the kernel today fairly narrowly supports a handful
>> > of systems. Andy L. has a work-in-progress series [2] which converts wmi
>> > into a platform device and a proper bus, providing devices for dependent
>> > drivers to bind to, and a mechanism for sibling devices to communicate
>> > with each other. I've reviewed the series and feel the approach is
>> > sound; I plan to carry this series forward and merge it (with Andy L's
>> > permission).
>> >
>> > Are there any objections to this?
>> >
>> > In Windows, applications interact with WMI more or less directly. We
>> > don't do this in Linux currently, although it has been discussed in the
>> > past [3]. Some vendors will work around this by performing SMI/SMM,
>> > which is inefficient at best. Exposing WMI methods to userspace would
>> > bring parity to WMI for Linux and Windows.
>> >
>> > There are two principal concerns I'd appreciate your thoughts on:
>> >
>> > a) As an undiscoverable interface (you need to know the method
>> > signatures ahead of time), universally exposing every WMI "device" to
>> > userspace seems like "a bad idea" from a security and stability
>> > perspective. While access would certainly be privileged, it seems more
>> > prudent to make this exposure opt-in. We also handle some of this with
>> > kernel drivers, and exposing those "devices" to userspace would enable
>> > userspace and the kernel to fight over control. So - if we expose WMI
>> > devices to userspace, I believe this should be done on a case by case
>> > basis, opting in, and not by default as part of the WMI driver (although
>> > it can provide the mechanism for a sub-driver to use), and possibly a
>> > devmode to do so by default.
>>
>> A couple of loose thoughts here.
>>
>> In principle there could be a "generic default WMI driver" or similar that
>> would "claim" all WMI "devices" that have not been "claimed" by anyone
>> else and would simply expose them to user space somehow (e.g. using a
>> chardev interface).
>>
>> Then, depending on how that thing is implemented, opt-in etc. should be
>> possible too.
>
> I think we agree this would be an ideal approach.
>
> As we look into this more, it is becoming clear that the necessary
> functionality is not nicely divided into GUIDs for what is necessary in
> userspace and what is handled in the kernel. A single WMI method GUID may
> be needed by userspace for certain functionality, while the kernel drivers
> may use it for something else.
>
> :-(
>
> The input to a WMI method is just a buffer, so it is very free form. One
> approach Mario has mentioned was to audit the userspace WMI method calls in
> the kernel platform drivers and reject those calls with arguments matching
> those issued by the kernel driver. This is likely to be complex and error
> prone in my opinion. However, I have not yet thought of another means to
> meet the requirement of having disjoint feature sets for userspace and
> kernel space via a mechanism that was effectively designed to be used
> solely from user space with vendor defined method signatures.
>
> Next step is to look at just how complex it would be to audit the method
> calls the kernel currently uses.

I'm wondering whether it's really worth it. We already allow doing darned near anything using dcdbas. Maybe the world won't end if we expose a complete-ish ioctl interface to WMI.

Also, dcdbas is, to put it mildly, a bit ridiculous. It seems to be a seriously awkward sysfs interface that allows you to, drumroll please, issue outb and inb instructions. It doesn't even check that it's running on a Dell system. It might be nice to deprecate it some day in favor of a real interface. I'd consider a low-level WMI ioctl interface to be a vast improvement.
Re: [PATCH v8 10/15] ACPI: platform-msi: retrieve dev id from IORT
On 4/17/2017 5:44 PM, Sinan Kaya wrote:
> Any idea what happened to the change in this function during merge?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=ae7c18380495ac5c14a614fdb6c452c3bf9148ac

I realized that there is a V9 out there. I'm catching up with the work.

https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=irq/irqchip-4.12&id=e6db07d0f3b6da1f8cfd485776bfefa4fcdbfc45 seems to be fixing the issue.

> 63a52b3 platform-msi: Make platform_msi_create_device_domain() ACPI aware
> e07b978 irqchip/gicv3-its: platform-msi: Scan MADT to create platform msi domain
> 09be1d5 irqchip/gicv3-its: platform-msi: Refactor its_pmsi_init() to prepare for ACPI
> 4e96df9 irqchip/gicv3-its: platform-msi: Refactor its_pmsi_prepare()
> e593828 irqchip/gic-v3-its: Keep the include header files in alphabetic order
> e226a74 ACPI: platform: setup MSI domain for ACPI based platform device
> 79c7533 ACPI: platform-msi: retrieve devid from IORT
> 43394c9 ACPI/IORT: Introduce iort_node_map_platform_id() to retrieve dev id
> e6eaeae ACPI/IORT: Rename iort_node_map_rid() to make it generic
> 6acbb92 ACPI/IORT: Rework iort_match_node_callback() return value handling
> f8e2f50 ACPI/IORT: Add missing comment for iort_dev_find_its_id()
> 4b1 ACPI/IORT: Fix the indentation in iort_scan_node()

Apologies for the noise.

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH -v2 0/9] mm: make movable onlining suck less
On Tue, Apr 11, 2017 at 10:03 AM, Michal Hocko wrote:
> All the reported issues seem to be fixed and pushed to my git tree
> attempts/rewrite-mem_hotplug branch. I will wait a day or two for more
> feedback and then repost for inclusion. I would really appreciate
> more testing/review!

This still seems to be based on 4.10? It's missing some block-layer fixes and other things that trigger failures in the nvdimm unit tests. Can you rebase to a more recent 4.11-rc?
Re: [PATCH] drm/vc4: Fix refcounting of runtime PM get if it errors out.
Sean Paul writes:
> On Mon, Apr 17, 2017 at 09:26:03AM -0700, Eric Anholt wrote:
>> We were returning without decrementing if the error happened, meaning
>> that at the next submit we wouldn't try to bring up the power domain.
>>
>> Signed-off-by: Eric Anholt
>
> This change looks good to me. It seems a little odd to wrap pm_runtime,
> which is already refcounted, in another refcount, but I'm really not
> familiar with the driver, and I'm sure there's a good reason for it.
>
> Reviewed-by: Sean Paul

Yeah, it's an unfortunate hack because the reset control is buried in the power domain driver on the RPi, so we have to get the power domain actually off (0 refcount) in order to reset the GPU. Yay for building around closed-source firmware :(
Re: [PATCH v8 10/15] ACPI: platform-msi: retrieve dev id from IORT
On 1/18/2017 7:55 AM, Hanjun Guo wrote:
> --- a/drivers/irqchip/irq-gic-v3-its-platform-msi.c
> +++ b/drivers/irqchip/irq-gic-v3-its-platform-msi.c
> @@ -57,7 +57,8 @@ static int its_pmsi_prepare(struct irq_domain *domain, struct device *dev,
>
> 	msi_info = msi_get_domain_info(domain->parent);
>
> -	ret = of_pmsi_get_dev_id(domain, dev, &dev_id);
> +	ret = dev->of_node ? of_pmsi_get_dev_id(domain, dev, &dev_id) :
> +			     iort_pmsi_get_dev_id(dev, &dev_id);
> 	if (ret)
> 		return ret;

Any idea what happened to the change in this function during merge?

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=ae7c18380495ac5c14a614fdb6c452c3bf9148ac

63a52b3 platform-msi: Make platform_msi_create_device_domain() ACPI aware
e07b978 irqchip/gicv3-its: platform-msi: Scan MADT to create platform msi domain
09be1d5 irqchip/gicv3-its: platform-msi: Refactor its_pmsi_init() to prepare for ACPI
4e96df9 irqchip/gicv3-its: platform-msi: Refactor its_pmsi_prepare()
e593828 irqchip/gic-v3-its: Keep the include header files in alphabetic order
e226a74 ACPI: platform: setup MSI domain for ACPI based platform device
79c7533 ACPI: platform-msi: retrieve devid from IORT
43394c9 ACPI/IORT: Introduce iort_node_map_platform_id() to retrieve dev id
e6eaeae ACPI/IORT: Rename iort_node_map_rid() to make it generic
6acbb92 ACPI/IORT: Rework iort_match_node_callback() return value handling
f8e2f50 ACPI/IORT: Add missing comment for iort_dev_find_its_id()
4b1 ACPI/IORT: Fix the indentation in iort_scan_node()

I'm getting these warnings when using the above patches on top of 4.11-rc6. The code is basically trying to read an OF attribute on an ACPI system.
[ 11.644792] [ cut here ]
[ 11.649407] WARNING: CPU: 35 PID: 1 at /local/mnt/workspace/projects/caf/kernel/drivers/irqchip/irq-gic-v3-its-platform-msi.c:41 its_pmsi_prepare+0x9c/0xfc
[ 11.663305] Modules linked in:
[ 11.666353]
[ 11.667839] CPU: 35 PID: 1 Comm: swapper/0 Tainted: G W 4.11.0-00036-g63a52b3 #1
[ 11.676355] Hardware name: (null) (DT)
[ 11.680097] task: 8007dbab8000 task.stack: 8007dbab4000
[ 11.686009] PC is at its_pmsi_prepare+0x9c/0xfc
[ 11.690533] LR is at its_pmsi_prepare+0x5c/0xfc
[ 11.695056] pc : [] lr : [] pstate: 20400045
[ 11.702443] sp : 8007dbab7a40
[ 11.705750] x29: 8007dbab7a40 x28:
[ 11.711057] x27: 092a9000 x26:
[ 11.716364] x25: 8007df02ba00 x24: 08a745a9
[ 11.721670] x23: 08a7459e x22: 000b
[ 11.726976] x21: 8007dbab7b70 x20: 8007d9c48c10
[ 11.732282] x19: 8007dba8ba80 x18: 000a
[ 11.737588] x17: 1000 x16: 0040
[ 11.742895] x15: 0009bb9b x14: 00400041
[ 11.748201] x13: 0140 x12: 0088
[ 11.753507] x11: 08af x10: 094ad000
[ 11.758814] x9 : x8 : 8007d977ac00
[ 11.764120] x7 : x6 : 003f
[ 11.769426] x5 : 8007dbab7a90 x4 :
[ 11.774733] x3 : 0006 x2 : fffe
[ 11.780039] x1 : 8007df02b760 x0 : 08d8fb70
[ 11.785345]
[ 11.786830] ---[ end trace 265ed4f0c6d0486b ]---
[ 11.791439] Call trace:
[ 11.793879] Exception stack(0x8007dbab7870 to 0x8007dbab79a0)
[ 11.800312] 7860: 8007dba8ba80 0001
[ 11.808134] 7880: 8007dbab7a40 08371b68 7e00 0004
[ 11.815956] 78a0: 08af 8200 000ff200 8007dbab8000
[ 11.823779] 78c0: 000ff000 081cb6ec 8007dbab78e0 000ff000
[ 11.831601] 78e0: 0040 80072a80 8007dbab7940 08166530
[ 11.839423] 7900: 8007dbab7960 081984f4 08d8fb70 8007df02b760
[ 11.847245] 7920: fffe 0006 8007dbab7a90
[ 11.855067] 7940: 003f 8007d977ac00
[ 11.862890] 7960: 094ad000 08af 0088 0140
[ 11.870712] 7980: 00400041 0009bb9b 0040 1000
[ 11.878535] [] its_pmsi_prepare+0x9c/0xfc
[ 11.884101] [] msi_domain_prepare_irqs+0x54/0x68
[ 11.890274] [] msi_domain_alloc_irqs+0x30/0x14c
[ 11.896361] [] platform_msi_domain_alloc_irqs+0x54/0x8c
[ 11.903142] [] hidma_probe+0x5ac/0x824
[ 11.908448] [] platform_drv_probe+0x54/0xa4
[ 11.914186] [] driver_probe_device+0x140/0x2a0
[ 11.920185] [] __driver_attach+0x74/0xa0
[ 11.925664] [] bus_for_each_dev+0x68/0x98
[ 11.931229] [] driver_attach+0x20/0x28
[ 11.936534] [] bus_add_driver+0xe0/0x1ec
[ 11.942012] [] driver_register+0x90/0xdc
[ 11.947490] [] __platform_driver_register+0x48/0x50
[ 11.953924] [] hidma_driver_init+0x18/0x20 [
Re: [GIT PULL] KEYS: Blacklisting, restrictions and DH
On Wed, 12 Apr 2017, David Howells wrote: > > Hi James, > > Could you pull these changes into security/next please: > > (1) Provide a blacklist keyring and a blacklist key type such that X.509 > keys and PKCS#7 certs can be blacklisted. It is possible to load the > blacklist from a file at compile time. A future patch will > additionally load the blacklist from the UEFI blacklist if available. > > (2) Make it possible to create a userspace keyring and to apply a > restriction to it such that no new keys can be added unless they meet > the criteria. > > (3) Add SP800-56A KDF support for the DH operation. > Pulled, thanks. -- James Morris
Re: [PATCH v4 1/2] vfio/type1: Remove locked page accounting workqueue
On Tue, 18 Apr 2017 01:02:12 +0530, Kirti Wankhede wrote:
> On 4/18/2017 12:49 AM, Alex Williamson wrote:
> > On Tue, 18 Apr 2017 00:35:06 +0530
> > Kirti Wankhede wrote:
> >
> >> On 4/17/2017 8:02 PM, Alex Williamson wrote:
> >>> On Mon, 17 Apr 2017 14:47:54 +0800
> >>> Peter Xu wrote:
> >>>
> >>>> On Sun, Apr 16, 2017 at 07:42:27PM -0600, Alex Williamson wrote:
> > [...]
> >>>>> -static void vfio_lock_acct(struct task_struct *task, long npage)
> >>>>> +static int vfio_lock_acct(struct task_struct *task, long npage, bool lock_cap)
> >>>>>  {
> >>>>> -	struct vwork *vwork;
> >>>>>  	struct mm_struct *mm;
> >>>>>  	bool is_current;
> >>>>> +	int ret;
> >>>>>
> >>>>>  	if (!npage)
> >>>>> -		return;
> >>>>> +		return 0;
> >>>>>
> >>>>>  	is_current = (task->mm == current->mm);
> >>>>>
> >>>>>  	mm = is_current ? task->mm : get_task_mm(task);
> >>>>>  	if (!mm)
> >>>>> -		return; /* process exited */
> >>>>> +		return -ESRCH; /* process exited */
> >>>>>
> >>>>> -	if (down_write_trylock(&mm->mmap_sem)) {
> >>>>> -		mm->locked_vm += npage;
> >>>>> -		up_write(&mm->mmap_sem);
> >>>>> -		if (!is_current)
> >>>>> -			mmput(mm);
> >>>>> -		return;
> >>>>> -	}
> >>>>> +	ret = down_write_killable(&mm->mmap_sem);
> >>>>> +	if (!ret) {
> >>>>> +		if (npage < 0 || lock_cap) {
> >>>>
> >>>> Nit: maybe we can avoid passing in lock_cap in all the callers of
> >>>> vfio_lock_acct() and fetch it via has_capability() only if npage < 0?
> >>>> IMHO that'll keep the vfio_lock_acct() interface cleaner, and we won't
> >>>> need to pass in "false" any time when doing unpins.
> >>>
> >>> Unfortunately vfio_pin_pages_remote() needs to know about lock_cap
> >>> since it tests whether the user is exceeding their locked memory
> >>> limit. The other callers could certainly get away with
> >>> vfio_lock_acct() testing the capability itself but that would add a
> >>> redundant call for the most common user. I'm not a big fan of passing
> >>> a lock_cap bool either, but it seemed the best fix for now. The
> >>> cleanest alternative I can come up with is this (untested):
> >>>
> >>
> >> In my opinion, passing 'bool lock_cap' looks much cleaner and simpler.
> >>
> >> Reviewed-by: Kirti Wankhede
> >
> > Well shoot, I was just starting to warm up to the bool*. I like that
> > we're not presuming the polarity for the callers we expect to be
> > removing pages, and I generally just dislike passing fixed bool
> > parameters to change the function behavior. I've cleaned it up a bit
> > further and was starting to do some testing on this which I'd propose
> > for v5. Does it change your opinion?

If passing a fixed bool parameter is the concern, then I would lean towards Peter's suggestion. vfio_pin_pages_remote() will check the lock capability outside vfio_lock_acct() and again in vfio_lock_acct(). At other places, it will be taken care of within vfio_lock_acct().

Sorry, I don't see that as a viable option. Testing for CAP_IPC_LOCK in both vfio_pin_pages_remote() and vfio_lock_acct() results in over a 10% performance hit on the mapping path with a custom micro-benchmark. In fact, it suggests we should probably pass that from even higher in the call stack. Thanks,

Alex

> > commit cd61c5f507d614ac14b75b0a548c8738deff88ea
> > Author: Alex Williamson
> > Date: Thu Apr 13 14:10:15 2017 -0600
> >
> >     vfio/type1: Remove locked page accounting workqueue
> >
> >     If the mmap_sem is contended then the vfio type1 IOMMU backend will
> >     defer locked page accounting updates to a workqueue task. This has a
> >     few problems and depending on which side the user tries to play, they
> >     might be over-penalized for unmaps that haven't yet been accounted or
> >     race the workqueue to enter more mappings than they're allowed. The
> >     original intent of this workqueue mechanism seems to be focused on
> >     reducing latency through the ioctl, but we cannot do so at the cost
> >     of correctness. Remove this workqueue mechanism and update the
> >     callers to allow for failure. We can also now recheck the limit under
> >     write lock to make sure we don't exceed it.
> >
> >     vfio_pin_pages_remote() also now necessarily includes an unwind path
> >     which we can jump to directly if the consecutive page pinning finds
> >     that we're exceeding the user's memory limits. This avoids the
> >     current lazy approach which does accounting and mapping up to the
> >     fault, only to return an error on the next iteration to unwind the
> >     entire vfio_dma.
> >
> >     Cc: sta...@vger.kernel.org
> >     Signed-off-by: Alex Williamson
Re: [PATCH v4 1/2] vfio/type1: Remove locked page accounting workqueue
On Tue, 18 Apr 2017 01:02:12 +0530 Kirti Wankhede wrote: > On 4/18/2017 12:49 AM, Alex Williamson wrote: > > On Tue, 18 Apr 2017 00:35:06 +0530 > > Kirti Wankhede wrote: > > > >> On 4/17/2017 8:02 PM, Alex Williamson wrote: > >>> On Mon, 17 Apr 2017 14:47:54 +0800 > >>> Peter Xu wrote: > >>> > On Sun, Apr 16, 2017 at 07:42:27PM -0600, Alex Williamson wrote: > > [...] > > > -static void vfio_lock_acct(struct task_struct *task, long npage) > > +static int vfio_lock_acct(struct task_struct *task, long npage, bool > > lock_cap) > > { > > - struct vwork *vwork; > > struct mm_struct *mm; > > bool is_current; > > + int ret; > > > > if (!npage) > > - return; > > + return 0; > > > > is_current = (task->mm == current->mm); > > > > mm = is_current ? task->mm : get_task_mm(task); > > if (!mm) > > - return; /* process exited */ > > + return -ESRCH; /* process exited */ > > > > - if (down_write_trylock(>mmap_sem)) { > > - mm->locked_vm += npage; > > - up_write(>mmap_sem); > > - if (!is_current) > > - mmput(mm); > > - return; > > - } > > + ret = down_write_killable(>mmap_sem); > > + if (!ret) { > > + if (npage < 0 || lock_cap) { > > Nit: maybe we can avoid passing in lock_cap in all the callers of > vfio_lock_acct() and fetch it via has_capability() only if npage < 0? > IMHO that'll keep the vfio_lock_acct() interface cleaner, and we won't > need to pass in "false" any time when doing unpins. > >>> > >>> Unfortunately vfio_pin_pages_remote() needs to know about lock_cap > >>> since it tests whether the user is exceeding their locked memory > >>> limit. The other callers could certainly get away with > >>> vfio_lock_acct() testing the capability itself but that would add a > >>> redundant call for the most common user. I'm not a big fan of passing > >>> a lock_cap bool either, but it seemed the best fix for now. The > >>> cleanest alternative I can up with is this (untested): > >>> > >> > >> In my opinion, passing 'bool lock_cap' looks much clean and simple. 
> >> > >> Reviewed-by: Kirti Wankhede > > > > Well shoot, I was just starting to warm up to the bool*. I like that > > we're not presuming the polarity for the callers we expect to be > > removing pages and I generally just dislike passing fixed bool > > parameters to change the function behavior. I've cleaned it up a bit > > further and was starting to do some testing on this which I'd propose > > for v5. Does it change your opinion? > > If passing a fixed bool parameter is the concern then I would lean towards > Peter's suggestion. vfio_pin_pages_remote() will check lock capability > outside vfio_lock_acct() and again in vfio_lock_acct(). At other places, > it will be taken care of within vfio_lock_acct(). Sorry, I don't see that as a viable option. Testing for CAP_IPC_LOCK in both vfio_pin_pages_remote() and vfio_lock_acct() results in over a 10% performance hit on the mapping path with a custom micro-benchmark. In fact, it suggests we should probably pass that from even higher in the call stack. Thanks, Alex > > > > commit cd61c5f507d614ac14b75b0a548c8738deff88ea > > Author: Alex Williamson > > Date: Thu Apr 13 14:10:15 2017 -0600 > > > > vfio/type1: Remove locked page accounting workqueue > > > > If the mmap_sem is contended then the vfio type1 IOMMU backend will > > defer locked page accounting updates to a workqueue task. This has a > > few problems and depending on which side the user tries to play, they > > might be over-penalized for unmaps that haven't yet been accounted or > > race the workqueue to enter more mappings than they're allowed. The > > original intent of this workqueue mechanism seems to be focused on > > reducing latency through the ioctl, but we cannot do so at the cost > > of correctness. Remove this workqueue mechanism and update the > > callers to allow for failure. We can also now recheck the limit under > > write lock to make sure we don't exceed it. 
> > > > vfio_pin_pages_remote() also now necessarily includes an unwind path > > which we can jump to directly if the consecutive page pinning finds > > that we're exceeding the user's memory limits. This avoids the > > current lazy approach which does accounting and mapping up to the > > fault, only to return an error on the next iteration to unwind the > > entire vfio_dma. > > > > Cc: sta...@vger.kernel.org > > Signed-off-by: Alex Williamson > > > > diff --git a/drivers/vfio/vfio_iommu_type1.c > > b/drivers/vfio/vfio_iommu_type1.c > > index
Re: [PATCH] ACPICA: Export mutex functions
On Mon, Apr 17, 2017 at 11:03 PM, Guenter Roeck wrote: > On Mon, Apr 17, 2017 at 08:40:38PM +, Moore, Robert wrote: >> >> >> > -Original Message- >> > From: Guenter Roeck [mailto:li...@roeck-us.net] >> > Sent: Monday, April 17, 2017 12:45 PM >> > To: Moore, Robert >> > Cc: Zheng, Lv ; Wysocki, Rafael J >> > ; 'Len Brown' ; 'linux- >> > a...@vger.kernel.org' ; 'de...@acpica.org' >> > ; 'linux-kernel@vger.kernel.org' > > ker...@vger.kernel.org>; Box, David E >> > Subject: Re: [PATCH] ACPICA: Export mutex functions >> > >> > On Mon, Apr 17, 2017 at 07:27:37PM +, Moore, Robert wrote: >> > > >> > > > -Original Message- >> > > > From: Moore, Robert >> > > > Sent: Monday, April 17, 2017 10:13 AM >> > > > To: Guenter Roeck ; Zheng, Lv >> > > > >> > > > Cc: Wysocki, Rafael J ; Len Brown >> > > > ; linux-a...@vger.kernel.org; de...@acpica.org; >> > > > linux- ker...@vger.kernel.org >> > > > Subject: RE: [PATCH] ACPICA: Export mutex functions >> > > > >> > > > There is a model for the drivers to directly acquire an AML mutex >> > > > object. That is why the acquire/release public interfaces were added >> > > > to ACPICA. >> > > > >> > > > I forget all of the details, but the model was developed with MS and >> > > > others during the ACPI 6.0 timeframe. >> > > > >> > > > >> > > [Moore, Robert] >> > > >> > > >> > > Here is the case where the OS may need to directly acquire an AML >> > mutex: >> > > >> > > From the ACPI spec: >> > > >> > > 19.6.2 Acquire (Acquire a Mutex) >> > > >> > > Note: For Mutex objects referenced by a _DLM object, the host OS may >> > also contend for ownership. >> > > >> > From the context in the dsdt, and from description of expected use cases >> > for _DLM objects I can find, this is what the mutex is used for (to >> > serialize access to a resource on a low pin count serial interconnect, >> > aka LPC). >> > >> > What does that mean in practice ? That I am not supposed to use it >> > because it doesn't follow standard ACPI mutex declaration rules ? 
>> > >> > Thanks, >> > Guenter >> > >> > > >> [Moore, Robert] >> >> I'm not an expert on the _DLM method, but I would point you to the >> description section in the ACPI spec, 5.7.5 _DLM (DeviceLock Mutex). >> > > I did. However, not being an ACPI expert, that doesn't tell me anything. Basically, if the kernel and AML need to access a device concurrently, there should be a _DLM object under that device in the ACPI tables. In that case it is expected to return a list of (AML) mutexes that can be acquired by the kernel in order to synchronize device access with respect to AML (and for each mutex it may also return a description of the specific resources to be protected by it). Bottom line: without _DLM, the kernel cannot synchronize things with respect to AML properly, because it has no information how to do that then. Thanks, Rafael
[tip:WIP.timers 8/10] kernel//time/timer_migration.c:190:4: error: implicit declaration of function 'timer_expire_remote'
tree: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers head: c0b7a5dbb870d1660aa5e566c5ce9972290a2bed commit: 6a3164fa4cd35a587b5bb2e4bd86b75900af8286 [8/10] timer: Implement the hierarchical pull model config: i386-randconfig-r0-201716 (attached as .config) compiler: gcc-5 (Debian 5.4.1-2) 5.4.1 20160904 reproduce: git checkout 6a3164fa4cd35a587b5bb2e4bd86b75900af8286 # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): In file included from kernel//time/timer_migration.c:16:0: kernel//time/timer_migration.h: In function 'tmigr_cpu_idle': kernel//time/timer_migration.h:26:1: warning: no return statement in function returning non-void [-Wreturn-type] static inline u64 tmigr_cpu_idle(u64 nextevt) { } ^ kernel//time/timer_migration.c: In function '__tmigr_handle_remote': >> kernel//time/timer_migration.c:190:4: error: implicit declaration of >> function 'timer_expire_remote' [-Werror=implicit-function-declaration] timer_expire_remote(evt->cpu); ^ kernel//time/timer_migration.c: At top level: >> kernel//time/timer_migration.c:223:6: error: redefinition of >> 'tmigr_handle_remote' void tmigr_handle_remote(void) ^ In file included from kernel//time/timer_migration.c:16:0: kernel//time/timer_migration.h:25:20: note: previous definition of 'tmigr_handle_remote' was here static inline void tmigr_handle_remote(void) { } ^ >> kernel//time/timer_migration.c:348:5: error: redefinition of 'tmigr_cpu_idle' u64 tmigr_cpu_idle(u64 nextevt) ^ In file included from kernel//time/timer_migration.c:16:0: kernel//time/timer_migration.h:26:19: note: previous definition of 'tmigr_cpu_idle' was here static inline u64 tmigr_cpu_idle(u64 nextevt) { } ^ >> kernel//time/timer_migration.c:406:6: error: redefinition of >> 'tmigr_cpu_activate' void tmigr_cpu_activate(void) ^ In file included from kernel//time/timer_migration.c:16:0: kernel//time/timer_migration.h:27:20: note: previous definition of 'tmigr_cpu_activate' was here 
static inline void tmigr_cpu_activate(void) { } ^ cc1: some warnings being treated as errors vim +/timer_expire_remote +190 kernel//time/timer_migration.c 10 #include 11 #include 12 #include 13 #include 14 #include 15 > 16 #include "timer_migration.h" 17 #include "tick-internal.h" 18 19 #ifdef DEBUG 20 # define DBG_BUG_ON(x) BUG_ON(x) 21 #else 22 # define DBG_BUG_ON(x) 23 #endif 24 25 /* Per group capacity. Must be a power of 2! */ 26 static const unsigned int tmigr_childs_per_group = 8; 27 28 bool tmigr_enabled __read_mostly; 29 static unsigned int tmigr_hierarchy_levels __read_mostly; 30 static unsigned int tmigr_crossnode_level __read_mostly; 31 static struct list_head *tmigr_level_list __read_mostly; 32 33 static DEFINE_MUTEX(tmigr_mutex); 34 35 static DEFINE_PER_CPU(struct tmigr_cpu, tmigr_cpu); 36 37 static void tmigr_add_evt(struct tmigr_group *group, struct tmigr_event *evt) 38 { 39 /* 40 * Can be called with @evt == NULL, an already queued @evt or 41 * an event that does not need to be queued (expires == 42 * KTIME_MAX) 43 */ 44 if (!evt || !RB_EMPTY_NODE(&evt->nextevt.node) || 45 evt->nextevt.expires == KTIME_MAX) 46 return; 47 48 /* @group->group event must not be queued in the parent group */ 49 DBG_BUG_ON(!RB_EMPTY_NODE(&group->groupevt.nextevt.node)); 50 51 /* If this is the new first to expire event, update group event */ 52 if (timerqueue_add(&group->events, &evt->nextevt)) { 53 group->groupevt.nextevt.expires = evt->nextevt.expires; 54 group->groupevt.cpu = evt->cpu; 55 } 56 } 57 58 static void tmigr_remove_evt(struct tmigr_group *group, struct tmigr_event *evt) 59 { 60 struct timerqueue_node *next; 61 struct tmigr_event *nextevt; 62 bool first; 63 64 /* 65 * It's safe to modify the group event of this group, because it is 66 * not queued in the parent group. 
67 */ 68 DBG_BUG_ON(!RB_EMPTY_NODE(&group->groupevt.nextevt.node)); 69 70 /* Remove the child event, if pending */ 71 if (!evt || RB_EMPTY_NODE(&evt->nextevt.node)) 72 return; 73 /* 74 * If this was the last queued event in the group, clear 75 * the group event. If this was the first event to expire, 76 * update the group. 77 */ 78 first =
[tip:WIP.timers 8/10] kernel/time/timer_migration.h:26:1: warning: no return statement in function returning non-void
tree: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers head: c0b7a5dbb870d1660aa5e566c5ce9972290a2bed commit: 6a3164fa4cd35a587b5bb2e4bd86b75900af8286 [8/10] timer: Implement the hierarchical pull model config: x86_64-randconfig-x012-201716 (attached as .config) compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901 reproduce: git checkout 6a3164fa4cd35a587b5bb2e4bd86b75900af8286 # save the attached .config to linux build tree make ARCH=x86_64 All warnings (new ones prefixed by >>): In file included from kernel/time/timer.c:54:0: kernel/time/timer_migration.h: In function 'tmigr_cpu_idle': >> kernel/time/timer_migration.h:26:1: warning: no return statement in function >> returning non-void [-Wreturn-type] static inline u64 tmigr_cpu_idle(u64 nextevt) { } ^~ vim +26 kernel/time/timer_migration.h 10 #define _KERNEL_TIME_MIGRATION_H 11 12 #ifdef CONFIG_NO_HZ_COMMON 13 extern bool tmigr_enabled; 14 void tmigr_init(void); 15 #else 16 static inline void tmigr_init(void) { } 17 #endif 18 19 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON) 20 extern void tmigr_handle_remote(void); 21 extern u64 tmigr_cpu_idle(u64 nextevt); 22 extern void tmigr_cpu_activate(void); 23 extern void timer_expire_remote(unsigned int cpu); 24 #else 25 static inline void tmigr_handle_remote(void) { } > 26 static inline u64 tmigr_cpu_idle(u64 nextevt) { } 27 static inline void tmigr_cpu_activate(void) { } 28 #endif 29 30 #define TMIGR_NONE (~0U) 31 32 /** 33 * struct tmigr_event - A timer event associated to a CPU or a group 34 * @nextevt: The node to enqueue an event in the group queue --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
On Mon, 2017-04-17 at 10:52 -0600, Logan Gunthorpe wrote: > > On 17/04/17 01:20 AM, Benjamin Herrenschmidt wrote: > > But is it ? For example take a GPU, does it, in your scheme, need an > > additional "p2pmem" child ? Why can't the GPU driver just use some > > helper to instantiate the necessary struct pages ? What does having an > > actual "struct device" child buy you ? > > Yes, in this scheme, it needs an additional p2pmem child. Why is that an > issue? It certainly makes it a lot easier for the user to understand the > p2pmem memory in the system (through the sysfs tree) and reason about > the topology and when to use it. This is important. Is it ? Again, you create a "concept" the user may have no idea about, "p2pmem memory". So now any kind of memory buffer on a device that could be used for p2p but also potentially a bunch of other things becomes special and called "p2pmem" ... > > > 2) In order to create the struct pages we use the ZONE_DEVICE > > > infrastructure which requires a struct device. (See > > > devm_memremap_pages.) > > > > Yup, but you already have one in the actual pci_dev ... What is the > > benefit of adding a second one ? > > But that would tie all of this very tightly to be pci only and may get > hard to differentiate if more users of ZONE_DEVICE crop up who happen to > be using a pci device. But what do you have in p2pmem that somebody benefits from. Again I don't understand what that "p2pmem" device buys you in term of functionality vs. having the device just instantiate the pages. Now having some kind of way to override the dma_ops, yes I do get that, and it could be that this "p2pmem" is typically the way to do it, but at the moment you don't even have that. So I'm a bit at a loss here. > Having a specific class for this makes it very > clear how this memory would be handled. But it doesn't *have* to be. Again, take my GPU example. The fact that a NIC might be able to DMA into it doesn't make it specifically "p2p memory". 
Essentially you are saying that any device that happens to have a piece of mappable "memory" (or something that behaves like it) and can be DMA'ed into should now have that "p2pmem" thing attached to it. Now take an example where that becomes really awkward (it's also a real example of something people want to do). I have a NIC and a GPU, the NIC DMA's data to/from the GPU, but they also want to poke at each other's doorbells, the GPU to kick the NIC into action when data is ready to send, the NIC to poke the GPU when data has been received. Those doorbells are MMIO registers. So now your "p2pmem" device needs to also be laid out on top of those MMIO registers ? It's becoming weird. See, basically, doing peer 2 peer between devices has 3 main challenges today: The DMA API needing struct pages, the MMIO translation issues and the IOMMU translation issues. You seem to create that added device as some kind of "owner" for the struct pages, solving #1, but leave #2 and #3 alone. Now, as I said, it could very well be that having the devmap pointer point to some specific device-type with a well known structure to provide solutions for #2 and #3 such as dma_ops overrides, is indeed the right way to solve these problems. If we go down that path, though, rather than calling it p2pmem I would call it something like dma_target which I find much clearer especially since it doesn't have to be just memory. For the sole case of creating struct pages however, I fail to see the point. > For example, although I haven't > looked into it, this could very well be a point of conflict with HMM. If > they were to use the pci device to populate the dev_pagemap then we > couldn't also use the pci device. I feel it's much better for users of > dev_pagemap to have struct devices they own to avoid such conflicts. If we are going to create some sort of struct dma_target, HMM could potentially just look for the parent if it needs the PCI device. 
> > > This amazingly gets us the get_dev_pagemap > > > architecture which also uses a struct device. So by using a p2pmem > > > device we can go from struct page to struct device to p2pmem device > > > quickly and effortlessly. > > > > Which isn't terribly useful in itself right ? What you care about is > > the "enclosing" pci_dev no ? Or am I missing something ? > > Sure it is. What if we want to someday support p2pmem that's on another bus? But why not directly use that other bus' device in that case ? > > > 3) You wouldn't want to use the pci's struct device because it doesn't > > > really describe what's going on. For example, there may be multiple > > > devices on the pci device in question: eg. an NVME card and some p2pmem. > > > > What is "some p2pmem" ? > > > Or it could be a NIC with some p2pmem. > > > > Again what is "some p2pmem" ? > > Some device local memory intended for use as a DMA target from a > neighbour device or itself. On a PCI device, this would be a BAR, or a > portion of a BAR with memory
Re: [PATCH] ACPICA: Export mutex functions
On Mon, Apr 17, 2017 at 08:40:38PM +, Moore, Robert wrote: > > > > -Original Message- > > From: Guenter Roeck [mailto:li...@roeck-us.net] > > Sent: Monday, April 17, 2017 12:45 PM > > To: Moore, Robert> > Cc: Zheng, Lv ; Wysocki, Rafael J > > ; 'Len Brown' ; 'linux- > > a...@vger.kernel.org' ; 'de...@acpica.org' > > ; 'linux-kernel@vger.kernel.org' > ker...@vger.kernel.org>; Box, David E > > Subject: Re: [PATCH] ACPICA: Export mutex functions > > > > On Mon, Apr 17, 2017 at 07:27:37PM +, Moore, Robert wrote: > > > > > > > -Original Message- > > > > From: Moore, Robert > > > > Sent: Monday, April 17, 2017 10:13 AM > > > > To: Guenter Roeck ; Zheng, Lv > > > > > > > > Cc: Wysocki, Rafael J ; Len Brown > > > > ; linux-a...@vger.kernel.org; de...@acpica.org; > > > > linux- ker...@vger.kernel.org > > > > Subject: RE: [PATCH] ACPICA: Export mutex functions > > > > > > > > There is a model for the drivers to directly acquire an AML mutex > > > > object. That is why the acquire/release public interfaces were added > > > > to ACPICA. > > > > > > > > I forget all of the details, but the model was developed with MS and > > > > others during the ACPI 6.0 timeframe. > > > > > > > > > > > [Moore, Robert] > > > > > > > > > Here is the case where the OS may need to directly acquire an AML > > mutex: > > > > > > From the ACPI spec: > > > > > > 19.6.2 Acquire (Acquire a Mutex) > > > > > > Note: For Mutex objects referenced by a _DLM object, the host OS may > > also contend for ownership. > > > > > From the context in the dsdt, and from description of expected use cases > > for _DLM objects I can find, this is what the mutex is used for (to > > serialize access to a resource on a low pin count serial interconnect, > > aka LPC). > > > > What does that mean in practice ? That I am not supposed to use it > > because it doesn't follow standard ACPI mutex declaration rules ? 
> > > > Thanks, > > Guenter > > > > > > [Moore, Robert] > > I'm not an expert on the _DLM method, but I would point you to the > description section in the ACPI spec, 5.7.5 _DLM (Device Lock Mutex). > I did. However, not being an ACPI expert, that doesn't tell me anything. Guenter > > > > > > > > > > > Other than this case, the OS/drivers should never need to directly > > acquire an AML mutex. > > > Bob > > >
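For readers following the thread: the "acquire/release public interfaces" Bob refers to are ACPICA's acpi_acquire_mutex() and acpi_release_mutex(), which the patch under discussion wants exported. A minimal kernel-side sketch of the _DLM use case (not runnable outside the kernel; the mutex pathname and timeout below are made-up examples, not from any real DSDT):

```c
/* Kernel-context sketch only.  acpi_acquire_mutex() takes a handle or
 * an absolute AML pathname plus a timeout in milliseconds; here the
 * pathname "\\_SB.PCI0.LPCB.MUT0" is a hypothetical mutex referenced
 * by a _DLM package for an LPC-attached resource. */
#include <linux/acpi.h>

static int claim_shared_lpc_resource(void)
{
	acpi_status status;

	status = acpi_acquire_mutex(NULL, "\\_SB.PCI0.LPCB.MUT0", 100);
	if (ACPI_FAILURE(status))
		return -EBUSY;

	/* ... access the device the firmware also touches ... */

	acpi_release_mutex(NULL, "\\_SB.PCI0.LPCB.MUT0");
	return 0;
}
```

Per the spec text quoted above, this contention with firmware over a _DLM-referenced mutex is the one case where a driver legitimately acquires an AML mutex directly.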
Re: [PATCH] device-mapper: Convert printks to pr_ macros
Hi Joe, [auto build test ERROR on v4.9-rc8] [also build test ERROR on next-20170413] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Joe-Perches/device-mapper-Convert-printks-to-pr_-level-macros/20170418-030508 config: i386-allmodconfig (attached as .config) compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901 reproduce: # save the attached .config to linux build tree make ARCH=i386 All error/warnings (new ones prefixed by >>): In file included from fs/nfs/blocklayout/blocklayout.h:35:0, from fs/nfs/blocklayout/dev.c:11: fs/nfs/blocklayout/dev.c: In function 'bl_free_device': >> include/linux/device-mapper.h:536:34: error: expected ')' before >> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ >> include/linux/printk.h:277:18: note: in expansion of macro 'pr_fmt' printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) ^~ >> fs/nfs/blocklayout/dev.c:33:5: note: in expansion of macro 'pr_err' pr_err("failed to unregister PR key.\n"); ^~ fs/nfs/blocklayout/dev.c: In function 'nfs4_block_decode_volume': >> include/linux/device-mapper.h:536:34: error: expected ')' before >> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ include/linux/printk.h:284:19: note: in expansion of macro 'pr_fmt' printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__) ^~ >> fs/nfs/blocklayout/dev.c:81:5: note: in expansion of macro 'pr_info' pr_info("signature too long: %d\n", ^~~ fs/nfs/blocklayout/dev.c: In function 'bl_validate_designator': >> include/linux/device-mapper.h:536:34: error: expected ')' before >> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ >> include/linux/printk.h:277:18: note: in expansion of macro 'pr_fmt' printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) ^~ fs/nfs/blocklayout/dev.c:287:3: note: in expansion of macro 'pr_err' pr_err("pNFS: unsupported designator " ^~ >> include/linux/device-mapper.h:536:34: error: expected ')' before 
>> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ >> include/linux/printk.h:277:18: note: in expansion of macro 'pr_fmt' printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) ^~ fs/nfs/blocklayout/dev.c:294:3: note: in expansion of macro 'pr_err' pr_err("pNFS: invalid designator " ^~ fs/nfs/blocklayout/dev.c: In function 'bl_open_udev_path': >> include/linux/device-mapper.h:536:34: error: expected ')' before >> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ include/linux/printk.h:279:22: note: in expansion of macro 'pr_fmt' printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__) ^~ include/linux/printk.h:280:17: note: in expansion of macro 'pr_warning' #define pr_warn pr_warning ^~ >> fs/nfs/blocklayout/dev.c:320:3: note: in expansion of macro 'pr_warn' pr_warn("pNFS: failed to open device %s (%ld)\n", ^~~ fs/nfs/blocklayout/dev.c: In function 'bl_parse_scsi': >> include/linux/device-mapper.h:536:34: error: expected ')' before >> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ include/linux/printk.h:284:19: note: in expansion of macro 'pr_fmt' printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__) ^~ fs/nfs/blocklayout/dev.c:373:2: note: in expansion of macro 'pr_info' pr_info("pNFS: using block device %s (reservation key 0x%llx)\n", ^~~ >> include/linux/device-mapper.h:536:34: error: expected ')' before >> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ >> include/linux/printk.h:277:18: note: in expansion of macro 'pr_fmt' printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) ^~ fs/nfs/blocklayout/dev.c:378:3: note: in expansion of macro 'pr_err' pr_err("pNFS: block device %s does not support reservations.", ^~ >> include/linux/device-mapper.h:536:34: error: expected ')' before >> 'DM_MSG_PREFIX' #define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt ^ >> include/linux/printk.h:277:18: note: in expansion of macro 'pr_fmt' printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) ^~ 
fs/nfs/blocklayout/dev.c:386:3: note: in expansion of macro 'pr_err' pr_err("pNFS: failed to register key for block device %s.", ^~ -- In file included from
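The failure mode is easier to see in isolation: pr_fmt relies on C string-literal pasting, which only works when DM_MSG_PREFIX expands to a string literal in the including file. A minimal userspace sketch (the prefix value "table" is assumed for illustration):

```c
#include <assert.h>
#include <string.h>

/* Mirrors the device-mapper convention: each .c file defines its own
 * message prefix before the header builds pr_fmt.  Adjacent string
 * literals are pasted by the compiler, so the macro below compiles
 * only if DM_MSG_PREFIX is a string literal at expansion time.
 * In the failing builds, fs/nfs/blocklayout/dev.c picked up this
 * pr_fmt via device-mapper.h without ever defining DM_MSG_PREFIX,
 * leaving a bare identifier inside the literal sequence -- hence
 * "expected ')' before 'DM_MSG_PREFIX'". */
#define DM_NAME "device-mapper"
#define DM_MSG_PREFIX "table"
#define pr_fmt(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt
```

This is why defining pr_fmt in a widely included header is fragile: the macro's correctness depends on every includer's macro environment.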
[PATCH] mac80211: ibss: Fix channel type enum in ieee80211_sta_join_ibss()
cfg80211_chandef_create() expects an 'enum nl80211_channel_type' as channel type, however in ieee80211_sta_join_ibss() NL80211_CHAN_WIDTH_20_NOHT is passed on two occasions, which is of the enum type 'nl80211_chan_width'. Change the value to NL80211_CHAN_NO_HT (20 MHz, non-HT channel) of the channel type enum. Signed-off-by: Matthias Kaehlcke --- net/mac80211/ibss.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/mac80211/ibss.c b/net/mac80211/ibss.c index 98999d3d5262..e957351976a2 100644 --- a/net/mac80211/ibss.c +++ b/net/mac80211/ibss.c @@ -425,7 +425,7 @@ static void ieee80211_sta_join_ibss(struct ieee80211_sub_if_data *sdata, case NL80211_CHAN_WIDTH_5: case NL80211_CHAN_WIDTH_10: cfg80211_chandef_create(&chandef, cbss->channel, - NL80211_CHAN_WIDTH_20_NOHT); + NL80211_CHAN_NO_HT); chandef.width = sdata->u.ibss.chandef.width; break; case NL80211_CHAN_WIDTH_80: @@ -437,7 +437,7 @@ static void ieee80211_sta_join_ibss(struct ieee80211_sub_if_data *sdata, default: /* fall back to 20 MHz for unsupported modes */ cfg80211_chandef_create(&chandef, cbss->channel, - NL80211_CHAN_WIDTH_20_NOHT); + NL80211_CHAN_NO_HT); break; } -- 2.12.2.762.g0e3151a226-goog
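A side note on why the mismatch never misbehaved at runtime: assuming the enumerator order matches include/uapi/linux/nl80211.h (both names are the first entry of their respective enum), the two constants share the value 0, so this is a type-correctness fix rather than a behavioral one. A reduced sketch of the confusion:

```c
#include <assert.h>

/* Reduced mirrors of the two nl80211 enums involved; the full
 * definitions live in include/uapi/linux/nl80211.h.  Each name shown
 * is the first enumerator of its enum, i.e. value 0. */
enum nl80211_channel_type {
	NL80211_CHAN_NO_HT,		/* 20 MHz, non-HT */
	NL80211_CHAN_HT20,
};

enum nl80211_chan_width {
	NL80211_CHAN_WIDTH_20_NOHT,	/* also 20 MHz, non-HT */
	NL80211_CHAN_WIDTH_20,
};

/* Stand-in for cfg80211_chandef_create(), which takes the *type* enum.
 * C converts enums to int implicitly, so passing a nl80211_chan_width
 * value here compiles without complaint -- the compiler never flags
 * the mismatch that this patch fixes. */
static int channel_type_value(enum nl80211_channel_type t)
{
	return (int)t;
}
```

Because C's enum typing is this weak, such bugs stay latent until the enumerators diverge in value.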
[PATCH] sched/rt: minimize rq->lock contention in do_sched_rt_period_timer()
With CONFIG_RT_GROUP_SCHED defined, do_sched_rt_period_timer() sequentially takes each cpu's rq->lock. On a large, busy system, the cumulative time it takes to acquire each lock can be excessive, even triggering a watchdog timeout. If rt_rq->rt_time and rt_rq->rt_nr_running are both zero, this function does nothing while holding the lock, so don't bother taking it at all. Orabug: 25491970 Signed-off-by: Dave Kleikamp Cc: Ingo Molnar Cc: Peter Zijlstra --- kernel/sched/rt.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 9f3e40226dec..ae4a8c529a02 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -840,6 +840,17 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun) int enqueue = 0; struct rt_rq *rt_rq = sched_rt_period_rt_rq(rt_b, i); struct rq *rq = rq_of_rt_rq(rt_rq); + int skip; + + /* +* When span == cpu_online_mask, taking each rq->lock +* can be time-consuming. Try to avoid it when possible. +*/ + raw_spin_lock(&rt_rq->rt_runtime_lock); + skip = !rt_rq->rt_time && !rt_rq->rt_nr_running; + raw_spin_unlock(&rt_rq->rt_runtime_lock); + if (skip) + continue; raw_spin_lock(&rq->lock); if (rt_rq->rt_time) { -- 2.12.2
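The locking shape of the patch can be modeled in userspace: peek under the cheap per-rt_rq lock, and only take the expensive rq->lock when there is actual work. A sketch with pthread mutexes (the struct and field names are illustrative, not the kernel's):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace model of the patch's check-then-skip pattern: a cheap,
 * rarely contended lock guards the two counters; the expensive lock
 * (rq->lock in the kernel) is taken only when they are nonzero. */
struct rt_state {
	pthread_mutex_t runtime_lock;	/* cheap: rt_rq->rt_runtime_lock */
	pthread_mutex_t rq_lock;	/* expensive: rq->lock */
	unsigned long rt_time;
	unsigned long rt_nr_running;
	unsigned long rq_lock_taken;	/* instrumentation for the demo */
};

static void period_timer_tick(struct rt_state *s)
{
	bool skip;

	/* Peek under the cheap lock first... */
	pthread_mutex_lock(&s->runtime_lock);
	skip = !s->rt_time && !s->rt_nr_running;
	pthread_mutex_unlock(&s->runtime_lock);
	if (skip)
		return;		/* ...and never touch rq_lock for idle rqs */

	pthread_mutex_lock(&s->rq_lock);
	s->rq_lock_taken++;
	/* the real function would do its per-rq bandwidth work here */
	pthread_mutex_unlock(&s->rq_lock);
}
```

On a machine where most runqueues have no RT activity, the loop over all CPUs then takes almost no contended locks, which is exactly the watchdog-timeout scenario the patch targets.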
[PATCH] hidma_mgmt_sys: Use devm_kmalloc_array() in hidma_mgmt_init_sys()
From: Markus Elfring Date: Mon, 17 Apr 2017 22:42:40 +0200 * A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "devm_kmalloc_array". This issue was detected by using the Coccinelle software. * Delete the local variable "required" which became unnecessary with this refactoring. Signed-off-by: Markus Elfring --- drivers/dma/qcom/hidma_mgmt_sys.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/dma/qcom/hidma_mgmt_sys.c b/drivers/dma/qcom/hidma_mgmt_sys.c index d61f1068a34b..e7222a0e9ba4 100644 --- a/drivers/dma/qcom/hidma_mgmt_sys.c +++ b/drivers/dma/qcom/hidma_mgmt_sys.c @@ -245,11 +245,12 @@ int hidma_mgmt_init_sys(struct hidma_mgmt_dev *mdev) { unsigned int i; int rc; - int required; struct kobject *chanops; - required = sizeof(*mdev->chroots) * mdev->dma_channels; - mdev->chroots = devm_kmalloc(&mdev->pdev->dev, required, GFP_KERNEL); + mdev->chroots = devm_kmalloc_array(&mdev->pdev->dev, + mdev->dma_channels, + sizeof(*mdev->chroots), + GFP_KERNEL); if (!mdev->chroots) return -ENOMEM; -- 2.12.2
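For background on why the *_array helpers are preferred over an open-coded `sizeof(elem) * count`: they reject size computations that would wrap around, whereas the raw multiplication silently overflows and under-allocates. A userspace analogue of the check (the kernel's version lives behind kmalloc_array() and friends):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace analogue of kmalloc_array()/devm_kmalloc_array(): refuse
 * allocations where n * size would overflow size_t, which in the raw
 * devm_kmalloc(dev, sizeof(*p) * n, ...) form would wrap to a small
 * number and hand back an undersized buffer. */
static void *kmalloc_array_like(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;		/* n * size would overflow */
	return malloc(n * size);
}
```

The Coccinelle rule that flagged this driver looks for exactly that pattern: a multiplication feeding an allocator's size argument.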
[PATCH] soc/fsl/qbman: Disable IRQs for deferred QBMan work
Work for Congestion State Notifications (CSCN) and Message Ring (MR) handling is handled via the workqueue mechanism. This requires the driver to disable those IRQs before scheduling the work and re-enabling it once the work is completed so that the interrupt doesn't continually fire. Signed-off-by: Roy Pledge --- drivers/soc/fsl/qbman/qman.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index 3d891db..18eefc3 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -1344,6 +1344,7 @@ static void qm_congestion_task(struct work_struct *work) if (!qm_mc_result_timeout(&p->p, &mcr)) { spin_unlock(&p->cgr_lock); dev_crit(p->config->dev, "QUERYCONGESTION timeout\n"); + qman_p_irqsource_add(p, QM_PIRQ_CSCI); return; } /* mask out the ones I'm not interested in */ @@ -1358,6 +1359,7 @@ static void qm_congestion_task(struct work_struct *work) if (cgr->cb && qman_cgrs_get(&c, cgr->cgrid)) cgr->cb(p, cgr, qman_cgrs_get(&c, cgr->cgrid)); spin_unlock(&p->cgr_lock); + qman_p_irqsource_add(p, QM_PIRQ_CSCI); } static void qm_mr_process_task(struct work_struct *work) @@ -1417,12 +1419,14 @@ static void qm_mr_process_task(struct work_struct *work) } qm_mr_cci_consume(&p->p, num); + qman_p_irqsource_add(p, QM_PIRQ_MRI); preempt_enable(); } static u32 __poll_portal_slow(struct qman_portal *p, u32 is) { if (is & QM_PIRQ_CSCI) { + qman_p_irqsource_remove(p, QM_PIRQ_CSCI); queue_work_on(smp_processor_id(), qm_portal_wq, &p->congestion_work); } @@ -1434,6 +1438,7 @@ static u32 __poll_portal_slow(struct qman_portal *p, u32 is) } if (is & QM_PIRQ_MRI) { + qman_p_irqsource_remove(p, QM_PIRQ_MRI); queue_work_on(smp_processor_id(), qm_portal_wq, &p->mr_work); } -- 2.7.4
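The choreography the patch sets up (mask the IRQ source in the fast path before queueing the work, re-enable it only when the deferred work finishes) can be modeled single-threaded. A sketch with illustrative names, not the driver's real types:

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the defer-with-masking pattern: without the remove/add
 * pair, a level-triggered source would keep re-raising the interrupt
 * while its handling is still queued, starming the CPU. */
#define PIRQ_CSCI 0x1u

struct portal {
	unsigned int irq_mask;	/* currently enabled sources */
	bool work_pending;	/* stand-in for the queued work item */
};

static void irqsource_remove(struct portal *p, unsigned int s) { p->irq_mask &= ~s; }
static void irqsource_add(struct portal *p, unsigned int s)    { p->irq_mask |= s; }

/* __poll_portal_slow() analogue: silence the source, then defer. */
static void irq_fast_path(struct portal *p, unsigned int is)
{
	if (is & PIRQ_CSCI) {
		irqsource_remove(p, PIRQ_CSCI);
		p->work_pending = true;	/* queue_work_on() stand-in */
	}
}

/* qm_congestion_task() analogue: re-arm only once handling is done
 * (including the early-exit error path, as the patch is careful to). */
static void congestion_work(struct portal *p)
{
	p->work_pending = false;
	/* ... process the congestion notification ... */
	irqsource_add(p, PIRQ_CSCI);
}
```

Note the patch re-adds the source on the QUERYCONGESTION timeout path too; forgetting that branch would leave the interrupt permanently masked after an error.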
[PATCH v2] net: cx89x0: move attribute declaration before struct keyword
The attribute declaration is typically before the definition. Move the __maybe_unused attribute declaration before the struct keyword. Signed-off-by: Stefan Agner --- Changes in v2: - Move __maybe_unused after the complete type drivers/net/ethernet/cirrus/cs89x0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/cirrus/cs89x0.c b/drivers/net/ethernet/cirrus/cs89x0.c index 3647b28e8de0..47384f7323ac 100644 --- a/drivers/net/ethernet/cirrus/cs89x0.c +++ b/drivers/net/ethernet/cirrus/cs89x0.c @@ -1896,7 +1896,7 @@ static int cs89x0_platform_remove(struct platform_device *pdev) return 0; } -static const struct __maybe_unused of_device_id cs89x0_match[] = { +static const struct of_device_id __maybe_unused cs89x0_match[] = { { .compatible = "cirrus,cs8900", }, { .compatible = "cirrus,cs8920", }, { }, -- 2.12.2
Re: [PATCH] net: cx89x0: move attribute declaration before struct keyword
From: Stefan Agner Date: Mon, 17 Apr 2017 13:31:28 -0700 > Given that, can you reconsider? Please put the attribute after the complete type. Thanks.
RE: [PATCH] ACPICA: Export mutex functions
> -Original Message- > From: Guenter Roeck [mailto:li...@roeck-us.net] > Sent: Monday, April 17, 2017 12:45 PM > To: Moore, Robert> Cc: Zheng, Lv ; Wysocki, Rafael J > ; 'Len Brown' ; 'linux- > a...@vger.kernel.org' ; 'de...@acpica.org' > ; 'linux-kernel@vger.kernel.org' ker...@vger.kernel.org>; Box, David E > Subject: Re: [PATCH] ACPICA: Export mutex functions > > On Mon, Apr 17, 2017 at 07:27:37PM +, Moore, Robert wrote: > > > > > -Original Message- > > > From: Moore, Robert > > > Sent: Monday, April 17, 2017 10:13 AM > > > To: Guenter Roeck ; Zheng, Lv > > > > > > Cc: Wysocki, Rafael J ; Len Brown > > > ; linux-a...@vger.kernel.org; de...@acpica.org; > > > linux- ker...@vger.kernel.org > > > Subject: RE: [PATCH] ACPICA: Export mutex functions > > > > > > There is a model for the drivers to directly acquire an AML mutex > > > object. That is why the acquire/release public interfaces were added > > > to ACPICA. > > > > > > I forget all of the details, but the model was developed with MS and > > > others during the ACPI 6.0 timeframe. > > > > > > > > [Moore, Robert] > > > > > > Here is the case where the OS may need to directly acquire an AML > mutex: > > > > From the ACPI spec: > > > > 19.6.2 Acquire (Acquire a Mutex) > > > > Note: For Mutex objects referenced by a _DLM object, the host OS may > also contend for ownership. > > > From the context in the dsdt, and from description of expected use cases > for _DLM objects I can find, this is what the mutex is used for (to > serialize access to a resource on a low pin count serial interconnect, > aka LPC). > > What does that mean in practice ? That I am not supposed to use it > because it doesn't follow standard ACPI mutex declaration rules ? > > Thanks, > Guenter > > > [Moore, Robert] I'm not an expert on the _DLM method, but I would point you to the description section in the ACPI spec, 5.7.5 _DLM (DeviceLock Mutex). 
> > > > > > Other than this case, the OS/drivers should never need to directly > acquire an AML mutex. > > Bob > >
Re: [PATCH] um: Include kbuild.h instead of duplicating its macros
On Mon, Apr 03, 2017 at 12:54:58PM -0700, Matthias Kaehlcke wrote: > Signed-off-by: Matthias Kaehlcke > --- > arch/x86/um/shared/sysdep/kernel-offsets.h | 9 + > 1 file changed, 1 insertion(+), 8 deletions(-) > > diff --git a/arch/x86/um/shared/sysdep/kernel-offsets.h > b/arch/x86/um/shared/sysdep/kernel-offsets.h > index 46a9df99f3c5..7e1d35b6ad5c 100644 > --- a/arch/x86/um/shared/sysdep/kernel-offsets.h > +++ b/arch/x86/um/shared/sysdep/kernel-offsets.h > @@ -2,16 +2,9 @@ > #include > #include > #include > +#include <linux/kbuild.h> > #include > > -#define DEFINE(sym, val) \ > - asm volatile("\n->" #sym " %0 " #val : : "i" (val)) > - > -#define BLANK() asm volatile("\n->" : : ) > - > -#define OFFSET(sym, str, mem) \ > - DEFINE(sym, offsetof(struct str, mem)); > - > void foo(void) > { > #include > -- Ping, any comment on this patch? Cheers Matthias
[PATCH] watchdog: f71808e_wdt: Add F71868 support
This adds support for watchdog part of Fintek F71868 Super I/O chip to f71808e_wdt driver. The F71868 chip is, in general, very similar to a F71869, however it has slightly different set of available reset pulse widths. Tested on MSI A55M-P33 motherboard. Signed-off-by: Maciej S. Szmigiero --- drivers/watchdog/Kconfig | 7 --- drivers/watchdog/f71808e_wdt.c | 27 --- 2 files changed, 24 insertions(+), 10 deletions(-) diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index 8b9049dac094..1f23079537ad 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -829,11 +829,12 @@ config EBC_C384_WDT the timeout module parameter. config F71808E_WDT - tristate "Fintek F71808E, F71862FG, F71869, F71882FG and F71889FG Watchdog" + tristate "Fintek F718xx, F818xx Super I/O Watchdog" depends on X86 help - This is the driver for the hardware watchdog on the Fintek - F71808E, F71862FG, F71869, F71882FG and F71889FG Super I/O controllers. + This is the driver for the hardware watchdog on the Fintek F71808E, + F71862FG, F71868, F71869, F71882FG, F71889FG, F81865 and F81866 + Super I/O controllers. You can compile this driver directly into the kernel, or use it as a module. The module will be called f71808e_wdt. diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c index 1b7e9169072f..8658dba21768 100644 --- a/drivers/watchdog/f71808e_wdt.c +++ b/drivers/watchdog/f71808e_wdt.c @@ -57,6 +57,7 @@ #define SIO_F71808_ID 0x0901 /* Chipset ID */ #define SIO_F71858_ID 0x0507 /* Chipset ID */ #define SIO_F71862_ID 0x0601 /* Chipset ID */ +#define SIO_F71868_ID 0x1106 /* Chipset ID */ #define SIO_F71869_ID 0x0814 /* Chipset ID */ #define SIO_F71869A_ID 0x1007 /* Chipset ID */ #define SIO_F71882_ID 0x0541 /* Chipset ID */ @@ -101,7 +102,7 @@ MODULE_PARM_DESC(timeout, static unsigned int pulse_width = WATCHDOG_PULSE_WIDTH; module_param(pulse_width, uint, 0); MODULE_PARM_DESC(pulse_width, - "Watchdog signal pulse width. 
0(=level), 1 ms, 25 ms, 125 ms or 5000 ms" + "Watchdog signal pulse width. 0(=level), 1, 25, 30, 125, 150, 5000 or 6000 ms" " (default=" __MODULE_STRING(WATCHDOG_PULSE_WIDTH) ")"); static unsigned int f71862fg_pin = WATCHDOG_F71862FG_PIN; @@ -119,13 +120,14 @@ module_param(start_withtimeout, uint, 0); MODULE_PARM_DESC(start_withtimeout, "Start watchdog timer on module load with" " given initial timeout. Zero (default) disables this feature."); -enum chips { f71808fg, f71858fg, f71862fg, f71869, f71882fg, f71889fg, f81865, -f81866}; +enum chips { f71808fg, f71858fg, f71862fg, f71868, f71869, f71882fg, f71889fg, +f81865, f81866}; static const char *f71808e_names[] = { "f71808fg", "f71858fg", "f71862fg", + "f71868", "f71869", "f71882fg", "f71889fg", @@ -252,16 +254,23 @@ static int watchdog_set_timeout(int timeout) static int watchdog_set_pulse_width(unsigned int pw) { int err = 0; + unsigned int t1 = 25, t2 = 125, t3 = 5000; + + if (watchdog.type == f71868) { + t1 = 30; + t2 = 150; + t3 = 6000; + } mutex_lock(); - if(pw <=1) { + if(pw <= 1) { watchdog.pulse_val = 0; - } else if (pw <= 25) { + } else if (pw <= t1) { watchdog.pulse_val = 1; - } else if (pw <= 125) { + } else if (pw <= t2) { watchdog.pulse_val = 2; - } else if (pw <= 5000) { + } else if (pw <= t3) { watchdog.pulse_val = 3; } else { pr_err("pulse width out of range\n"); @@ -354,6 +363,7 @@ static int watchdog_start(void) goto exit_superio; break; + case f71868: case f71869: /* GPIO14 --> WDTRST# */ superio_clear_bit(watchdog.sioaddr, SIO_REG_MFUNCT1, 4); @@ -792,6 +802,9 @@ static int __init f71808e_find(int sioaddr) watchdog.type = f71862fg; err = f71862fg_pin_configure(0); /* validate module parameter */ break; + case SIO_F71868_ID: + watchdog.type = f71868; + break; case SIO_F71869_ID: case SIO_F71869A_ID: watchdog.type = f71869;
Re: [PATCH] frv: Use OFFSET macro in DEF_*REG()
On Mon, Apr 03, 2017 at 12:46:36PM -0700, Matthias Kaehlcke wrote:
> Avoid code duplication by using OFFSET() in DEF_*REG() instead of
> replicating the macro.
>
> Signed-off-by: Matthias Kaehlcke

Ping, any feedback on this patch?

Thanks

Matthias
Re: Microphone gain on N900
Hi!

> v4.10 works quite nicely on N900, but I still have problems with
> audio. Even GSM calls would be usable, if I had reasonable volume on
> microphone and speaker... but I don't.
>
> Both speaker and microphone are too quiet. I can get louder output by
> plugging headset -- good. I can hear the other side nicely, but am
> too quiet for them. Unfortunately, even if I plug in wired headset,
> internal microphone is still used.
>
> >> For me these settings provide reasonable playback and capture volumes
> >> using v4.10.1:
> >>
> >> amixer -D hw:0 set PCM 100%
> >> amixer -D hw:0 set 'Line DAC' 80% on
> >> amixer -D hw:0 set 'Line' on
> >> amixer -D hw:0 set 'Left Line Mixer DACL1' on
> >> amixer -D hw:0 set 'Right Line Mixer DACR1' on
> >> amixer -D hw:0 set 'Jack Function' 'Headphone'
> >> amixer -D hw:0 set 'TPA6130A2 Headphone' 80%
> >>
> >> amixer -D hw:0 set 'Jack Function' 'Headset'
> >> amixer -D hw:0 set 'AGC' off
> >> amixer -D hw:0 set 'Input Select' 'ADC'
> >> amixer -D hw:0 set 'Left PGA Mixer Line1L' on
> >> amixer -D hw:0 set 'Left Line1L Mux' 'differential'
> >> amixer -D hw:0 set 'Right PGA Mixer Line1L' on
> >> amixer -D hw:0 set 'Right Line1L Mux' 'differential'
> >> amixer -D hw:0 set 'ADC HPF Cut-off' '0.0045xFs' 'disabled'
> >> amixer -D hw:0 set PGA 50% on
> >> amixer -D hw:0 cset iface=MIXER,name='PGA Capture Switch',index=0 on
>
> > This seems to correspond to volumes I have had before. They are enough
> > for phone call in quiet environment, but not really for normal usage.
> >
> > Do you have settings where microphone on wired headset works?
>
> Those settings above. They are set after putting all volumes to 0 and
> muted. I'm using the Nokia WH-205 headset that came with the phone.

I just wanted to say thank you. I finally got to testing those settings again, and this time the headset microphone definitely works. Loopback using arecord | aplay now produces reasonable volume of audio.
(Unfortunately, I still have some problems somewhere in the audio<->GSM path :-( ).

Best regards,

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH] net: cx89x0: move attribute declaration before struct keyword
On 2017-04-17 13:09, David Miller wrote:
> From: Stefan Agner
> Date: Sun, 16 Apr 2017 23:20:32 -0700
>
>> The attribute declaration is typically before the definition. Move
>> the __maybe_unused attribute declaration before the struct keyword.
>>
>> Signed-off-by: Stefan Agner

I did catch that while compiling with clang, and the exact error message is:

drivers/net/ethernet/cirrus/cs89x0.c:1899:21: warning: attribute declaration must precede definition [-Wignored-attributes]
static const struct __maybe_unused of_device_id cs89x0_match[] = {

> Well, I see it just as often after the variable name too:
>
> net/irda/iriap.c:static const char *const ias_charset_types[] __maybe_unused = {
> net/irda/irlap.c:static const char *const lap_reasons[] __maybe_unused = {
> net/irda/irlap_event.c:static const char *const irlap_event[] __maybe_unused = {
> net/irda/irlmp_event.c:static const char *const irlmp_event[] __maybe_unused = {

That seems not to fire when compiling with clang. I guess because the attribute is after the _complete_ type?

> Or after the struct:
>
> drivers/net/phy/ste10Xp.c:static struct mdio_device_id __maybe_unused ste10Xp_tbl[] = {
> drivers/net/phy/teranetics.c:static struct mdio_device_id __maybe_unused teranetics_tbl[] = {
> drivers/net/phy/vitesse.c:static struct mdio_device_id __maybe_unused vitesse_tbl[] = {

Same here...

> So unless we decide tree wide to do it in one order or another, such changes
> are largely a waste of time.

Afaik, "struct of_device_id" as a whole is a type. This case is really odd since it puts the attribute in the middle of a type. It is the only instance I came across (not everything compiles fine with clang yet, so there might be more... but a quick grep did not turn up more of the same cases)...

> Sorry I'm not applying this patch.

Given that, can you reconsider?

--
Stefan
[GIT PULL] parisc architecture fix for v4.11
Hi Linus,

please pull one important fix for the parisc architecture for kernel 4.11 from:

  git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git parisc-4.11-5

One patch which fixes get_user() for 64-bit values on 32-bit kernels. Up to now we lost the upper 32 bits of the returned 64-bit value.

Thanks,
Helge

Helge Deller (1):
      parisc: Fix get_user() for 64-bit value on 32-bit kernel

 arch/parisc/include/asm/uaccess.h | 86 +--
 1 file changed, 55 insertions(+), 31 deletions(-)
Re: [PATCH 4/9] mm, memory_hotplug: get rid of is_zone_device_section
On Mon, Apr 10, 2017 at 01:03:46PM +0200, Michal Hocko wrote:
> From: Michal Hocko
>
> device memory hotplug hooks into regular memory hotplug only half way.
> It needs memory sections to track struct pages but there is no
> need/desire to associate those sections with memory blocks and export
> them to the userspace via sysfs because they cannot be onlined anyway.
>
> This is currently expressed by for_device argument to arch_add_memory
> which then makes sure to associate the given memory range with
> ZONE_DEVICE. register_new_memory then relies on is_zone_device_section
> to distinguish special memory hotplug from the regular one. While this
> works now, later patches in this series want to move __add_zone outside
> of arch_add_memory path so we have to come up with something else.
>
> Add want_memblock down the __add_pages path and use it to control
> whether the section->memblock association should be done. arch_add_memory
> then just trivially want memblock for everything but for_device hotplug.
>
> remove_memory_section doesn't need is_zone_device_section either. We can
> simply skip all the memblock specific cleanup if there is no memblock
> for the given section.
>
> This shouldn't introduce any functional change.
> > Cc: Dan Williams > Signed-off-by: Michal Hocko > --- > arch/ia64/mm/init.c| 2 +- > arch/powerpc/mm/mem.c | 2 +- > arch/s390/mm/init.c| 2 +- > arch/sh/mm/init.c | 2 +- > arch/x86/mm/init_32.c | 2 +- > arch/x86/mm/init_64.c | 2 +- > drivers/base/memory.c | 22 -- > include/linux/memory_hotplug.h | 2 +- > mm/memory_hotplug.c| 11 +++ > 9 files changed, 22 insertions(+), 25 deletions(-) > > diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c > index 06cdaef54b2e..62085fd902e6 100644 > --- a/arch/ia64/mm/init.c > +++ b/arch/ia64/mm/init.c > @@ -657,7 +657,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool > for_device) > > zone = pgdat->node_zones + > zone_for_memory(nid, start, size, ZONE_NORMAL, for_device); > - ret = __add_pages(nid, zone, start_pfn, nr_pages); > + ret = __add_pages(nid, zone, start_pfn, nr_pages, !for_device); > > if (ret) > printk("%s: Problem encountered in __add_pages() as ret=%d\n", > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c > index 5f844337de21..ea3e09a62f38 100644 > --- a/arch/powerpc/mm/mem.c > +++ b/arch/powerpc/mm/mem.c > @@ -149,7 +149,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool > for_device) > zone = pgdata->node_zones + > zone_for_memory(nid, start, size, 0, for_device); > > - return __add_pages(nid, zone, start_pfn, nr_pages); > + return __add_pages(nid, zone, start_pfn, nr_pages, !for_device); > } > > #ifdef CONFIG_MEMORY_HOTREMOVE > diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c > index bf5b8a0c4ff7..5c84346e5211 100644 > --- a/arch/s390/mm/init.c > +++ b/arch/s390/mm/init.c > @@ -182,7 +182,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool > for_device) > continue; > nr_pages = (start_pfn + size_pages > zone_end_pfn) ? 
> zone_end_pfn - start_pfn : size_pages; > - rc = __add_pages(nid, zone, start_pfn, nr_pages); > + rc = __add_pages(nid, zone, start_pfn, nr_pages, !for_device); > if (rc) > break; > start_pfn += nr_pages; > diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c > index 75491862d900..a9d57f75ae8c 100644 > --- a/arch/sh/mm/init.c > +++ b/arch/sh/mm/init.c > @@ -498,7 +498,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool > for_device) > ret = __add_pages(nid, pgdat->node_zones + > zone_for_memory(nid, start, size, ZONE_NORMAL, > for_device), > - start_pfn, nr_pages); > + start_pfn, nr_pages, !for_device); > if (unlikely(ret)) > printk("%s: Failed, __add_pages() == %d\n", __func__, ret); > > diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c > index c68078fd06fd..4b0f05328af0 100644 > --- a/arch/x86/mm/init_32.c > +++ b/arch/x86/mm/init_32.c > @@ -834,7 +834,7 @@ int arch_add_memory(int nid, u64 start, u64 size, bool > for_device) > unsigned long start_pfn = start >> PAGE_SHIFT; > unsigned long nr_pages = size >> PAGE_SHIFT; > > - return __add_pages(nid, zone, start_pfn, nr_pages); > + return __add_pages(nid, zone, start_pfn, nr_pages, !for_device); > } > > #ifdef CONFIG_MEMORY_HOTREMOVE > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > index 7eef17239378..39cfaee93975 100644 > --- a/arch/x86/mm/init_64.c > +++ b/arch/x86/mm/init_64.c > @@ -652,7 +652,7 @@ int arch_add_memory(int nid, u64 start,
Re: [PATCH] net: cx89x0: move attribute declaration before struct keyword
From: Stefan Agner
Date: Sun, 16 Apr 2017 23:20:32 -0700

> The attribute declaration is typically before the definition. Move
> the __maybe_unused attribute declaration before the struct keyword.
>
> Signed-off-by: Stefan Agner

Well, I see it just as often after the variable name too:

net/irda/iriap.c:static const char *const ias_charset_types[] __maybe_unused = {
net/irda/irlap.c:static const char *const lap_reasons[] __maybe_unused = {
net/irda/irlap_event.c:static const char *const irlap_event[] __maybe_unused = {
net/irda/irlmp_event.c:static const char *const irlmp_event[] __maybe_unused = {

Or after the struct:

drivers/net/phy/ste10Xp.c:static struct mdio_device_id __maybe_unused ste10Xp_tbl[] = {
drivers/net/phy/teranetics.c:static struct mdio_device_id __maybe_unused teranetics_tbl[] = {
drivers/net/phy/vitesse.c:static struct mdio_device_id __maybe_unused vitesse_tbl[] = {

So unless we decide tree wide to do it in one order or another, such changes are largely a waste of time.

Sorry, I'm not applying this patch.
Re: [PATCH 3.18 010/145] sctp: avoid BUG_ON on sctp_wait_for_sndbuf
On 16-04-2017 07:48, Greg Kroah-Hartman wrote:
> 3.18-stable review patch. If anyone has any objections, please let me know.
>
> --
>
> From: Marcelo Ricardo Leitner
>
> [ Upstream commit 2dcab598484185dea7ec22219c76dcdd59e3cb90 ]

Hi Greg. Are you also including dfcb9f4f99f1 ("sctp: deny peeloff operation on asocs with threads sleeping on it")? It's a follow-up fix for this one, would be good to have it too.

> Alexander Popov reported that an application may trigger a BUG_ON in
> sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
> waiting on it to queue more data and meanwhile another thread peels off
> the association being used by the first thread.
>
> This patch replaces the BUG_ON call with a proper error handling. It
> will return -EPIPE to the original sendmsg call, similarly to what would
> have been done if the association wasn't found in the first place.
>
> Acked-by: Alexander Popov
> Signed-off-by: Marcelo Ricardo Leitner
> Reviewed-by: Xin Long
> Signed-off-by: David S. Miller
> Signed-off-by: Greg Kroah-Hartman
> Signed-off-by: Greg Kroah-Hartman
> ---
>  net/sctp/socket.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -6962,7 +6962,8 @@ static int sctp_wait_for_sndbuf(struct s
>  		 */
>  		release_sock(sk);
>  		current_timeo = schedule_timeout(current_timeo);
> -		BUG_ON(sk != asoc->base.sk);
> +		if (sk != asoc->base.sk)
> +			goto do_error;
>  		lock_sock(sk);
>
>  		*timeo_p = current_timeo;