Re: [PATCH v2 2/5] x86/PCI: Support additional MMIO range capabilities
On 29 Apr 2014, at 3:20, Borislav Petkov b...@suse.de wrote:

> On Tue, Apr 29, 2014 at 09:33:09AM +0200, Andreas Herrmann wrote:
>> I am sure, it's because some server systems had MMIO ECS access not
>> enabled in BIOS. I can't remember which systems were affected.
>
> Ok, now AMD people: what's the story with IO ECS, can we assume that on
> everything after F10h, BIOS has a sensible MCFG and we can limit this to
> F10h only? I like Bjorn's idea but we need to make sure a working MCFG
> is ubiquitous.
>
> Which begs the real question: Suravee, why are you even touching IO ECS
> provided F15h and later have a MCFG? Or, do they?

Our experience with this is that Fam10h and later have a very well working MCFG setup, earlier generations not so much (hence IO ECS was needed).

Cheers,
Steffen

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] toshiba_acpi: Add alternative keymap support for Satellite M840
At Tue, 29 Apr 2014 16:05:54 +0200, Jose Ignacio Naranjo wrote:
> Hi,
>
> I sent a similar solution some months ago for another Toshiba's model
> http://permalink.gmane.org/gmane.linux.drivers.platform.x86.devel/5198
>
> I answered to the thread, but I just noticed it didn't make it to the
> list. Don't know why, maybe because of having attached some files :(
>
> My answer was basically if we could use the oem_table_id instead of
> DMI, but I don't know others dsdt from Toshiba.

Yeah, that sounds feasible. I'm also not particularly in favor of DMI, either. It was used just because it's the easiest way.

I cannot answer to both of Matthew's questions in the thread above, as I'm no owner of the machine but merely a patch monkey who tried to solve a bug on openSUSE. Feel free to join to the bugzilla thread mentioned in the patch for more detailed information.

thanks,

Takashi

> Regards,
> JI
>
> On Tue, Apr 29, 2014 at 3:15 PM, Takashi Iwai ti...@suse.de wrote:
>> Toshiba Satellite M840 laptop has a complete different keymap although
>> it's bound with the same ACPI ID TOS1900. This patch provides an
>> alternative keymap specific to this machine by identifying via DMI
>> matching. The keymap table doesn't fill all entries that were used
>> before since some keys aren't found on this machine at all.
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=69761
>> Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=812209
>> Reported-and-tested-by: Federico Vecchiarelli fe...@gmx.net
>> Signed-off-by: Takashi Iwai ti...@suse.de
>> ---
>>  drivers/platform/x86/toshiba_acpi.c | 30 +-
>>  1 file changed, 29 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
>> index 46473ca7566b..76441dcbe5ff 100644
>> --- a/drivers/platform/x86/toshiba_acpi.c
>> +++ b/drivers/platform/x86/toshiba_acpi.c
>> @@ -56,6 +56,7 @@
>>  #include <linux/workqueue.h>
>>  #include <linux/i8042.h>
>>  #include <linux/acpi.h>
>> +#include <linux/dmi.h>
>>  #include <asm/uaccess.h>
>>
>>  MODULE_AUTHOR("John Belmonte");
>> @@ -213,6 +214,30 @@ static const struct key_entry toshiba_acpi_keymap[] = {
>>  	{ KE_END, 0 },
>>  };
>>
>> +/* alternative keymap */
>> +static const struct dmi_system_id toshiba_alt_keymap_dmi[] = {
>> +	{
>> +		.matches = {
>> +			DMI_MATCH(DMI_SYS_VENDOR, "TOSHIBA"),
>> +			DMI_MATCH(DMI_PRODUCT_NAME, "Satellite M840"),
>> +		},
>> +	},
>> +	{}
>> +};
>> +
>> +static const struct key_entry toshiba_acpi_alt_keymap[] = {
>> +	{ KE_KEY, 0x157, { KEY_MUTE } },
>> +	{ KE_KEY, 0x102, { KEY_ZOOMOUT } },
>> +	{ KE_KEY, 0x103, { KEY_ZOOMIN } },
>> +	{ KE_KEY, 0x139, { KEY_ZOOMRESET } },
>> +	{ KE_KEY, 0x13e, { KEY_SWITCHVIDEOMODE } },
>> +	{ KE_KEY, 0x13c, { KEY_BRIGHTNESSDOWN } },
>> +	{ KE_KEY, 0x13d, { KEY_BRIGHTNESSUP } },
>> +	{ KE_KEY, 0x158, { KEY_WLAN } },
>> +	{ KE_KEY, 0x13f, { KEY_TOUCHPAD_TOGGLE } },
>> +	{ KE_END, 0 },
>> +};
>> +
>>  /* utility */
>>
>> @@ -1440,6 +1465,7 @@ static int toshiba_acpi_setup_keyboard(struct toshiba_acpi_dev *dev)
>>  	acpi_handle ec_handle;
>>  	int error;
>>  	u32 hci_result;
>> +	const struct key_entry *keymap = toshiba_acpi_keymap;
>>
>>  	dev->hotkey_dev = input_allocate_device();
>>  	if (!dev->hotkey_dev)
>> @@ -1449,7 +1475,9 @@ static int toshiba_acpi_setup_keyboard(struct toshiba_acpi_dev *dev)
>>  	dev->hotkey_dev->phys = "toshiba_acpi/input0";
>>  	dev->hotkey_dev->id.bustype = BUS_HOST;
>>
>> -	error = sparse_keymap_setup(dev->hotkey_dev, toshiba_acpi_keymap, NULL);
>> +	if (dmi_check_system(toshiba_alt_keymap_dmi))
>> +		keymap = toshiba_acpi_alt_keymap;
>> +	error = sparse_keymap_setup(dev->hotkey_dev, keymap, NULL);
>>  	if (error)
>>  		goto err_free_dev;
>>
>> --
>> 1.9.2
Re: [PATCH RT 2/4] net: gianfar: do not try to cleanup TX packets if they are not done
On 14-04-28 02:37 AM, Sebastian Andrzej Siewior wrote:
> On 04/27/2014 04:31 PM, Steven Rostedt wrote:
>> diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
>> index 5c0efcc..8aecc1d 100644
>> --- a/drivers/net/ethernet/freescale/gianfar.c
>> +++ b/drivers/net/ethernet/freescale/gianfar.c
>> @@ -2856,10 +2855,14 @@ static int gfar_poll(struct napi_struct *napi, int budget)
>>  		tx_queue = priv->tx_queue[i];
>>  		/* run Tx cleanup to completion */
>>  		if (tx_queue->tx_skbuff[tx_queue->skb_dirtytx]) {
>> -			gfar_clean_tx_ring(tx_queue);
>> -			has_tx_work = 1;
>> +			int ret;
>> +
>> +			ret = gfar_clean_tx_ring(tx_queue);
>> +			if (ret)
>> +				has_tx_work++;
>>  		}
>>  	}
>> +	work_done += has_tx_work;
>>
>>  	for_each_set_bit(i, &gfargrp->rx_bit_map, priv->num_rx_queues) {
>>  		/* skip queue if not active */
>
> The 3.14-RT version of the patch should have an additional return
> statement here which I forgot initially.

Sanity boot tested the 3.10 rc1 on a sbc8548 (UP PPC with gianfar), with the one-liner added as follows:

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 8aecc1d81395..b87a8c919c3e 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2574,6 +2574,7 @@ static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 	tx_queue->dirty_tx = bdp;

 	netdev_tx_completed_queue(txq, howmany, bytes_sent);
+	return howmany;
 }

 static void gfar_schedule_cleanup(struct gfar_priv_grp *gfargrp)

Paul.
--

> Sebastian
Re: PROBLEM: Pulseaudio hung at schedule in 3.15-rc1
At Sun, 20 Apr 2014 14:50:12 -0400, Bryan Quigley wrote:
>> Does the patch below work?
>
> Nope, that didn't help.

Now I'm back and have started looking at this again. Could you give the raw kernel messages with the stack trace? The previously cited message is hard to read.

> I did determine it is caused by having my Logitech webcam plugged in.
> Here is lsusb from a working system:
> Bus 002 Device 002: ID 046d:0825 Logitech, Inc. Webcam C270

OK, there is a quirk for this device but it shouldn't affect the suspend/resume behavior, at least.

thanks,

Takashi
Re: sched_{set,get}attr() manpage
On Tue, Apr 29, 2014 at 03:08:55PM +0200, Michael Kerrisk (man-pages) wrote:
> Hi Peter,
>
> On 04/28/2014 10:18 AM, Peter Zijlstra wrote:
>> Hi Michael,
>>
>> find below an updated manpage, I did not apply the comments on parts
>> that are identical to SCHED_SETSCHEDULER(2) in order to keep these
>> texts in alignment. I feel that if we change one we should also change
>> the other, and such a 'patch' is best done separate from the new
>> manpage itself.
>>
>> I did add the missing EBUSY error, and amended the text where it said
>> we'd return EINVAL in that case.
>>
>> I added a paragraph stating that SCHED_DEADLINE preempted anything else
>> userspace can do (with the explicit mention of userspace to leave me
>> wriggle room for the kernel's stop task :-).
>>
>> I also did a short paragraph on the deadline sched_yield(). For further
>> deadline yield details we should maybe add to the SCHED_YIELD(2)
>> manpage.
>>
>> Re juri/claudio; no I think sched_yield() as implemented for deadline
>> makes sense, no other yield semantics other than NOP makes sense for
>> it, and since we have the syscall already might as well make it do
>> something useful.
>
> Thanks for the updated page. Would you be willing to revise as per the
> comments below.

Ok.

>> NAME
>>        sched_setattr, sched_getattr - set and get scheduling
>>        policy/attributes
>>
>> SYNOPSIS
>>        #include <sched.h>
>>
>>        struct sched_attr {
>>                u32 size;
>>                u32 sched_policy;
>>                u64 sched_flags;
>>
>>                /* SCHED_NORMAL, SCHED_BATCH */
>>                s32 sched_nice;
>>                /* SCHED_FIFO, SCHED_RR */
>>                u32 sched_priority;
>>                /* SCHED_DEADLINE */
>>                u64 sched_runtime;
>>                u64 sched_deadline;
>>                u64 sched_period;
>>        };
>>
>>        int sched_setattr(pid_t pid, const struct sched_attr *attr,
>>                          unsigned int flags);
>>
>>        int sched_getattr(pid_t pid, const struct sched_attr *attr,
>>                          unsigned int size, unsigned int flags);
>>
>> DESCRIPTION
>>        sched_setattr() sets both the scheduling policy and the
>>        associated attributes for the process whose ID is specified in
>>        pid.
>
> Around about here, I think there needs to be a sentence explaining that
> sched_setattr() provides a superset of the functionality of
> sched_setscheduler(2) and setpriority(2). I mean, it can do all that
> those two calls can do, right?

Almost; setpriority() has the .which argument which we don't have. So while that syscall can change the nice value for an entire process group or user, sched_setattr() can only change the nice value for 1 task.

But yes, I can mention something along those lines.

>>        If pid equals zero, the scheduling policy and attributes of the
>>        calling process will be set. The interpretation of the argument
>>        attr depends on the selected policy. Currently, Linux supports
>>        the following normal (i.e., non-real-time) scheduling policies:
>>
>>        SCHED_OTHER   the standard fair time-sharing policy;
>>
>>        SCHED_BATCH   for batch style execution of processes; and
>>
>>        SCHED_IDLE    for running very low priority background jobs.
>>
>>        The following real-time policies are also supported, for special
>>        time-critical applications that need precise control over the
>>        way in which runnable processes are selected for execution:
>>
>>        SCHED_FIFO    a first-in, first-out policy;
>>
>>        SCHED_RR      a round-robin policy; and
>>
>>        SCHED_DEADLINE a deadline policy.
>>
>>        The semantics of each of these policies are detailed below.
>
> The semantics of each of these policies are detailed in sched(7).

I don't appear to have SCHED(7), how new is that?

> [See my comments below]

>>        sched_attr::size must be set to the size of the structure, as in
>>        sizeof(struct sched_attr), if the provided structure is smaller
>>        than the kernel structure, any additional fields are assumed
>>        '0'. If the provided structure is larger than the kernel
>>        structure, the kernel verifies all additional fields are '0' if
>>        not the syscall will fail with -E2BIG.
>>
>>        sched_attr::sched_policy the desired scheduling policy.
>>
>>        sched_attr::sched_flags additional flags that can influence
>>        scheduling behaviour. Currently as per Linux kernel 3.14:
>>
>>               SCHED_FLAG_RESET_ON_FORK - resets the scheduling policy
>>               to: (struct sched_attr){ .sched_policy = SCHED_OTHER, }
>>               on fork().
>>
>>        is the only supported flag.
>>
>>        sched_attr::sched_nice should only be set for SCHED_OTHER,
>>        SCHED_BATCH, the desired nice value [-20,19], see NICE(2).
>>
>>        sched_attr::sched_priority should only be set for SCHED_FIFO,
>>        SCHED_RR, the desired static priority [1,99].
>>
>>        sched_attr::sched_runtime
>>        sched_attr::sched_deadline
>>        sched_attr::sched_period should only be set for SCHED_DEADLINE
>>        and are the traditional sporadic task model parameters.
>
> Could you add (a lot ;-))
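The sched_attr::size rule quoted above (a smaller structure is zero-extended, a larger one must have an all-zero tail or the call fails with -E2BIG) can be sketched in userspace C. This is only an illustration of the rule as the draft page states it, not the kernel's code: `struct demo_attr`, `demo_copy_attr()` and the plain byte-buffer interface are hypothetical names made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the layout the "kernel" side knows about. */
struct demo_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

/*
 * Copy a caller-supplied buffer of usize bytes into *out.
 * - Caller structure smaller than ours: the missing fields read as 0.
 * - Caller structure larger than ours: every extra byte must be 0,
 *   otherwise the copy is refused with -E2BIG.
 */
static int demo_copy_attr(struct demo_attr *out, const void *ubuf, size_t usize)
{
	const unsigned char *p = ubuf;
	size_t ksize = sizeof(*out);
	size_t i;

	memset(out, 0, ksize);

	if (usize > ksize) {
		for (i = ksize; i < usize; i++)
			if (p[i] != 0)
				return -E2BIG;
		usize = ksize;
	}

	memcpy(out, p, usize);
	return 0;
}
```

The point of the check on the tail is forward compatibility: a newer userspace may pass a longer structure to an older kernel as long as it does not actually use the fields that kernel cannot interpret.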
Re: [PATCH 2/5] netfilter: Fix format string mismatch in mangle_content_len()
On Tue, Apr 01, 2014 at 12:43:36AM +0900, Masanari Iida wrote:
> Fix format string mismatch in mangle_content_len()

All these patches seem like pointless noise to me. In none of these cases can the value legitimately be negative. If anything, you should fix the types to be unsigned.

> Signed-off-by: Masanari Iida standby2...@gmail.com
> ---
>  net/netfilter/nf_nat_sip.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_nat_sip.c b/net/netfilter/nf_nat_sip.c
> index b4d691d..5f98845 100644
> --- a/net/netfilter/nf_nat_sip.c
> +++ b/net/netfilter/nf_nat_sip.c
> @@ -434,7 +434,7 @@ static int mangle_content_len(struct sk_buff *skb, unsigned int protoff,
>  			      &matchoff, &matchlen) <= 0)
>  		return 0;
>
> -	buflen = sprintf(buffer, "%u", c_len);
> +	buflen = sprintf(buffer, "%d", c_len);
>  	return mangle_packet(skb, protoff, dataoff, dptr, datalen,
>  			     matchoff, matchlen, buffer, buflen);
>  }
> --
> 1.9.1.352.gd393d14
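The alternative suggested in the reply (make the type unsigned so that it agrees with %u, rather than changing the specifier) could look roughly like the following userspace sketch. `demo_format_len()` is a hypothetical helper written for this illustration; it is not the code in nf_nat_sip.c.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a length that can never be negative. The parameter type and the
 * conversion specifier agree (unsigned int with %u), so the compiler's
 * format checking has nothing to complain about.
 */
static int demo_format_len(char *buf, size_t buflen, unsigned int c_len)
{
	return snprintf(buf, buflen, "%u", c_len);
}
```

snprintf() is used here instead of sprintf() only so that the sketch cannot overrun the caller's buffer.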
[PATCH v2] mba6x_bl: Backlight driver for mid 2013 MacBook Air
This driver takes control over the LP8550 backlight driver chip found in the mid 2013 and newer MacBook Air (6,1 and 6,2). The i915 GPU driver cannot properly restore the backlight after resume, but with this driver we can hijack the LP8550 and get fully functional backlight support.

v2:
- Dropped if ACPI in Kconfig since we already depend on it
- Added comment about brightness mapping
- Removed lp8550_init() from set_brightness()
- Always write to dev_ctl when setting brightness
- Change %Ld to standard C %lld
- Constify the backlight_ops struct

Signed-off-by: Patrik Jakobsson patrik.r.jakobs...@gmail.com
---
 MAINTAINERS                     | 6 +
 drivers/platform/x86/Kconfig    | 13 ++
 drivers/platform/x86/Makefile   | 1 +
 drivers/platform/x86/mba6x_bl.c | 353
 4 files changed, 373 insertions(+)
 create mode 100644 drivers/platform/x86/mba6x_bl.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e67ea24..cad3e82 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5576,6 +5576,12 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git
 S:	Maintained
 F:	net/mac80211/rc80211_pid*

+MACBOOK AIR 6,X BACKLIGHT DRIVER
+M:	Patrik Jakobsson patrik.r.jakobs...@gmail.com
+L:	platform-driver-...@vger.kernel.org
+S:	Maintained
+F:	drivers/platform/x86/mba6x_bl.c
+
 MACVLAN DRIVER
 M:	Patrick McHardy ka...@trash.net
 L:	net...@vger.kernel.org
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 27df2c5..10ac918 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -795,6 +795,19 @@ config APPLE_GMUX
 	  graphics as well as the backlight. Currently only backlight
 	  control is supported by the driver.

+config MBA6X_BL
+	tristate "MacBook Air 6,x backlight driver"
+	depends on ACPI
+	depends on BACKLIGHT_CLASS_DEVICE
+	select ACPI_VIDEO
+	help
+	  This driver takes control over the LP8550 backlight driver found in
+	  some MacBook Air models. Say Y here if you have a MacBook Air from mid
+	  2013 or newer.
+
+	  To compile this driver as a module, choose M here: the module will
+	  be called mba6x_bl.
+
 config INTEL_RST
 	tristate "Intel Rapid Start Technology Driver"
 	depends on ACPI
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 1a2eafc..9a182fe 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -56,3 +56,4 @@ obj-$(CONFIG_INTEL_SMARTCONNECT)	+= intel-smartconnect.o

 obj-$(CONFIG_PVPANIC)		+= pvpanic.o
 obj-$(CONFIG_ALIENWARE_WMI)	+= alienware-wmi.o
+obj-$(CONFIG_MBA6X_BL)		+= mba6x_bl.o
diff --git a/drivers/platform/x86/mba6x_bl.c b/drivers/platform/x86/mba6x_bl.c
new file mode 100644
index 000..c549667
--- /dev/null
+++ b/drivers/platform/x86/mba6x_bl.c
@@ -0,0 +1,353 @@
+/*
+ * MacBook Air 6,1 and 6,2 (mid 2013) backlight driver
+ *
+ * Copyright (C) 2014 Patrik Jakobsson (patrik.r.jakobs...@gmail.com)
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/backlight.h>
+#include <linux/acpi.h>
+#include <acpi/acpi.h>
+#include <acpi/video.h>
+
+#define LP8550_SMBUS_ADDR	(0x58 << 1)
+#define LP8550_REG_BRIGHTNESS	0
+#define LP8550_REG_DEV_CTL	1
+#define LP8550_REG_FAULT	2
+#define LP8550_REG_IDENT	3
+#define LP8550_REG_DIRECT_CTL	4
+#define LP8550_REG_TEMP_MSB	5 /* Must be read before TEMP_LSB */
+#define LP8550_REG_TEMP_LSB	6
+
+#define INIT_BRIGHTNESS		150
+
+static struct {
+	u8 brightness;	/* Brightness control */
+	u8 dev_ctl;	/* Device control */
+	u8 fault;	/* Fault indication */
+	u8 ident;	/* Identification */
+	u8 direct_ctl;	/* Direct control */
+	u8 temp_msb;	/* Temperature MSB */
+	u8 temp_lsb;	/* Temperature LSB */
+} lp8550_regs;
+
+static struct platform_device *platform_device;
+static struct backlight_device *backlight_device;
+
+static int lp8550_reg_read(u8 reg, u8 *val)
+{
+	acpi_status status;
+	acpi_handle handle;
+	struct acpi_object_list arg_list;
+	struct acpi_buffer buffer = {ACPI_ALLOCATE_BUFFER, NULL};
+	union acpi_object args[2];
+	union acpi_object *result;
+	int ret = 0;
+
+	status = acpi_get_handle(NULL, "\\_SB.PCI0.SBUS.SRDB", &handle);
+	if (ACPI_FAILURE(status)) {
+
Re: [RFC PATCH v2 4/9] crypto: qce: Add ablkcipher algorithms
Thanks for the review!

On 04/28/2014 11:00 AM, Herbert Xu wrote:
> On Mon, Apr 14, 2014 at 03:48:40PM +0300, Stanimir Varbanov wrote:
>> +	if (IS_AES(flags)) {
>> +		switch (keylen) {
>> +		case AES_KEYSIZE_128:
>> +		case AES_KEYSIZE_256:
>> +			break;
>> +		default:
>> +			goto badkey;
>
> You need to support 192 here. If the hardware doesn't do that you can
> work around it by using a software fallback.

Sure, I will make a software fallback. Thanks.

> In general you need to provide everything that is supported by the
> generic software implementation.

--
regards,
Stan
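The key-size decision discussed above can be sketched as a small standalone C function: 128- and 256-bit keys go to the hardware, 192-bit keys are routed to a software fallback, and anything else is rejected. All names (`demo_aes_pick_path()`, `enum demo_path`, the `DEMO_*` constants) are hypothetical and only illustrate the dispatch; how the qce driver actually allocates and calls its fallback is not shown in this thread.

```c
/* Key sizes in bytes; the DEMO_ prefix marks them as local to this sketch. */
#define DEMO_AES_KEYSIZE_128 16
#define DEMO_AES_KEYSIZE_192 24
#define DEMO_AES_KEYSIZE_256 32

enum demo_path {
	DEMO_PATH_HW,        /* handled by the crypto engine */
	DEMO_PATH_FALLBACK,  /* valid key, but done in software */
	DEMO_PATH_BADKEY     /* not a valid AES key length */
};

/* Decide which path a setkey() call would take for a given key length. */
static enum demo_path demo_aes_pick_path(unsigned int keylen)
{
	switch (keylen) {
	case DEMO_AES_KEYSIZE_128:
	case DEMO_AES_KEYSIZE_256:
		return DEMO_PATH_HW;
	case DEMO_AES_KEYSIZE_192:
		return DEMO_PATH_FALLBACK;
	default:
		return DEMO_PATH_BADKEY;
	}
}
```

With this shape a 192-bit key is no longer treated as a bad key, which is the behaviour the review asks for.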
Re: [PATCH] perf tests: Add static build make test
On 4/29/14, 2:33 AM, Jiri Olsa wrote:
> Adding test for building static perf build into the automated suite.
> Also available via following commands:
>
>   $ make -f tests/make make_static
>   - make_static: cd . && make -f Makefile DESTDIR=/tmp/tmp.7u5MlB4njo LDFLAGS=-static
>   $ make -f tests/make make_static_O
>   - make_static_O: cd . && make -f Makefile O=/tmp/tmp.Ay6r3wEmtX DESTDIR=/tmp/tmp.vK0KQwO0Vi LDFLAGS=-static
>
> Cc: Arnaldo Carvalho de Melo a...@kernel.org
> Cc: Corey Ashford cjash...@linux.vnet.ibm.com
> Cc: David Ahern dsah...@gmail.com
> Cc: Frederic Weisbecker fweis...@gmail.com
> Cc: Ingo Molnar mi...@kernel.org
> Cc: Namhyung Kim namhy...@kernel.org
> Cc: Paul Mackerras pau...@samba.org
> Cc: Peter Zijlstra a.p.zijls...@chello.nl
> Signed-off-by: Jiri Olsa jo...@kernel.org
> ---
>  tools/perf/tests/make | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/tests/make b/tools/perf/tests/make
> index 5daeae1..2f92d6e 100644
> --- a/tools/perf/tests/make
> +++ b/tools/perf/tests/make
> @@ -46,6 +46,7 @@ make_install_man    := install-man
>  make_install_html   := install-html
>  make_install_info   := install-info
>  make_install_pdf    := install-pdf
> +make_static         := LDFLAGS=-static
>
>  # all the NO_* variable combined
>  make_minimal        := NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1
> @@ -87,6 +88,7 @@ run += make_install_bin
>  # run += make_install_info
>  # run += make_install_pdf
>  run += make_minimal
> +run += make_static
>
>  ifneq ($(call has,ctags),)
>  run += make_tags

Acked-by: David Ahern dsah...@gmail.com
Re: [PATCH RT 2/4] net: gianfar: do not try to cleanup TX packets if they are not done
On Tue, 29 Apr 2014 10:16:51 -0400 Paul Gortmaker paul.gortma...@windriver.com wrote:

> Sanity boot tested the 3.10 rc1 on a sbc8548 (UP PPC with gianfar),
> with the one-liner added as follows:
>
> diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
> index 8aecc1d81395..b87a8c919c3e 100644
> --- a/drivers/net/ethernet/freescale/gianfar.c
> +++ b/drivers/net/ethernet/freescale/gianfar.c
> @@ -2574,6 +2574,7 @@ static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
>  	tx_queue->dirty_tx = bdp;
>
>  	netdev_tx_completed_queue(txq, howmany, bytes_sent);
> +	return howmany;

That's the change I added to 3.10-rc2. I'll post it soon if you want to test it.

-- Steve

>  }
>
>  static void gfar_schedule_cleanup(struct gfar_priv_grp *gfargrp)
Re: 64bit x86: NMI nesting still buggy?
On 04/29/2014 07:06 AM, Steven Rostedt wrote:
> On Tue, 29 Apr 2014 06:29:04 -0700 H. Peter Anvin h...@linux.intel.com wrote:
>
>>> [2] A special case can occur if an SMI handler nests inside an NMI
>>> handler and then another NMI occurs. During NMI interrupt handling,
>>> NMI interrupts are disabled, so normally NMI interrupts are serviced
>>> and completed with an IRET instruction one at a time. When the
>>> processor enters SMM while executing an NMI handler, the processor
>>> saves the SMRAM state save map but does not save the attribute to
>>> keep NMI interrupts disabled. Potentially, an NMI could be latched
>>> (while in SMM or upon exit) and serviced upon exit of SMM even though
>>> the previous NMI handler has still not completed.
>>
>> I believe [2] only applies if there is an IRET executing inside the SMM
>> handler, which should not normally be the case. It might also have been
>> addressed since that was written, but I don't know.
>
> Bad behaving BIOS? But I'm sure there's no such thing ;-)

Never...

	-hpa
[PATCH 3/3] kdb: Implement seq_file command
Combining the kdb seq_file infrastructure with its symbolic lookups allows a good sub-set of files held in pseudo filesystems to be displayed by kdb.

The seq_file command does exactly this and allows a significant subset of pseudo files to be safely examined whilst debugging (and in the hands of a brave expert an even bigger subset can be unsafely examined).

Good arguments to try with this command include: cpuinfo_op, gpiolib_seq_ops and vmalloc_op.

Signed-off-by: Daniel Thompson daniel.thomp...@linaro.org
---
 kernel/debug/kdb/kdb_main.c | 28
 1 file changed, 28 insertions(+)

diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 0b097c8..d87731c 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -1734,6 +1734,32 @@ static int kdb_mm(int argc, const char **argv)
 }

 /*
+ * kdb_seq_file - This function implements the 'seq_file' command.
+ *	seq_file address-expression
+ */
+static int kdb_seq_file(int argc, const char **argv)
+{
+	int diag;
+	unsigned long addr;
+	int nextarg;
+	long offset;
+	char *name;
+	const struct seq_operations *ops;
+
+	nextarg = 1;
+	diag = kdbgetaddrarg(argc, argv, &nextarg, &addr, &offset, &name);
+	if (diag)
+		return diag;
+
+	if (nextarg != argc+1)
+		return KDB_ARGCOUNT;
+
+	ops = (const struct seq_operations *) (addr + offset);
+	kdb_printf("Using sequence_ops at 0x%p (%s)\n", ops, name);
+	return kdb_print_seq_file(ops);
+}
+
+/*
  * kdb_go - This function implements the 'go' command.
  *	go [address-expression]
  */
@@ -2838,6 +2864,8 @@ static void __init kdb_inittab(void)
 	  "Display per_cpu variables", 3, KDB_REPEAT_NONE);
 	kdb_register_repeat("grephelp", kdb_grep_help, "",
 	  "Display help on | grep", 0, KDB_REPEAT_NONE);
+	kdb_register_repeat("seq_file", kdb_seq_file, "",
+	  "Show a seq_file using struct seq_operations", 3, KDB_REPEAT_NONE);
 }

 /* Execute any commands defined in kdb_cmds. */
--
1.9.0
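For readers unfamiliar with the iterator that the seq_file command drives, here is a simplified userspace model of walking a start/next/stop/show table. It is a sketch under stated assumptions: `struct demo_seq_ops` and `demo_print_seq()` are hypothetical and deliberately drop the `struct seq_file *` argument, buffer retry logic and locking that the kernel's real struct seq_operations callbacks deal with. It is not the kdb_print_seq_file() helper from this series.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified model of an iterator table; not the kernel's definition. */
struct demo_seq_ops {
	void *(*start)(long *pos);
	void *(*next)(void *v, long *pos);
	void  (*stop)(void *v);
	int   (*show)(void *v, char *out, size_t len);
};

/* Walk the sequence, appending each record to out; returns records shown. */
static int demo_print_seq(const struct demo_seq_ops *ops, char *out, size_t len)
{
	long pos = 0;
	size_t used = 0;
	int records = 0;
	void *v;

	if (len > 0)
		out[0] = '\0';

	v = ops->start(&pos);
	while (v) {
		int n = ops->show(v, out + used, len - used);

		/* Stop on error or when the record did not fit. */
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
		records++;
		v = ops->next(v, &pos);
	}
	ops->stop(v);
	return records;
}

/* A tiny example sequence: three fixed strings. */
static const char *demo_items[] = { "cpu0", "cpu1", "cpu2" };
#define DEMO_NITEMS ((long)(sizeof(demo_items) / sizeof(demo_items[0])))

static void *demo_start(long *pos)
{
	return *pos < DEMO_NITEMS ? (void *)&demo_items[*pos] : NULL;
}

static void *demo_next(void *v, long *pos)
{
	(void)v;
	++*pos;
	return demo_start(pos);
}

static void demo_stop(void *v)
{
	(void)v;
}

static int demo_show(void *v, char *out, size_t len)
{
	return snprintf(out, len, "%s\n", *(const char **)v);
}

static const struct demo_seq_ops demo_ops = {
	demo_start, demo_next, demo_stop, demo_show
};
```

In the kernel, start() and stop() usually bracket whatever locking the file needs, which is why running an arbitrary table from a debugger context is the risky part.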
[PATCH RT 1/4] net: gianfar: do not disable interrupts
3.10.37-rt38-rc2 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior bige...@linutronix.de

Each per-queue lock is taken with spin_lock_irqsave() except in the case where all of them are taken for some kind of serialisation. As an optimisation local_irq_save() is used so that lock_tx_qs() and lock_rx_qs() can use just the spin_lock() variant instead.
On RT local_irq_save() behaves differently so we use the nort() variant.
Lockdep screams easily by

  ethtool -K eth0 rx off tx off

What remains is missing lockdep annotation that makes lockdep think lock_tx_qs() may cause a deadlock.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior bige...@linutronix.de
Signed-off-by: Steven Rostedt rost...@goodmis.org
---
 drivers/net/ethernet/freescale/gianfar.c         | 16
 drivers/net/ethernet/freescale/gianfar_ethtool.c | 8
 drivers/net/ethernet/freescale/gianfar_sysfs.c   | 24
 3 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 0343a14..5c0efcc 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -1274,7 +1274,7 @@ static int gfar_suspend(struct device *dev)

 	if (netif_running(ndev)) {

-		local_irq_save(flags);
+		local_irq_save_nort(flags);
 		lock_tx_qs(priv);
 		lock_rx_qs(priv);

@@ -1292,7 +1292,7 @@ static int gfar_suspend(struct device *dev)

 		unlock_rx_qs(priv);
 		unlock_tx_qs(priv);
-		local_irq_restore(flags);
+		local_irq_restore_nort(flags);

 		disable_napi(priv);

@@ -1334,7 +1334,7 @@ static int gfar_resume(struct device *dev)
 	/* Disable Magic Packet mode, in case something
 	 * else woke us up.
 	 */
-	local_irq_save(flags);
+	local_irq_save_nort(flags);
 	lock_tx_qs(priv);
 	lock_rx_qs(priv);

@@ -1346,7 +1346,7 @@ static int gfar_resume(struct device *dev)

 	unlock_rx_qs(priv);
 	unlock_tx_qs(priv);
-	local_irq_restore(flags);
+	local_irq_restore_nort(flags);

 	netif_device_attach(ndev);

@@ -2346,7 +2346,7 @@ void gfar_vlan_mode(struct net_device *dev, netdev_features_t features)
 	u32 tempval;

 	regs = priv->gfargrp[0].regs;
-	local_irq_save(flags);
+	local_irq_save_nort(flags);
 	lock_rx_qs(priv);

 	if (features & NETIF_F_HW_VLAN_CTAG_TX) {
@@ -2379,7 +2379,7 @@ void gfar_vlan_mode(struct net_device *dev, netdev_features_t features)
 	gfar_change_mtu(dev, dev->mtu);

 	unlock_rx_qs(priv);
-	local_irq_restore(flags);
+	local_irq_restore_nort(flags);
 }

 static int gfar_change_mtu(struct net_device *dev, int new_mtu)
@@ -3258,14 +3258,14 @@ static irqreturn_t gfar_error(int irq, void *grp_id)
 			dev->stats.tx_dropped++;
 			atomic64_inc(&priv->extra_stats.tx_underrun);

-			local_irq_save(flags);
+			local_irq_save_nort(flags);
 			lock_tx_qs(priv);

 			/* Reactivate the Tx Queues */
 			gfar_write(&regs->tstat, gfargrp->tstat);

 			unlock_tx_qs(priv);
-			local_irq_restore(flags);
+			local_irq_restore_nort(flags);
 		}
 		netif_dbg(priv, tx_err, dev, "Transmit Error\n");
 	}
diff --git a/drivers/net/ethernet/freescale/gianfar_ethtool.c b/drivers/net/ethernet/freescale/gianfar_ethtool.c
index 21cd881..c965c0a 100644
--- a/drivers/net/ethernet/freescale/gianfar_ethtool.c
+++ b/drivers/net/ethernet/freescale/gianfar_ethtool.c
@@ -501,7 +501,7 @@ static int gfar_sringparam(struct net_device *dev,
 		/* Halt TX and RX, and process the frames which
 		 * have already been received
 		 */
-		local_irq_save(flags);
+		local_irq_save_nort(flags);
 		lock_tx_qs(priv);
 		lock_rx_qs(priv);

@@ -509,7 +509,7 @@ static int gfar_sringparam(struct net_device *dev,

 		unlock_rx_qs(priv);
 		unlock_tx_qs(priv);
-		local_irq_restore(flags);
+		local_irq_restore_nort(flags);

 		for (i = 0; i < priv->num_rx_queues; i++)
 			gfar_clean_rx_ring(priv->rx_queue[i],
@@ -552,7 +552,7 @@ int gfar_set_features(struct net_device *dev, netdev_features_t features)
 		/* Halt TX and RX, and process the frames which
 		 * have already been received
 		 */
-		local_irq_save(flags);
+		local_irq_save_nort(flags);
 		lock_tx_qs(priv);
 		lock_rx_qs(priv);

@@ -560,7
[PATCH RT 0/4] Linux 3.10.37-rt38-rc2
Dear RT Folks,

This is the RT stable review cycle of patch 3.10.37-rt38-rc2.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the final release is.

If all goes well, this patch will be converted to the next main release on 4/30/2014.

Enjoy,

-- Steve


To build 3.10.37-rt38-rc2 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.10.37.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.10/patch-3.10.37-rt38-rc2.patch.xz

You can also build from 3.10.37-rt37 by applying the incremental patch:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.10/incr/patch-3.10.37-rt37-rt38-rc2.patch.xz


Changes from 3.10.37-rt37:

---

Sebastian Andrzej Siewior (3):
      net: gianfar: do not disable interrupts
      net: gianfar: do not try to cleanup TX packets if they are not done
      rcu: make RCU_BOOST default on RT

Steven Rostedt (Red Hat) (1):
      Linux 3.10.37-rt38-rc2

 drivers/net/ethernet/freescale/gianfar.c         | 28 ++--
 drivers/net/ethernet/freescale/gianfar_ethtool.c | 8 +++
 drivers/net/ethernet/freescale/gianfar_sysfs.c   | 24 ++--
 init/Kconfig                                     | 2 +-
 localversion-rt                                  | 2 +-
 5 files changed, 34 insertions(+), 30 deletions(-)
[PATCH RT 2/4] net: gianfar: do not try to cleanup TX packets if they are not done
3.10.37-rt38-rc2 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior bige...@linutronix.de

What I observe is that the TX queue is not empty and does not make any
progress. gfar_clean_tx_ring() does not clean up the packet because it
is not completed yet. The root cause is that the DMA engine did not
start yet (it was preempted before doing so) and that dumb loop, loops
until that packet is gone.
This is broken since c233cf4 (gianfar: Fix tx napi polling).
What remains are spurious interrupts if CPU0 cleans up TX packets and
CPU1 returns with IRQ_NONE.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior bige...@linutronix.de
[ added return howmany; ]
Signed-off-by: Steven Rostedt rost...@goodmis.org
---
 drivers/net/ethernet/freescale/gianfar.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 5c0efcc..b87a8c9 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -132,7 +132,6 @@ static int gfar_poll(struct napi_struct *napi, int budget);
 static void gfar_netpoll(struct net_device *dev);
 #endif
 int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit);
-static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue);
 static void gfar_process_frame(struct net_device *dev, struct sk_buff *skb,
 			       int amount_pull, struct napi_struct *napi);
 void gfar_halt(struct net_device *dev);
@@ -2475,7 +2474,7 @@ static void gfar_align_skb(struct sk_buff *skb)
 }
 
 /* Interrupt Handler for Transmit complete */
-static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
+static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 {
 	struct net_device *dev = tx_queue->dev;
 	struct netdev_queue *txq;
@@ -2575,6 +2574,7 @@ static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 	tx_queue->dirty_tx = bdp;
 
 	netdev_tx_completed_queue(txq, howmany, bytes_sent);
+	return howmany;
 }
 
 static void gfar_schedule_cleanup(struct gfar_priv_grp *gfargrp)
@@ -2856,10 +2856,14 @@ static int gfar_poll(struct napi_struct *napi, int budget)
 		tx_queue = priv->tx_queue[i];
 		/* run Tx cleanup to completion */
 		if (tx_queue->tx_skbuff[tx_queue->skb_dirtytx]) {
-			gfar_clean_tx_ring(tx_queue);
-			has_tx_work = 1;
+			int ret;
+
+			ret = gfar_clean_tx_ring(tx_queue);
+			if (ret)
+				has_tx_work++;
 		}
 	}
+	work_done += has_tx_work;
 
 	for_each_set_bit(i, &gfargrp->rx_bit_map, priv->num_rx_queues) {
 		/* skip queue if not active */
-- 
1.8.5.3
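The behaviour change in the patch above can be illustrated with a minimal userspace sketch. It assumes a toy ring in place of the real gianfar descriptors; every toy_* name is hypothetical and only models the idea that the poll step reports TX work when the cleanup actually reclaimed something, instead of whenever a packet is merely pending.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

/* Hypothetical stand-in for a TX ring; not the gianfar structures. */
struct toy_tx_ring {
	bool done[TOY_RING_SIZE];	/* done[i]: hardware finished descriptor i */
	size_t dirty;			/* first descriptor not yet reclaimed */
	size_t pending;			/* descriptors queued for transmit */
};

/* Reclaim completed descriptors and return how many were reclaimed. */
static int toy_clean_tx_ring(struct toy_tx_ring *r)
{
	int howmany = 0;

	while (r->pending > 0 && r->done[r->dirty]) {
		r->done[r->dirty] = false;
		r->dirty = (r->dirty + 1) % TOY_RING_SIZE;
		r->pending--;
		howmany++;
	}
	return howmany;
}

/* Poll step: count TX work only if the cleanup made progress. */
static int toy_poll_tx(struct toy_tx_ring *r)
{
	int has_tx_work = 0;

	if (r->pending > 0 && toy_clean_tx_ring(r))
		has_tx_work++;
	return has_tx_work;
}
```

With a packet queued but not yet completed, toy_poll_tx() returns 0, so a caller would not keep spinning on a descriptor the hardware has not started to process.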
Re: [RFC PATCH v2 4/9] crypto: qce: Add ablkcipher algorithms
Thanks for the review!

On 04/28/2014 11:18 AM, Herbert Xu wrote:
> On Mon, Apr 14, 2014 at 03:48:40PM +0300, Stanimir Varbanov wrote:
>> +	} else if (IS_DES(flags)) {
>> +		u32 tmp[DES_EXPKEY_WORDS];
>> +
>> +		if (keylen != QCE_DES_KEY_SIZE)
>> +			goto badkey;
>
> No need to check here since you've already set min_keysize and
> max_keysize correctly.
>
>> +	} else if (IS_3DES(flags)) {
>> +		if (keylen != DES3_EDE_KEY_SIZE)
>> +			goto badkey;
>
> Ditto.

OK, I will delete those needless keylen checks.

-- 
regards,
Stan
[PATCH 0/3] kdb: Infrastructure to display sequence files
This patchset started out as a simple patch to introduce the irqs
command from Android's FIQ debugger to kdb. However it has since grown
more powerful because allowing kdb to reuse existing kernel
infrastructure gives us extra opportunities.

Based on the comments at the top of irqdesc.h (plotting to take the
irq_desc structure private to kernel/irq) and the relative similarity
between FIQ debugger's irqs command and the contents of
/proc/interrupts we start by adding a kdb feature to print seq_files.
This forms the foundation for a new command, interrupts.

I have also been able to implement a much more generic command,
seq_file, that can display a good number of files from pseudo
filesystems. This command is very powerful although that power does
mean care must be taken to deploy it safely. It is deliberately and by
default aimed at your foot!

Note that the risk associated with the seq_file command is why I
implemented the interrupts command in C (in principle it could have
been a kdb macro). Doing it in C codifies the need for
show_interrupts() to continue using spin locks as its locking strategy.

To give an idea of what can be done with this command, the following
seq_operations structures worked correctly and reported no errors:

  cpuinfo_op
  extfrag_op
  fragmentation_op
  gpiolib_seq_ops
  int_seq_ops (a.k.a. /proc/interrupts)
  pagetypeinfo_op
  unusable_op
  vmalloc_op
  zoneinfo_op

The following display the information correctly but triggered errors
(sleeping function called from invalid context) with lock debugging
enabled:

  consoles_op
  crypto_seq_ops
  diskstats_op
  partitions_op
  slabinfo_op
  vmstat_op

All tests are run on an ARM multi_v7_defconfig kernel (plus lots of
debug features) and halted using magic SysRq so that kdb has interrupt
context. Note also that some of the seq_operations structures hook into
driver supplied code that will only be called if that driver is enabled
so the tests above are useful but cannot be exhaustive.

Daniel Thompson (3):
  kdb: Add framework to display sequence files
  proc: Provide access to /proc/interrupts from kdb
  kdb: Implement seq_file command

 fs/proc/interrupts.c        | 10 ++++++++++
 include/linux/kdb.h         |  3 +++
 kernel/debug/kdb/kdb_io.c   | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/debug/kdb/kdb_main.c | 28 ++++++++++++++++++++++++++++
 4 files changed, 92 insertions(+)

-- 
1.9.0
[PATCH 2/3] proc: Provide access to /proc/interrupts from kdb
The contents of /proc/interrupts is useful to diagnose problems during
boot up or when the system becomes unresponsive (or at least it can be
if failure is caused by interrupt problems). This command is also seen
in out-of-tree debug systems such as Android's FIQ debugger. This change
allows the file to be displayed from kdb.

Signed-off-by: Daniel Thompson daniel.thomp...@linaro.org
---
 fs/proc/interrupts.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/proc/interrupts.c b/fs/proc/interrupts.c
index a352d57..1f8eeaf 100644
--- a/fs/proc/interrupts.c
+++ b/fs/proc/interrupts.c
@@ -4,6 +4,7 @@
 #include <linux/irqnr.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
+#include <linux/kdb.h>
 
 /*
  * /proc/interrupts
@@ -45,9 +46,18 @@ static const struct file_operations proc_interrupts_operations = {
 	.release	= seq_release,
 };
 
+#ifdef CONFIG_KGDB_KDB
+static int kdb_interrupts(int argc, const char **argv)
+{
+	return kdb_print_seq_file(&int_seq_ops);
+}
+#endif
+
 static int __init proc_interrupts_init(void)
 {
 	proc_create("interrupts", 0, NULL, &proc_interrupts_operations);
+	kdb_register("interrupts", kdb_interrupts, "",
+		     "Show /proc/interrupts", 3);
 	return 0;
 }
 fs_initcall(proc_interrupts_init);
-- 
1.9.0
[PATCH RT 3/4] rcu: make RCU_BOOST default on RT
3.10.37-rt38-rc2 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior bige...@linutronix.de

Since it is no longer invoked from the softirq people run into OOM more
often if the priority of the RCU thread is too low. Making boosting
default on RT should help in those cases and it can be switched off if
someone knows better.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior bige...@linutronix.de
Signed-off-by: Steven Rostedt rost...@goodmis.org
---
 init/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/init/Kconfig b/init/Kconfig
index 6c3a4fd..bd3612d 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -604,7 +604,7 @@ config TREE_RCU_TRACE
 config RCU_BOOST
 	bool "Enable RCU priority boosting"
 	depends on RT_MUTEXES && PREEMPT_RCU
-	default n
+	default y if PREEMPT_RT_FULL
 	help
 	  This option boosts the priority of preempted RCU readers that
 	  block the current preemptible RCU grace period for too long.
-- 
1.8.5.3
Re: [RFC PATCH 2/2 with seqcount v3] reservation: add suppport for read-only access using rcu
On 23-04-14 13:15, Maarten Lankhorst wrote:
> This adds 4 more functions to deal with rcu.
>
> reservation_object_get_fences_rcu() will obtain the list of shared
> and exclusive fences without obtaining the ww_mutex.
>
> reservation_object_wait_timeout_rcu() will wait on all fences of the
> reservation_object, without obtaining the ww_mutex.
>
> reservation_object_test_signaled_rcu() will test if all fences of the
> reservation_object are signaled without using the ww_mutex.
>
> reservation_object_get_excl() is added because touching the fence_excl
> member directly will trigger a sparse warning.
>
> Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
> ---
> Using seqcount and fixing some lockdep bugs.
> Changes since v2:
> - Fix some crashes, remove some unneeded barriers when provided by
>   seqcount writes
> - Fix code to work correctly with sparse's RCU annotations.
> - Create a global string for the seqcount lock to make lockdep happy.
>
> Can I get this version reviewed? If it looks correct I'll mail the full
> series because it's intertwined with the TTM conversion to use this code.

Ping, can anyone review this?
Re: 64bit x86: NMI nesting still buggy?
On Tue, 29 Apr 2014 06:29:04 -0700 H. Peter Anvin h...@linux.intel.com wrote: On 04/29/2014 06:05 AM, Jiri Kosina wrote: We were not able to come up with any other fix than avoiding using IST completely on x86_64, and instead going back to stack switching in software -- the same way 32bit x86 does. This is not possible, though, because there are several windows during which if we were to take an exception which doesn't do IST, e.g. NMI, we are worse than dead -- we are in fact rootable. Right after SYSCALL in particular. Ah, right. SYSCALL does not update RSP. :-( Hm, so anything that can fire up right after a SYSCALL must use IST. It's possible to use an alternative IDT that gets loaded as the first thing in an NMI handler, but this gets incredibly ugly... So basically, I have two questions: (1) is the above analysis correct? (if not, why?) (2) if it is correct, is there any other option for fix than avoiding using IST for exception stack switching, and having kernel do the legacy task switching (the same way x86_32 is doing)? It is not an option, see above. [1] http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf [2] A special case can occur if an SMI handler nests inside an NMI handler and then another NMI occurs. During NMI interrupt handling, NMI interrupts are disabled, so normally NMI interrupts are serviced and completed with an IRET instruction one at a time. When the processor enters SMM while executing an NMI handler, the processor saves the SMRAM state save map but does not save the attribute to keep NMI interrupts disabled. Potentially, an NMI could be latched (while in SMM or upon exit) and serviced upon exit of SMM even though the previous NMI handler has still not completed. I believe [2] only applies if there is an IRET executing inside the SMM handler, which should not normally be the case. It might also have been addressed since that was written, but I don't know. 
The trouble here is that the official Intel documentation describes how to do this and specifically requests the OS to cope with nested NMIs. Petr T
ioctl CAP_LINUX_IMMUTABLE is checked in the wrong namespace
Hello,
when using user namespaces I found a bug in the capability checks done
by ioctl.

If someone tries to use chattr +i while in a different user namespace it
will get the following:

ioctl(3, EXT2_IOC_SETFLAGS, 0x7fffa4fedacc) = -1 EPERM (Operation not permitted)

I'm proposing a fix to this, by replacing the capable(CAP_LINUX_IMMUTABLE)
check with ns_capable(current_cred()->user_ns, CAP_LINUX_IMMUTABLE).

If you agree I can send patches for all filesystems.

I'm proposing the following patch:

---
 fs/ext4/ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index d011b69..25683d0 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -265,7 +265,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		 * This test looks nicer. Thanks to Pauline Middelink
 		 */
 		if ((flags ^ oldflags) & (EXT4_APPEND_FL | EXT4_IMMUTABLE_FL)) {
-			if (!capable(CAP_LINUX_IMMUTABLE))
+			if (!ns_capable(current_cred()->user_ns, CAP_LINUX_IMMUTABLE))
 				goto flags_out;
 		}
 
-- 
1.8.4

-- 
Marian Marinov
Founder & CEO of 1H Ltd.
Jabber/GTalk: hack...@jabber.org
ICQ: 7556201
Mobile: +359 886 660 270
Re: [PATCH v2 00/10] arm64: UEFI support
On 04/29/2014 04:43 AM, Matt Fleming wrote:
(Pulling in Peter and Stephen)

On Tue, 29 Apr, at 11:28:17AM, Catalin Marinas wrote:
The patches look fine to me, they've been through several rounds of review already. How do we propose these get merged as the series contains both generic and arm64 patches? And there are dependencies already in linux-next. Are the EFI patches in -next pulled from some non-rebaseable branch?

Peter suggested a plan when he took the generic EFI stuff that's in tip (and hence currently in linux-next),

It doesn't hurt to inform Stephen, although I think it will simply fall out automatically since he uses git to merge and git will recognize the graph. During the merge window, it means they should not push their patches until Linus has accepted the precondition patches from the tip tree. Since Ingo and I try to push most of the tip tree as early as possible in the merge window, this is usually not a problem.

So we currently have the prerequisites in tip/x86/efi, and assuming that this 10-patch series gets merged into a single branch somewhere, things should work automatically for linux-next.

It may be prudent to negotiate a plan now for when the merge window opens because, as Peter mentions above, the stuff in tip/x86/efi needs to be merged by Linus first to avoid build breakage with the arm64 stuff. Whomever is going to push the arm64 stuff just needs to be aware of this constraint.

Again, since we tend to push -tip very early in the merge window, unless there are problems or late additions, this is unlikely to be a problem in any way.

	-hpa
[PATCH RT 4/4] Linux 3.10.37-rt38-rc2
3.10.37-rt38-rc2 stable review patch.
If anyone has any objections, please let me know.

--

From: Steven Rostedt (Red Hat) rost...@goodmis.org

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index a3b2408..43245dc 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt37
+-rt38-rc2
-- 
1.8.5.3
[PATCH 1/3] kdb: Add framework to display sequence files
Lots of useful information about the system is held in pseudo filesystems
and presented using the seq_file mechanism. Unfortunately during both
boot up and kernel panic (both good times to break out kdb) it is
difficult to examine these files. This patch introduces a means to
display sequence files via kdb.

Signed-off-by: Daniel Thompson daniel.thomp...@linaro.org
---
 include/linux/kdb.h       |  3 +++
 kernel/debug/kdb/kdb_io.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/include/linux/kdb.h b/include/linux/kdb.h
index 290db12..2607893 100644
--- a/include/linux/kdb.h
+++ b/include/linux/kdb.h
@@ -25,6 +25,7 @@ typedef int (*kdb_func_t)(int, const char **);
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/atomic.h>
+#include <linux/seq_file.h>
 
 #define KDB_POLL_FUNC_MAX	5
 extern int kdb_poll_idx;
@@ -117,6 +118,8 @@ extern __printf(1, 0) int vkdb_printf(const char *fmt, va_list args);
 extern __printf(1, 2) int kdb_printf(const char *, ...);
 typedef __printf(1, 2) int (*kdb_printf_t)(const char *, ...);
 
+extern int kdb_print_seq_file(const struct seq_operations *ops);
+
 extern void kdb_init(int level);
 
 /* Access to kdb specific polling devices */
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 14ff484..c68c223 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -850,3 +850,54 @@ int kdb_printf(const char *fmt, ...)
 	return r;
 }
 EXPORT_SYMBOL_GPL(kdb_printf);
+
+/*
+ * Display a seq_file on the kdb console.
+ */
+
+static int __kdb_print_seq_file(struct seq_file *m, void *v)
+{
+	int i, res;
+
+	res = m->op->show(m, v);
+	if (0 != res)
+		return KDB_BADLENGTH;
+
+	for (i = 0; i < m->count && !KDB_FLAG(CMD_INTERRUPT); i++)
+		kdb_printf("%c", m->buf[i]);
+	m->count = 0;
+
+	return 0;
+}
+
+int kdb_print_seq_file(const struct seq_operations *ops)
+{
+	static char seq_buf[4096];
+	static DEFINE_SPINLOCK(seq_buf_lock);
+	unsigned long flags;
+	struct seq_file m = {
+		.buf = seq_buf,
+		.size = sizeof(seq_buf),
+		/* .lock is deliberately uninitialized to help reveal
+		 * unsupportable show methods
+		 */
+		.op = ops,
+	};
+	loff_t pos = 0;
+	void *v;
+	int res = 0;
+
+	v = ops->start(&m, &pos);
+	while (v) {
+		spin_lock_irqsave(&seq_buf_lock, flags);
+		res = __kdb_print_seq_file(&m, v);
+		spin_unlock_irqrestore(&seq_buf_lock, flags);
+		if (res != 0 || KDB_FLAG(CMD_INTERRUPT))
+			break;
+
+		v = ops->next(&m, v, &pos);
+	}
+	ops->stop(&m, v);
+
+	return res;
+}
-- 
1.9.0
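The start/show/next/stop walk used by kdb_print_seq_file() can be tried outside the kernel with a minimal sketch. It assumes a cut-down callback table; struct toy_seq_ops and the toy_* helpers are hypothetical and only mirror the iteration order, not the real seq_file API, locking, or buffer handling.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical miniature of the seq_operations callbacks. */
struct toy_seq_ops {
	void *(*start)(void *priv, long *pos);
	void *(*next)(void *priv, void *v, long *pos);
	void (*stop)(void *priv, void *v);
	int (*show)(char *buf, size_t size, void *v);
};

/* Walk every record; return the number shown, or -1 if show() failed. */
static int toy_print_seq(const struct toy_seq_ops *ops, void *priv, FILE *out)
{
	char buf[128];
	long pos = 0;
	int shown = 0;
	void *v = ops->start(priv, &pos);

	while (v) {
		if (ops->show(buf, sizeof(buf), v) != 0) {
			shown = -1;
			break;
		}
		fputs(buf, out);
		shown++;
		v = ops->next(priv, v, &pos);
	}
	ops->stop(priv, v);	/* stop() runs on every path, as in the patch */
	return shown;
}

/* Example iterator over a fixed array of three counters. */
static int toy_counts[3] = { 10, 20, 30 };

static void *toy_start(void *priv, long *pos)
{
	(void)priv;
	return (*pos >= 0 && *pos < 3) ? &toy_counts[*pos] : NULL;
}

static void *toy_next(void *priv, void *v, long *pos)
{
	(void)v;
	++*pos;
	return toy_start(priv, pos);
}

static void toy_stop(void *priv, void *v)
{
	(void)priv;
	(void)v;
}

static int toy_show(char *buf, size_t size, void *v)
{
	int n = snprintf(buf, size, "%d\n", *(int *)v);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

static const struct toy_seq_ops toy_ops = {
	.start = toy_start,
	.next = toy_next,
	.stop = toy_stop,
	.show = toy_show,
};
```

Calling toy_print_seq(&toy_ops, NULL, stdout) walks the three records in order. The real function additionally takes a spinlock around show() and stops early when kdb signals an interrupt, which this sketch leaves out.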
Re: [PATCH v2 00/10] arm64: UEFI support
On 04/29/2014 06:47 AM, Catalin Marinas wrote:
> Waiting for the tip/x86/efi to be merged first is not a problem. We also
> need a stable base for testing the arm64 UEFI series, so I assume this
> series can be based onto tip/x86/efi (would such branch be rebased
> before hitting mainline?).
>
> Given that Leif's series contains both generic efi and arm64 patches,
> what's your preference for merging them? I'm happy to add my ack and
> they go via your tree (or the other way around).

tip:x86/efi will not be rebased (barring major unforeseen events).

I'm not opposed to pushing the arm64 patches through -tip (via Matt), if
it works with your workflow, either. Perhaps we need to rename the
branch to tip:core/efi...

	-hpa
Re: [RFC PATCH v2 1/9] crypto: qce: Add core driver implementation
Thanks for the review!

On 04/28/2014 11:50 AM, Herbert Xu wrote:
> On Mon, Apr 14, 2014 at 03:48:37PM +0300, Stanimir Varbanov wrote:
>> +	if (backlog)
>> +		backlog->complete(backlog, -EINPROGRESS);
>
> The completion function needs to be called with BH disabled.
>
> Cheers,

This is new for me because I saw similar code in cryptd.c where in
cryptd_queue_worker() (workqueue context) the backlog->complete() is
called outside of local_bh_disable().

-- 
regards,
Stan
Re: [PATCH v2 00/12] scsi/NCR5380: fix debugging macros and #include structure
On Tue, 2014-04-29 at 15:15 +1200, Michael Schmitz wrote:
> Finn,
>
> On Tue, Apr 29, 2014 at 2:22 PM, Finn Thain fth...@telegraphics.com.au wrote:
>> On Sat, 26 Apr 2014, James Bottomley wrote:
>>> OK, so this is a pretty big change to an unmaintained driver. I'll
>>> take it if you're willing to maintain the driver afterwards ... in
>>> which case I need another patch to add you to the MAINTAINERS file.
>>
>> Sure, I'm happy to support these patches and future work I plan to do
>> on the driver. What additional responsibilities would come with adding
>> my name to the MAINTAINERS file?
>>
>> Perhaps Michael and Sam would be interested in sharing the role, for
>> atari and sun3 NCR5380 drivers (?)
>
> If you insist ... (kidding - I'm OK with it if James thinks it's worth
> it)

As long as you understand how it works and how to fix it, the more the
merrier. It gives me more people to yell at if something goes wrong
with the driver.

James
Re: [PATCH v3 4/7] of: configure the platform device dma parameters
On Thu, 24 Apr 2014 11:30:04 -0400, Santosh Shilimkar santosh.shilim...@ti.com wrote:
> Retrieve DMA configuration from DT and setup platform device's DMA
> parameters. The DMA configuration in DT has to be specified using
> dma-ranges and dma-coherent properties if supported.
>
> We setup dma_pfn_offset using dma-ranges and dma_coherent_ops
> using dma-coherent device tree properties.
>
> The set_arch_dma_coherent_ops macro has to be defined by arch if
> it supports coherent dma_ops. Otherwise, set_arch_dma_coherent_ops() is
> declared as nop.
>
> Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
> Cc: Russell King li...@arm.linux.org.uk
> Cc: Arnd Bergmann a...@arndb.de
> Cc: Olof Johansson o...@lixom.net
> Cc: Grant Likely grant.lik...@linaro.org
> Cc: Rob Herring robh...@kernel.org
> Cc: Catalin Marinas catalin.mari...@arm.com
> Cc: Linus Walleij linus.wall...@linaro.org
> Signed-off-by: Grygorii Strashko grygorii.stras...@ti.com
> Signed-off-by: Santosh Shilimkar santosh.shilim...@ti.com
> ---
>  drivers/of/platform.c       | 48 +++++++++++++++++++++++++++++++++++++++++++++---
>  include/linux/dma-mapping.h |  7 +++++++
>  2 files changed, 52 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 48de98f..270c0b9 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -187,6 +187,50 @@ struct platform_device *of_device_alloc(struct device_node *np,
>  EXPORT_SYMBOL(of_device_alloc);
>  
>  /**
> + * of_dma_configure - Setup DMA configuration
> + * @dev:	Device to apply DMA configuration
> + *
> + * Try to get devices's DMA configuration from DT and update it
> + * accordingly.
> + *
> + * In case if platform code need to use own special DMA configuration,it
> + * can use Platform bus notifier and handle BUS_NOTIFY_ADD_DEVICE event
> + * to fix up DMA configuration.
> + */
> +static void of_dma_configure(struct device *dev)
> +{
> +	u64 dma_addr, paddr, size;
> +	int ret;
> +
> +	dev->coherent_dma_mask = DMA_BIT_MASK(32);
> +	if (!dev->dma_mask)
> +		dev->dma_mask = &dev->coherent_dma_mask;
> +
> +	/*
> +	 * if dma-coherent property exist, call arch hook to setup
> +	 * dma coherent operations.
> +	 */
> +	if (of_dma_is_coherent(dev->of_node)) {
> +		set_arch_dma_coherent_ops(dev);
> +		dev_dbg(dev, "device is dma coherent\n");
> +	}
> +
> +	/*
> +	 * if dma-ranges property doesn't exist - just return else
> +	 * setup the dma offset
> +	 */
> +	ret = of_dma_get_range(dev->of_node, &dma_addr, &paddr, &size);
> +	if ((ret == -ENODEV) || (ret < 0)) {
> +		dev_dbg(dev, "no dma range information to setup\n");
> +		return;
> +	}
> +
> +	/* DMA ranges found. Calculate and set dma_pfn_offset */
> +	dev->dma_pfn_offset = PFN_DOWN(paddr - dma_addr);
> +	dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", dev->dma_pfn_offset);

I've got two concerns here. of_dma_get_range() retrieves only the first
tuple from the dma-ranges property, but it is perfectly valid for
dma-ranges to contain multiple tuples. How should we handle it if a
device has multiple ranges it can DMA from?

Second, while the pfn offset is being determined, I don't see anything
making use of either the base address or size. How is the device
constrained to only getting DMA buffers from within that range? Is the
driver expected to manage that directly?

g.

> +}
> +
> +/**
>   * of_platform_device_create_pdata - Alloc, initialize and register an of_device
>   * @np: pointer to node to create device for
>   * @bus_id: name to assign device
> @@ -214,9 +258,7 @@ static struct platform_device *of_platform_device_create_pdata(
>  #if defined(CONFIG_MICROBLAZE)
>  	dev->archdata.dma_mask = 0xUL;
>  #endif
> -	dev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
> -	if (!dev->dev.dma_mask)
> -		dev->dev.dma_mask = &dev->dev.coherent_dma_mask;
> +	of_dma_configure(&dev->dev);
>  	dev->dev.bus = &platform_bus_type;
>  	dev->dev.platform_data = platform_data;
>  
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index fd4aee2..c7d9b1b 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -123,6 +123,13 @@ static inline int dma_coerce_mask_and_coherent(struct device *dev, u64 mask)
>  
>  extern u64 dma_get_required_mask(struct device *dev);
>  
> +#ifndef set_arch_dma_coherent_ops
> +static inline int set_arch_dma_coherent_ops(struct device *dev)
> +{
> +	return 0;
> +}
> +#endif
> +
>  static inline unsigned int dma_get_max_seg_size(struct device *dev)
>  {
>  	return dev->dma_parms ? dev->dma_parms->max_segment_size : 65536;
> --
> 1.7.9.5
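The PFN_DOWN(paddr - dma_addr) arithmetic, and the range question raised in the review, can be worked through with a small userspace sketch. It assumes 4 KiB pages and uses made-up addresses; the toy_* helpers are hypothetical and are not kernel API. toy_dma_range_ok() in particular only illustrates the kind of check the second concern asks about, it does not exist in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Same arithmetic as PFN_DOWN(paddr - dma_addr) in the patch. */
static uint64_t toy_dma_pfn_offset(uint64_t paddr, uint64_t dma_addr)
{
	return (paddr - dma_addr) >> TOY_PAGE_SHIFT;
}

/* Translate a CPU physical address into the address the device sees. */
static uint64_t toy_phys_to_dma(uint64_t phys, uint64_t pfn_offset)
{
	return phys - (pfn_offset << TOY_PAGE_SHIFT);
}

/*
 * Hypothetical check: does [phys, phys + len) lie inside the window
 * [paddr, paddr + size)? Written so that no addition can overflow.
 */
static bool toy_dma_range_ok(uint64_t phys, uint64_t len,
			     uint64_t paddr, uint64_t size)
{
	return phys >= paddr && len <= size && phys - paddr <= size - len;
}
```

For an invented window with paddr 0x800000000, dma_addr 0x80000000 and size 0x80000000, the offset comes out as 0x780000 page frames, and physical address 0x800001000 maps to device address 0x80001000.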
Re: [PATCH v4 2/7] arm64: Decouple page size from level of translation tables
Jungseok,

On Tue, Apr 29, 2014 at 05:59:20AM +0100, Jungseok Lee wrote:
> +choice
> +	prompt "Level of translation tables"
> +	default ARM64_3_LEVELS if ARM64_4K_PAGES
> +	default ARM64_2_LEVELS if ARM64_64K_PAGES
> +	help
> +	  Allows level of translation tables.
> +
> +config ARM64_2_LEVELS
> +	bool "2 level"
> +	depends on ARM64_64K_PAGES
> +	help
> +	  This feature enables 2 levels of translation tables.
> +
> +config ARM64_3_LEVELS
> +	bool "3 level"
> +	depends on ARM64_4K_PAGES
> +	help
> +	  This feature enables 3 levels of translation tables.
> +
> +endchoice

As I mentioned previously
(http://www.spinics.net/linux/lists/arm-kernel/msg319552.html), just
expose options for the VA space bits rather than the number of levels.
You can still keep the number of levels config options but not visible
in menuconfig (though I think you could also hide them in some header
and avoid config altogether).

The VA bits config options can be:

VA_BITS_39 if 4K (3 levels)
VA_BITS_42 if 64K (2 levels)
VA_BITS_47 if 16K (3 levels)
VA_BITS_48 if 4K || 16K || 64K (4/4/3 levels depending on page size)

That's more meaningful to people configuring the kernel.

-- 
Catalin
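The relation between page size, number of levels and VA bits in the list above can be checked with a short sketch. It assumes 8-byte table entries, so each level resolves (page_shift - 3) bits, and a 48-bit ceiling on the virtual address width; toy_va_bits() is a hypothetical helper, not something from the patch series.

```c
/* Virtual address bits covered by a given page shift and level count. */
static int toy_va_bits(int page_shift, int levels)
{
	int bits_per_level = page_shift - 3;	/* 8-byte table entries */
	int bits = page_shift + levels * bits_per_level;

	return bits > 48 ? 48 : bits;	/* assumption: VA width capped at 48 */
}
```

With page shifts of 12, 16 and 14 for 4K, 64K and 16K pages, this gives 39 for 4K with 3 levels, 42 for 64K with 2 levels and 47 for 16K with 3 levels, and the three 48-bit combinations (4K/4, 16K/4, 64K/3) all reach the cap.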
RE: [PATCH 5/5] powercap/rapl: change floor frequency for vallewview
-Original Message- From: Jacob Pan [mailto:jacob.jun@linux.intel.com] Sent: Tuesday, April 29, 2014 6:33 PM To: R, Durgadoss Cc: Linux PM; Wysocki, Rafael J; LKML; David E. Box; Alan Cox; Accardi, Kristen C Subject: Re: [PATCH 5/5] powercap/rapl: change floor frequency for vallewview On Tue, 29 Apr 2014 02:45:22 + R, Durgadoss durgados...@intel.com wrote: Hi Jacob, -Original Message- From: Jacob Pan [mailto:jacob.jun@linux.intel.com] Sent: Monday, April 28, 2014 7:35 PM To: Linux PM; Wysocki, Rafael J; LKML Cc: David E. Box; Alan Cox; R, Durgadoss; Accardi, Kristen C; Jacob Pan Subject: [PATCH 5/5] powercap/rapl: change floor frequency for vallewview RAPL power limit reduce power by limiting CPU P-state and other techniques. On Valleyview, RAPL power limit cannot go to LFM (low frequency mode) if we don't set the floor frequency via IOSF mailbox. This patch enables setting of floor frquency such that RAPL power limit is more effective. Signed-off-by: Jacob Pan jacob.jun@linux.intel.com --- drivers/powercap/intel_rapl.c | 27 +++ 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c index b1cda6f..13e4776 100644 --- a/drivers/powercap/intel_rapl.c +++ b/drivers/powercap/intel_rapl.c @@ -32,6 +32,7 @@ #include asm/processor.h #include asm/cpu_device_id.h +#include asm/iosf_mbi.h /* bitmasks for RAPL MSRs, used by primitive access functions */ #define ENERGY_STATUS_MASK 0x @@ -336,11 +337,17 @@ static int find_nr_power_limit(struct rapl_domain *rd) return i; } +#define VLV_CPU_POWER_BUDGET_CTL (0x2) +static const struct x86_cpu_id valleyview_id[] = { + { X86_VENDOR_INTEL, 6, 0x37}, + {} +}; There are other platforms that have this FloorFreq register as well. And those addresses are not '0x02'. So, we need to have a cpu_id based table to define the address of the floor freq register as well. [This is not specific to valleyview.] Sounds like I need to add an abstraction to capture this. 
So far, there are only two exceptions so i was hesitate to do so. Thanks for the input. Yes, We at least have few platforms that need this. Also, is there a plan to expose this floor freq ratio through Sysfs for runtime configuration. ? May be through a standard thermal cooling device interface ? why would that be necessary? who will use it? floor freq only affects RAPL, AFAIK. In Linux there is no guaranteed freq anyway. My original patch to enable RAPL as cooling device was abandoned in favor of powercap framework, I am not sure if we should go back. There are user space thermal controls which change RAPL Power limits according to platform's thermal condition as you might be aware. The floor frequency is not used only to transition to LFM ratio. We can transition to any frequency ratio by adjusting this floor frequency (at least on VLV and couple more platforms) Hence while changing RAPL Power Limits, there is a need to adjust this also, to specify which ratio is our Floor (basically we will not go below that). That's why we need an interface for modifying this at run time (along with Power Limits). Thanks, Durga + static int set_domain_enable(struct powercap_zone *power_zone, bool mode) { struct rapl_domain *rd = power_zone_to_rapl_domain(power_zone); int nr_powerlimit; - + u32 mdata = 0; if (rd-state DOMAIN_STATE_BIOS_LOCKED) return -EACCES; get_online_cpus(); @@ -350,7 +357,16 @@ static int set_domain_enable(struct powercap_zone *power_zone, bool mode) /* always enable clamp such that p-state can go below OS requested * range. power capping priority over guranteed frequency. 
*/ - rapl_write_data_raw(rd, PL1_CLAMP, mode); + if (x86_match_cpu(valleyview_id)) { + iosf_mbi_read(BT_MBI_UNIT_PMC, BT_MBI_PMC_READ, + VLV_CPU_POWER_BUDGET_CTL, mdata); + mdata = ~(0x7f 8); + mdata |= 1 8; + iosf_mbi_write(BT_MBI_UNIT_PMC, BT_MBI_PMC_WRITE, + VLV_CPU_POWER_BUDGET_CTL, mdata); + } else + rapl_write_data_raw(rd, PL1_CLAMP, mode); + /* some domains have pl2 */ if (nr_powerlimit 1) { rapl_write_data_raw(rd, PL2_ENABLE, mode); @@ -833,11 +849,6 @@ static int rapl_write_data_raw(struct rapl_domain *rd, return 0; } -static const struct x86_cpu_id energy_unit_quirk_ids[] = { - { X86_VENDOR_INTEL, 6, 0x37},/* Valleyview */ - {} -}; Same thing here. There are other Atom platforms that need this conversion quirk. So, please keep the table as is. Thanks, Durga - static int rapl_check_unit(struct rapl_package *rp, int cpu) { u64 msr_val; @@ -859,7 +870,7 @@ static int
Re: [PATCH v4 3/7] arm64: Introduce a kernel configuration option for VA_BITS
On Tue, Apr 29, 2014 at 05:59:23AM +0100, Jungseok Lee wrote:
> +config ARM64_VA_BITS
> +	int "Virtual address space size"
> +	range 39 39 if ARM64_4K_PAGES && ARM64_3_LEVELS
> +	range 42 42 if ARM64_64K_PAGES && ARM64_2_LEVELS
> +	help
> +	  This feature is determined by a combination of page size and
> +	  level of translation tables.

OK, so you are doing the VA bits selection already. But see my other
email about only exposing this and hiding the number of levels (though
number of levels can be mentioned in the help).

-- 
Catalin
Re: [PATCH 4/4] nohz: Fix iowait overcounting if iowait task migrates
On Thu, Apr 24, 2014 at 08:45:58PM +0200, Denys Vlasenko wrote: Before this change, if last IO-blocked task wakes up on a different CPU, the original CPU may stay idle for much longer, and the entire time it stays idle is accounted as iowait time. This change adds struct tick_sched::iowait_exittime member. On entry to idle, it is set to KTIME_MAX. Last IO-blocked task, if migrated, sets it to current time. Note that this can happen only once per each idle period: new iowaiting tasks can't magically appear on idle CPU's rq. If iowait_exittime is set, then (iowait_exittime - idle_entrytime) gets accounted as iowait, and the remaining (now - iowait_exittime) as true idle. Run-tested: /proc/stat counters no longer go backwards. Signed-off-by: Denys Vlasenko dvlas...@redhat.com Cc: Frederic Weisbecker fweis...@gmail.com Cc: Hidetoshi Seto seto.hideto...@jp.fujitsu.com Cc: Fernando Luis Vazquez Cao fernando...@lab.ntt.co.jp Cc: Tetsuo Handa penguin-ker...@i-love.sakura.ne.jp Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Andrew Morton a...@linux-foundation.org Cc: Arjan van de Ven ar...@linux.intel.com Cc: Oleg Nesterov o...@redhat.com --- include/linux/tick.h | 2 ++ kernel/sched/core.c | 14 +++ kernel/time/tick-sched.c | 64 3 files changed, 70 insertions(+), 10 deletions(-) diff --git a/include/linux/tick.h b/include/linux/tick.h index 4de1f9e..1bf653e 100644 --- a/include/linux/tick.h +++ b/include/linux/tick.h @@ -67,6 +67,7 @@ struct tick_sched { ktime_t idle_exittime; ktime_t idle_sleeptime; ktime_t iowait_sleeptime; + ktime_t iowait_exittime; seqcount_t idle_sleeptime_seq; ktime_t sleep_length; unsigned long last_jiffies; @@ -140,6 +141,7 @@ extern void tick_nohz_irq_exit(void); extern ktime_t tick_nohz_get_sleep_length(void); extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time); extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time); +extern void tick_nohz_iowait_to_idle(int 
cpu); # else /* !CONFIG_NO_HZ_COMMON */ static inline int tick_nohz_tick_stopped(void) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 268a45e..ffea757 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4218,7 +4218,14 @@ void __sched io_schedule(void) current-in_iowait = 1; schedule(); current-in_iowait = 0; +#ifdef CONFIG_NO_HZ_COMMON + if (atomic_dec_and_test(rq-nr_iowait)) { + if (raw_smp_processor_id() != cpu_of(rq)) + tick_nohz_iowait_to_idle(cpu_of(rq)); + } +#else atomic_dec(rq-nr_iowait); +#endif delayacct_blkio_end(); } EXPORT_SYMBOL(io_schedule); @@ -4234,7 +4241,14 @@ long __sched io_schedule_timeout(long timeout) current-in_iowait = 1; ret = schedule_timeout(timeout); current-in_iowait = 0; +#ifdef CONFIG_NO_HZ_COMMON + if (atomic_dec_and_test(rq-nr_iowait)) { + if (raw_smp_processor_id() != cpu_of(rq)) + tick_nohz_iowait_to_idle(cpu_of(rq)); + } +#else atomic_dec(rq-nr_iowait); +#endif delayacct_blkio_end(); return ret; } diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 47ed7cf..d78c942 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -408,15 +408,27 @@ static void tick_nohz_update_jiffies(ktime_t now) static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now) { - ktime_t delta; + ktime_t delta, entry, end; /* Updates the per cpu time idle statistics counters */ write_seqcount_begin(ts-idle_sleeptime_seq); - delta = ktime_sub(now, ts-idle_entrytime); - if (ts-idle_active == 2) + entry = ts-idle_entrytime; + delta = ktime_sub(now, entry); + if (ts-idle_active == 2) { + end = ts-iowait_exittime; + if (end.tv64 != KTIME_MAX) { + /* + * Last iowaiting task on our rq was woken up on other CPU + * sometime in the past, it updated ts-iowait_exittime. 
+ */ + delta = ktime_sub(now, end); + ts-idle_sleeptime = ktime_add(ts-idle_sleeptime, delta); + delta = ktime_sub(end, entry); + } ts-iowait_sleeptime = ktime_add(ts-iowait_sleeptime, delta); - else + } else { ts-idle_sleeptime = ktime_add(ts-idle_sleeptime, delta); + } ts-idle_active = 0; write_seqcount_end(ts-idle_sleeptime_seq); @@ -430,6 +442,7 @@ static ktime_t
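The accounting rule described in the changelog can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names `idle_stats`, `account_idle_period` and `IOWAIT_EXIT_UNSET` are hypothetical, times are plain nanosecond integers rather than `ktime_t`, and the seqcount protection from the patch is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for KTIME_MAX: "no migrated wakeup seen yet". */
#define IOWAIT_EXIT_UNSET INT64_MAX

/* Hypothetical, simplified per-CPU counters (nanoseconds). */
struct idle_stats {
	int64_t idle_sleeptime;
	int64_t iowait_sleeptime;
};

/*
 * Account one idle period [entry, now).
 * had_iowaiters: the CPU went idle with IO-blocked tasks on its runqueue.
 * iowait_exit:   time at which the last such task was woken on another CPU,
 *                or IOWAIT_EXIT_UNSET if that never happened.
 */
void account_idle_period(struct idle_stats *st, int had_iowaiters,
			 int64_t entry, int64_t iowait_exit, int64_t now)
{
	assert(now >= entry);

	if (!had_iowaiters) {
		/* Plain idle: the whole period is idle time. */
		st->idle_sleeptime += now - entry;
		return;
	}
	if (iowait_exit == IOWAIT_EXIT_UNSET) {
		/* Still waiting for IO at exit: the whole period is iowait. */
		st->iowait_sleeptime += now - entry;
		return;
	}
	/* Split: iowait up to the migrated wakeup, true idle afterwards. */
	st->iowait_sleeptime += iowait_exit - entry;
	st->idle_sleeptime += now - iowait_exit;
}
```

For an idle period from 100 to 200 with the migrated wakeup at 130, this accounts 30 units as iowait and 70 as idle, which is the split the changelog describes.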
Re: [PATCH v2 00/10] arm64: UEFI support
On Tue, 29 Apr, at 02:47:28PM, Catalin Marinas wrote:
> Waiting for the tip/x86/efi to be merged first is not a problem. We also
> need a stable base for testing the arm64 UEFI series, so I assume this
> series can be based onto tip/x86/efi (would such branch be rebased
> before hitting mainline?).

tip/x86/efi is unlikely to be rebased. Certainly with dependencies like this there would have to be a really good reason to rebase it.

> Given that Leif's series contains both generic efi and arm64 patches,
> what's your preference for merging them? I'm happy to add my ack and
> they go via your tree (or the other way around).

I'm happy either way, though if I take them through my tree (and subsequently through tip) you won't have to worry about the merge window rigmarole, which is a plus.

So, everyone happy for me to take these with Catalin's Acked-by?

--
Matt Fleming, Intel Open Source Technology Center
Re: [PATCH] staging: line6: fix possible overrun
At Mon, 28 Apr 2014 01:44:25 +0300, Dan Carpenter wrote: On Sun, Apr 27, 2014 at 10:00:43PM +0200, Mateusz Guzik wrote: and a WARN_ON + -EINVAL in line6_init_audio to catch future offenders. Returning -EINVAL is a bad idea because it would break the driver completely and make it unusable. Well I would vote for returning the error anyway. I'm trying to be polite, but you are talking about adding regressions deliberately... It's very rare for people to deliberately add regressions to the kernel. I have only seen it one time before. I don't think Dan would be against returning -EINVAL if all the offender codes have been fixed (e.g. truncating strings to fit with the fixed arrays) at first. Then it'd be a good help to catch any future bugs. But, having -EINVAL without fixing the caller side means essentially that you're introducing the breakage intentionally although you know it certainly breaks, which is obviously bad. Takashi
Re: [PATCH v4] drivercore: deferral race condition fix
On Tue, Apr 29, 2014 at 2:13 PM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote: On Tue, Apr 29, 2014 at 01:35:09PM +0100, Grant Likely wrote: When the kernel is built with CONFIG_PREEMPT it is possible to reach a state when all modules loaded but some driver still stuck in the deferred list and there is a need for external event to kick the deferred queue to probe these drivers. [...] Hi Greg, This change needs to go into 3.15. I've got this patch in the devicetree/merge branch of my tree and can ask Linus to pull it directly if you would like. Sure, that would be fine: Acked-by: Greg Kroah-Hartman gre...@linuxfoundation.org Thanks Greg I'll give it a few days in linux-next and then ask Linus to pull. g.
[PATCH] ACPI / EC: Process rather than discard events in acpi_ec_clear
Address a regression caused by commit ad332c8a4533: (ACPI / EC: Clear stale EC events on Samsung systems) After the earlier patch, there was found to be a race condition on some earlier Samsung systems (N150/N210/N220). The function acpi_ec_clear was sometimes discarding a new EC event before its GPE was triggered by the system. In the case of these systems, this meant that the lid open event was not registered on resume if that was the cause of the wake, leading to problems when attempting to close the lid to suspend again. After testing on a number of Samsung systems, both those affected by the previous EC bug and those affected by the race condition, it seemed that the best course of action was to process rather than discard the events. On Samsung systems which accumulate stale EC events, there does not seem to be any adverse side-effects of running the associated _Q methods. This patch adds an argument to the static function acpi_ec_sync_query so that it may be used within the acpi_ec_clear loop in place of acpi_ec_query_unlocked which was used previously. With thanks to Stefan Biereigel for reporting the issue, and for all the people who helped test the new patch on affected systems. References: https://lkml.kernel.org/r/532fe3b2.9060...@biereigel-wb.de References: https://bugzilla.kernel.org/show_bug.cgi?id=44161#c173 Reported-by: Stefan Biereigel ste...@biereigel.de Signed-off-by: Kieran Clancy clancy.kie...@gmail.com Tested-by: Stefan Biereigel ste...@biereigel.de Tested-by: Dennis Jansen dennis.jan...@web.de Tested-by: Nicolas Porcel nicolasporce...@gmail.com Tested-by: Maurizio D'Addona mauritiusd...@gmail.com Tested-by: Juan Manuel Cabo juanmanuel.c...@gmail.com Tested-by: Giannis Koutsou giannis.kout...@gmail.com Tested-by: Kieran Clancy clancy.kie...@gmail.com Cc: Lan Tianyu tianyu@intel.com --- To maintainers: Assuming this patch is accepted, please mark this for inclusion in all -stable trees. 
It should be noted that the previous patch (ad332c8a4533) was excluded from a number of stable trees after the regression was found, but should now be included again along with this patch. I am not sure of the correct way to annotate this above. drivers/acpi/ec.c | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c index d7d32c2..ad11ba4 100644 --- a/drivers/acpi/ec.c +++ b/drivers/acpi/ec.c @@ -206,13 +206,13 @@ unlock: spin_unlock_irqrestore(ec-lock, flags); } -static int acpi_ec_sync_query(struct acpi_ec *ec); +static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data); static int ec_check_sci_sync(struct acpi_ec *ec, u8 state) { if (state ACPI_EC_FLAG_SCI) { if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, ec-flags)) - return acpi_ec_sync_query(ec); + return acpi_ec_sync_query(ec, NULL); } return 0; } @@ -443,10 +443,8 @@ acpi_handle ec_get_handle(void) EXPORT_SYMBOL(ec_get_handle); -static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data); - /* - * Clears stale _Q events that might have accumulated in the EC. + * Process _Q events that might have accumulated in the EC. * Run with locked ec mutex. 
*/ static void acpi_ec_clear(struct acpi_ec *ec) @@ -455,7 +453,7 @@ static void acpi_ec_clear(struct acpi_ec *ec) u8 value = 0; for (i = 0; i ACPI_EC_CLEAR_MAX; i++) { - status = acpi_ec_query_unlocked(ec, value); + status = acpi_ec_sync_query(ec, value); if (status || !value) break; } @@ -582,13 +580,18 @@ static void acpi_ec_run(void *cxt) kfree(handler); } -static int acpi_ec_sync_query(struct acpi_ec *ec) +static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data) { u8 value = 0; int status; struct acpi_ec_query_handler *handler, *copy; - if ((status = acpi_ec_query_unlocked(ec, value))) + + status = acpi_ec_query_unlocked(ec, value); + if (data) + *data = value; + if (status) return status; + list_for_each_entry(handler, ec-list, node) { if (value == handler-query_bit) { /* have custom handler for this bit */ @@ -612,7 +615,7 @@ static void acpi_ec_gpe_query(void *ec_cxt) if (!ec) return; mutex_lock(ec-mutex); - acpi_ec_sync_query(ec); + acpi_ec_sync_query(ec, NULL); mutex_unlock(ec-mutex); } -- 1.9.0 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
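The change from discarding to processing queued events can be illustrated with a small userspace sketch. Everything here is hypothetical scaffolding (`fake_ec`, `sync_query`, `drain_events`, and the bound `CLEAR_MAX`, which only mirrors the role of `ACPI_EC_CLEAR_MAX`); it is not the driver code, and "processing" is reduced to counting the events that would have had their `_Q` method run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bound on one drain pass; plays the role of ACPI_EC_CLEAR_MAX. */
#define CLEAR_MAX 100

/* Hypothetical event source: a queue of pending query codes. */
struct fake_ec {
	const uint8_t *pending;
	size_t npending;
	size_t next;
	unsigned handled;	/* events whose handler would have run */
};

/*
 * Fetch the next event code (0 means "queue empty"), report it through
 * *data when the caller asks for it, and process it instead of dropping it.
 */
int sync_query(struct fake_ec *ec, uint8_t *data)
{
	uint8_t value = 0;

	assert(ec != NULL);
	if (ec->next < ec->npending)
		value = ec->pending[ec->next++];
	if (data)
		*data = value;
	if (value)
		ec->handled++;	/* stand-in for running the _Q method */
	return 0;
}

/* Drain accumulated events, stopping on error, empty queue, or the bound. */
int drain_events(struct fake_ec *ec)
{
	uint8_t value = 0;
	int i;

	for (i = 0; i < CLEAR_MAX; i++) {
		int status = sync_query(ec, &value);

		if (status || !value)
			break;
	}
	return i;	/* number of events processed in this pass */
}
```

The optional `data` pointer is the design point taken from the patch: ordinary callers pass `NULL`, while the drain loop needs the value back to know when the queue is empty.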
Re: [RESEND PATCH V5 0/8] remove cpu_load idx
On Wed, Apr 16, 2014 at 03:43:21AM +0100, Alex Shi wrote: In the cpu_load decay usage, we mixed the long term, short term load with balance bias, randomly pick a big or small value according to balance destination or source. I disagree that it is random. min()/max() in {source,target}_load() provides a conservative bias to the load estimate that should prevent us from trying to pull tasks from the source cpu if its current load is just a temporary spike. Likewise, we don't try to pull tasks to the target cpu if the load is just a temporary drop. This mix is wrong, the balance bias should be based on task moving cost between cpu groups, not on random history or instant load. Your patch set actually changes everything to be based on the instant load alone. rq-cfs.runnable_load_avg is updated instantaneously when tasks are enqueued and deqeueue, so this load expression is quite volatile. What do you mean by task moving cost? History load maybe diverage a lot from real load, that lead to incorrect bias. like on busy_idx, We mix history load decay and bias together. The ridiculous thing is, when all cpu load are continuous stable, long/short term load is same. then we lose the bias meaning, so any minimum imbalance may cause unnecessary task moving. To prevent this funny thing happen, we have to reuse the imbalance_pct again in find_busiest_group(). But that clearly causes over bias in normal time. If there are some burst load in system, it is more worse. Isn't imbalance_pct only used once in the periodic load-balance path? It is not clear to me what the over bias problem is. If you have a stable situation, I would expect the long and short term load to be the same? As to idle_idx: Though I have some cencern of usage corretion, https://lkml.org/lkml/2014/3/12/247 but since we are working on cpu idle migration into scheduler. The problem will be reconsidered. We don't need to care it too much now. 
In fact, the cpu_load decays can be replaced by the sched_avg decay, that also decays load on time. The balance bias part can fullly use fixed bias -- imbalance_pct, which is already used in newly idle, wake, forkexec balancing and numa balancing scenarios. As I have said previously, I agree that cpu_load[] is somewhat broken in its current form, but I don't see how removing it and replacing it with the instantaneous cpu load solves the problems you point out. The current cpu_load[] averages the cpu_load over time, while rq-cfs.runnable_load_avg is the sum of the currently runnable tasks' load_avg_contrib. The former provides a long term view of the cpu_load, the latter does not. It can change radically in an instant. I'm therefore a bit concerned about the stability of the load-balance decisions. However, since most decisions are based on cpu_load[0] anyway, we could try setting LB_BIAS to false as Peter suggests and see what happens. Morten -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
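The conservative bias Morten describes can be sketched as follows. This is a simplified illustration, not the scheduler code: the function names are hypothetical, and each helper takes a history-averaged load and an instantaneous load directly instead of a CPU number and a `cpu_load[]` index.

```c
#include <assert.h>

/*
 * Low estimate for a CPU we might pull tasks from: a temporary spike in
 * the instantaneous load is ignored because the smaller value wins.
 */
unsigned long source_load_estimate(unsigned long history, unsigned long instant)
{
	return history < instant ? history : instant;
}

/*
 * High estimate for a CPU we might move tasks to: a temporary drop in
 * the instantaneous load is ignored because the larger value wins.
 */
unsigned long target_load_estimate(unsigned long history, unsigned long instant)
{
	return history > instant ? history : instant;
}

/* Migrate only if the source still looks busier under both biases. */
int worth_migrating(unsigned long src_hist, unsigned long src_now,
		    unsigned long dst_hist, unsigned long dst_now)
{
	assert(src_hist || src_now || dst_hist || dst_now);
	return source_load_estimate(src_hist, src_now) >
	       target_load_estimate(dst_hist, dst_now);
}
```

With a source whose history is 100 but whose instantaneous load spikes to 900, the source estimate stays at 100, so a target with load 200 is not considered a better place for the task. Using the instantaneous value alone would make the opposite decision, which is the stability concern raised in the reply.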
[PATCH 1/3] percpu_ida: introduce kobject for percpu_ida pool
So that we can export some allocation/free information for monitoring percpu_ida performance. Signed-off-by: Ming Lei tom.leim...@gmail.com --- include/linux/percpu_ida.h | 16 lib/percpu_ida.c | 21 ++--- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/include/linux/percpu_ida.h b/include/linux/percpu_ida.h index f5cfdd6..463e3b3 100644 --- a/include/linux/percpu_ida.h +++ b/include/linux/percpu_ida.h @@ -8,6 +8,7 @@ #include linux/spinlock_types.h #include linux/wait.h #include linux/cpumask.h +#include linux/kobject.h struct percpu_ida_cpu; @@ -52,6 +53,8 @@ struct percpu_ida { unsignednr_free; unsigned*freelist; } cacheline_aligned_in_smp; + + struct kobject kobj; }; /* @@ -79,4 +82,17 @@ int percpu_ida_for_each_free(struct percpu_ida *pool, percpu_ida_cb fn, void *data); unsigned percpu_ida_free_tags(struct percpu_ida *pool, int cpu); + +static inline int percpu_ida_kobject_add(struct percpu_ida *pool, + struct kobject *parent, const char *name) +{ + if (pool-kobj.state_initialized) + return kobject_add(pool-kobj, parent, name); + return 0; +} +static inline void percpu_ida_kobject_del(struct percpu_ida *pool) +{ + if (pool-kobj.state_in_sysfs) + kobject_del(pool-kobj); +} #endif /* __PERCPU_IDA_H__ */ diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c index 93d145e..56ae350 100644 --- a/lib/percpu_ida.c +++ b/lib/percpu_ida.c @@ -260,6 +260,20 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag) } EXPORT_SYMBOL_GPL(percpu_ida_free); +static void percpu_ida_release(struct kobject *kobj) +{ + struct percpu_ida *pool = container_of(kobj, + struct percpu_ida, kobj); + + free_percpu(pool-tag_cpu); + free_pages((unsigned long) pool-freelist, + get_order(pool-nr_tags * sizeof(unsigned))); +} + +static struct kobj_type percpu_ida_ktype = { + .release= percpu_ida_release, +}; + /** * percpu_ida_destroy - release a tag pool's resources * @pool: pool to free @@ -268,9 +282,8 @@ EXPORT_SYMBOL_GPL(percpu_ida_free); */ void percpu_ida_destroy(struct 
percpu_ida *pool) { - free_percpu(pool-tag_cpu); - free_pages((unsigned long) pool-freelist, - get_order(pool-nr_tags * sizeof(unsigned))); + if (pool-kobj.state_initialized) + kobject_put(pool-kobj); } EXPORT_SYMBOL_GPL(percpu_ida_destroy); @@ -324,6 +337,8 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags, for_each_possible_cpu(cpu) spin_lock_init(per_cpu_ptr(pool-tag_cpu, cpu)-lock); + kobject_init(pool-kobj, percpu_ida_ktype); + return 0; err: percpu_ida_destroy(pool); -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
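Moving the frees from the destroy function into a kobject release callback means the pool's memory goes away only when the last reference is dropped. A minimal userspace sketch of that pattern, with a hypothetical `mini_kobj` standing in for `struct kobject` and a locally defined `container_of`, might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local definition for the sketch; the kernel provides its own. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, much-reduced stand-in for struct kobject. */
struct mini_kobj {
	int refcount;
	void (*release)(struct mini_kobj *);
};

/* Hypothetical pool that embeds the refcounted object. */
struct tag_pool {
	unsigned *freelist;
	struct mini_kobj kobj;
};

int pools_released;	/* lets a caller observe when release ran */

/* Release callback: recover the enclosing pool and free its resources. */
void tag_pool_release(struct mini_kobj *kobj)
{
	struct tag_pool *pool = container_of(kobj, struct tag_pool, kobj);

	free(pool->freelist);
	pool->freelist = NULL;
	pools_released++;
}

/* Drop one reference; the release callback runs on the last put. */
void mini_kobj_put(struct mini_kobj *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}
```

If a sysfs reader still holds a reference when the owner calls the destroy path, the resources stay valid until that reader drops its reference too.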
[PATCH 2/3] percpu_ida: support exporting allocation/free info via sysfs
With this information, it is easy to monitor percpu_ida performance. Signed-off-by: Ming Lei tom.leim...@gmail.com --- include/linux/percpu_ida.h | 24 lib/Kconfig|7 +++ lib/percpu_ida.c | 130 +++- 3 files changed, 159 insertions(+), 2 deletions(-) diff --git a/include/linux/percpu_ida.h b/include/linux/percpu_ida.h index 463e3b3..be1036d 100644 --- a/include/linux/percpu_ida.h +++ b/include/linux/percpu_ida.h @@ -12,6 +12,27 @@ struct percpu_ida_cpu; +#ifdef CONFIG_PERCPU_IDA_STATS +struct percpu_ida_stats { + u64 alloc_tags; + u64 alloc_in_fastpath; + u64 alloc_from_global_pool; + u64 alloc_by_stealing; + u64 alloc_after_sched; + + u64 freed_tags; + u64 freed_empty; + u64 freed_full; +}; + +#define percpu_ida_inc(pool, ptr) \ +do { \ + __this_cpu_inc(pool-stats-ptr); \ +} while (0) +#else +#define percpu_ida_inc(pool, ptr) do {} while (0) +#endif + struct percpu_ida { /* * number of tags available to be allocated, as passed to @@ -55,6 +76,9 @@ struct percpu_ida { } cacheline_aligned_in_smp; struct kobject kobj; +#ifdef CONFIG_PERCPU_IDA_STATS + struct percpu_ida_stats __percpu *stats; +#endif }; /* diff --git a/lib/Kconfig b/lib/Kconfig index 325a8d4..d47a1cf 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -476,6 +476,13 @@ config OID_REGISTRY help Enable fast lookup object identifier registry. +config PERCPU_IDA_STATS + bool Export percpu_ida status by sysfs + default n + help + Export percpu_ida allocation/free information so + the performance can be monitored. 
+ config UCS2_STRING tristate diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c index 56ae350..6f6c68d 100644 --- a/lib/percpu_ida.c +++ b/lib/percpu_ida.c @@ -42,6 +42,105 @@ struct percpu_ida_cpu { unsignedfreelist[]; }; +#ifdef CONFIG_PERCPU_IDA_STATS +struct pcpu_ida_sysfs_entry { + struct attribute attr; + ssize_t (*show)(struct percpu_ida *, char *); +}; + +#define pcpu_ida_show(field, fmt) \ +static ssize_t field##_show(struct percpu_ida *pool, char *buf) \ +{ \ + u64 val = 0;\ + ssize_t rc; \ + unsigned cpu; \ + \ + for_each_possible_cpu(cpu) \ + val += per_cpu_ptr(pool-stats, cpu)-field;\ + \ + rc = sprintf(buf, fmt, val);\ + return rc; \ +} + +#define PERCPU_IDA_ATTR_RO(_name) \ + struct pcpu_ida_sysfs_entry pcpu_ida_attr_##_name = __ATTR_RO(_name) + +#define pcpu_ida_attr_ro(field, fmt) \ + pcpu_ida_show(field, fmt) \ + static PERCPU_IDA_ATTR_RO(field) + +pcpu_ida_attr_ro(alloc_tags, %lld\n); +pcpu_ida_attr_ro(alloc_in_fastpath, %lld\n); +pcpu_ida_attr_ro(alloc_from_global_pool, %lld\n); +pcpu_ida_attr_ro(alloc_by_stealing, %lld\n); +pcpu_ida_attr_ro(alloc_after_sched, %lld\n); +pcpu_ida_attr_ro(freed_tags, %lld\n); +pcpu_ida_attr_ro(freed_empty, %lld\n); +pcpu_ida_attr_ro(freed_full, %lld\n); + +ssize_t pcpu_ida_sysfs_max_size_show(struct percpu_ida *pool, char *page) +{ + ssize_t rc = sprintf(page, %u\n, pool-percpu_max_size); + return rc; +} + +static struct pcpu_ida_sysfs_entry pcpu_ida_attr_max_size = { + .attr = {.name = percpu_max_size, .mode = S_IRUGO}, + .show = pcpu_ida_sysfs_max_size_show, +}; + +ssize_t pcpu_ida_sysfs_batch_size_show(struct percpu_ida *pool, char *page) +{ + ssize_t rc = sprintf(page, %u\n, pool-percpu_batch_size); + return rc; +} + +static struct pcpu_ida_sysfs_entry pcpu_ida_attr_batch_size = { + .attr = {.name = percpu_batch_size, .mode = S_IRUGO}, + .show = pcpu_ida_sysfs_batch_size_show, +}; + +static ssize_t percpu_ida_sysfs_show(struct kobject *kobj, + struct attribute *attr, char *page) +{ + struct pcpu_ida_sysfs_entry 
*entry; + struct percpu_ida *pool; + ssize_t res = -EIO; + + entry = container_of(attr, struct pcpu_ida_sysfs_entry, attr); + pool = container_of(kobj, struct percpu_ida, kobj); + + if (!entry-show) + return res; + res = entry-show(pool, page); + return res; +} + +static struct attribute *percpu_ida_def_attrs[] = { +
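The macro-generated show functions in this patch all follow one pattern: sum a per-CPU counter over every CPU, then format the total. A userspace sketch of that pattern is shown below. The names `ida_stats` and `DEFINE_STAT_SHOW` are hypothetical, a plain array stands in for the kernel's per-CPU storage, and only two of the patch's counters are included.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-CPU counters; only two of the patch's fields are shown. */
struct ida_stats {
	uint64_t alloc_tags;
	uint64_t freed_tags;
};

/*
 * Generate one "show" helper per field: sum the field over all CPUs and
 * format the total. Returns the number of characters written.
 */
#define DEFINE_STAT_SHOW(field)						\
int show_##field(const struct ida_stats *percpu, unsigned ncpus,	\
		 char *buf, size_t len)					\
{									\
	uint64_t val = 0;						\
	unsigned cpu;							\
									\
	assert(percpu && buf);						\
	for (cpu = 0; cpu < ncpus; cpu++)				\
		val += percpu[cpu].field;				\
	/* %llu matches an unsigned 64-bit total */			\
	return snprintf(buf, len, "%llu\n", (unsigned long long)val);	\
}

DEFINE_STAT_SHOW(alloc_tags)
DEFINE_STAT_SHOW(freed_tags)
```

Because each CPU increments only its own slot, the hot path stays free of shared-cacheline writes; the cost of combining the slots is paid by the (rare) reader.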
[PATCH 0/3] percpu_ida: support to export allocation/free information
Hi,

These patches export percpu_ida allocation/free information via sysfs so that percpu_ida performance can be monitored. There are at least two use cases:

- performance is very sensitive to some parameters chosen by percpu_ida users (such as percpu_max_size)
- the data is helpful for verifying patches which try to improve percpu_ida

Thanks,
--
Ming Lei
Re: [PATCH v4 4/7] arm64: Add a description on 48-bit address space with 4KB pages
On Tue, Apr 29, 2014 at 05:59:27AM +0100, Jungseok Lee wrote: --- a/Documentation/arm64/memory.txt +++ b/Documentation/arm64/memory.txt @@ -8,10 +8,11 @@ This document describes the virtual memory layout used by the AArch64 Linux kernel. The architecture allows up to 4 levels of translation tables with a 4KB page size and up to 3 levels with a 64KB page size. -AArch64 Linux uses 3 levels of translation tables with the 4KB page -configuration, allowing 39-bit (512GB) virtual addresses for both user -and kernel. With 64KB pages, only 2 levels of translation tables are -used but the memory layout is the same. +AArch64 Linux uses 3 levels and 4 levels of translation tables with +the 4KB page configuration, allowing 39-bit (512GB) and 48-bit (256TB) +virtual addresses, respectively, for both user and kernel. With 64KB +pages, only 2 levels of translation tables are used but the memory layout +is the same. Any reason why we couldn't use 48-bit address space with 64K pages (implying 3 levels)? -AArch64 Linux memory layout with 64KB pages: +AArch64 Linux memory layout with 4KB pages + 4 levels: + +StartEnd SizeUse +--- + 256TB user + + 7bfe~124TB vmalloc BTW, maybe as a separate patch we should change the end to be exclusive. It becomes harder to modify (I've been through this a few times already ;)) and even follow the changes. -- Catalin -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
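The address-space sizes quoted in the documentation hunk follow from the page size and the number of table levels. A small sketch of the arithmetic, assuming 8-byte table descriptors (so a table that fills one page resolves `page_shift - 3` bits per level); the function names are made up for this illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Virtual address bits covered by `levels` translation levels when each
 * table fills exactly one page of 8-byte descriptors.
 */
unsigned va_bits(unsigned page_shift, unsigned levels)
{
	assert(page_shift > 3);
	return page_shift + levels * (page_shift - 3);
}

/* Size in bytes of an address space with the given number of bits. */
uint64_t va_size_bytes(unsigned bits)
{
	assert(bits < 64);
	return (uint64_t)1 << bits;
}
```

With 4KB pages (`page_shift` 12) this gives 39 bits (512GB) for 3 levels and 48 bits (256TB) for 4 levels, matching the text of the patch. With 64KB pages (`page_shift` 16), 2 levels give 42 bits, and 3 full levels could index 55 bits, which is more than 48, so a 48-bit configuration with 64KB pages would presumably use only part of the top-level table.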
Re: [PATCH v2 00/10] arm64: UEFI support
On 04/29/2014 07:47 AM, Matt Fleming wrote:
> On Tue, 29 Apr, at 02:47:28PM, Catalin Marinas wrote:
>> Waiting for the tip/x86/efi to be merged first is not a problem. We also
>> need a stable base for testing the arm64 UEFI series, so I assume this
>> series can be based onto tip/x86/efi (would such branch be rebased
>> before hitting mainline?).
>
> tip/x86/efi is unlikely to be rebased. Certainly with dependencies like
> this there would have to be a really good reason to rebase it.
>
>> Given that Leif's series contains both generic efi and arm64 patches,
>> what's your preference for merging them? I'm happy to add my ack and
>> they go via your tree (or the other way around).
>
> I'm happy either way, though if I take them through my tree (and
> subsequently through tip) you won't have to worry about the merge window
> rigmarole, which is a plus.
>
> So, everyone happy for me to take these with Catalin's Acked-by?

I'm wondering if it would be better to organize it into a separate topic branch. We can still take it through tip, if you want, but it would be better than putting it all into one tree.

	-hpa
[PATCH] bio: modify __bio_add_page() to accept pages that don't start a new segment
The original behaviour is to refuse to add a new page if the maximum number of segments has been reached, regardless of the fact the page we are going to add can be merged into the last segment or not. Unfortunately, when the system runs under heavy memory fragmentation conditions, a driver may try to add multiple pages to the last segment. The original code won't accept them and EBUSY will be reported to userspace. This patch modifies the function so it refuses to add a page only in case the latter starts a new segment and the maximum number of segments has already been reached. The bug can be easily reproduced with the st driver: 1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE to 16 2) modprobe st buffer_kbs=1024 3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10 dd: error writing ‘/dev/st0’: Device or resource busy Signed-off-by: Maurizio Lombardi mlomb...@redhat.com --- fs/bio.c | 50 -- 1 file changed, 28 insertions(+), 22 deletions(-) diff --git a/fs/bio.c b/fs/bio.c index 6f0362b..9a3a0b1 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -750,29 +750,31 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page return 0; /* -* we might lose a segment or two here, but rather that than -* make this too complex. +* setup the new entry, we might clear it again later if we +* cannot add the page +*/ + bvec = bio-bi_io_vec[bio-bi_vcnt]; + bvec-bv_page = page; + bvec-bv_len = len; + bvec-bv_offset = offset; + bio-bi_vcnt++; + bio-bi_phys_segments++; + + /* +* Perform a recount if the number of segments is greater +* than queue_max_segments(q). 
*/ - while (bio-bi_phys_segments = queue_max_segments(q)) { + while (bio-bi_phys_segments queue_max_segments(q)) { if (retried_segments) - return 0; + goto failed; retried_segments = 1; blk_recount_segments(q, bio); } /* -* setup the new entry, we might clear it again later if we -* cannot add the page -*/ - bvec = bio-bi_io_vec[bio-bi_vcnt]; - bvec-bv_page = page; - bvec-bv_len = len; - bvec-bv_offset = offset; - - /* * if queue has other restrictions (eg varying max sector size * depending on offset), it can specify a merge_bvec_fn in the * queue to get further control @@ -789,23 +791,27 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page * merge_bvec_fn() returns number of bytes it can accept * at this offset */ - if (q-merge_bvec_fn(q, bvm, bvec) bvec-bv_len) { - bvec-bv_page = NULL; - bvec-bv_len = 0; - bvec-bv_offset = 0; - return 0; - } + if (q-merge_bvec_fn(q, bvm, bvec) bvec-bv_len) + goto failed; } /* If we may be able to merge these biovecs, force a recount */ - if (bio-bi_vcnt (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec))) + if (bio-bi_vcnt 1 (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec))) bio-bi_flags = ~(1 BIO_SEG_VALID); - bio-bi_vcnt++; - bio-bi_phys_segments++; done: bio-bi_iter.bi_size += len; return len; + + failed: + bvec-bv_page = NULL; + bvec-bv_len = 0; + bvec-bv_offset = 0; + bio-bi_vcnt--; + if (!retried_segments) + bio-bi_phys_segments--; + + return 0; } /** -- Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
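The behaviour the patch aims for (refuse a page only when it would start a new segment and the segment limit is already reached) can be shown with a deliberately simplified sketch. Note the swap in technique: the patch adds the entry tentatively, recounts segments and rolls back on failure, whereas this sketch checks contiguity with the last segment directly. All names here are hypothetical.

```c
#include <assert.h>

#define SKETCH_SEG_SLOTS 16	/* storage available in this sketch */

struct seg {
	unsigned long start;
	unsigned long len;
};

struct seg_list {
	struct seg segs[SKETCH_SEG_SLOTS];
	unsigned nsegs;
	unsigned max_segs;	/* limit imposed by the (hypothetical) device */
};

/*
 * Add [addr, addr + len) to the list. Returns len on success, 0 if the
 * chunk would need a new segment and the limit is already reached.
 */
unsigned long add_chunk(struct seg_list *l, unsigned long addr,
			unsigned long len)
{
	assert(l->max_segs <= SKETCH_SEG_SLOTS);

	if (l->nsegs > 0) {
		struct seg *last = &l->segs[l->nsegs - 1];

		/* Contiguous with the last segment: merge, no new slot used. */
		if (last->start + last->len == addr) {
			last->len += len;
			return len;
		}
	}

	/* A new segment is needed; only now does the limit matter. */
	if (l->nsegs >= l->max_segs)
		return 0;

	l->segs[l->nsegs].start = addr;
	l->segs[l->nsegs].len = len;
	l->nsegs++;
	return len;
}
```

The ordering of the two checks is the whole point: testing the limit before testing mergeability reproduces the original behaviour, where a mergeable page was rejected just because the list was full.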
[PATCH 3/3] blk-mq: add percpu_ida kobjects
So that the percpu_ida performance can be monitored.

Signed-off-by: Ming Lei tom.leim...@gmail.com
---
 block/blk-mq-sysfs.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 8145b5b..4171ae2 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -329,6 +329,8 @@ void blk_mq_unregister_disk(struct gendisk *disk)
 			kobject_del(&ctx->kobj);
 			kobject_put(&ctx->kobj);
 		}
+		percpu_ida_kobject_del(&hctx->tags->free_tags);
+		percpu_ida_kobject_del(&hctx->tags->reserved_tags);
 		kobject_del(&hctx->kobj);
 		kobject_put(&hctx->kobj);
 	}
@@ -362,6 +364,11 @@ int blk_mq_register_disk(struct gendisk *disk)
 		if (ret)
 			break;
 
+		percpu_ida_kobject_add(&hctx->tags->free_tags,
+				&hctx->kobj, "free_tags");
+		percpu_ida_kobject_add(&hctx->tags->reserved_tags,
+				&hctx->kobj, "reserved_tags");
+
 		if (!hctx->nr_ctx)
 			continue;
 
--
1.7.9.5
Re: [PATCH] staging: line6: fix possible overrun
Yeah. If this were a brand new driver then returning -EINVAL would be a good idea. Smatch actually warns about this code as well if you turn on the --spammy option. But there are too many of these kinds of warnings and even I can't check them all so the warning is basically useless. In a few months I will have improved the Smatch code to know that the source string is too large so this bug could have been avoided. regards, dan carpenter
Re: [RESEND PATCH V5 0/8] remove cpu_load idx
On Thu, Apr 24, 2014 at 05:20:29PM +0100, Peter Zijlstra wrote:
> OK, this series is a lot saner, with the exception of 3/8 and
> dependents.
>
> I do still worry a bit for losing the longer term view for the big
> domains though. Sadly I don't have any really big machines.
>
> I think the entire series is equivalent to setting LB_BIAS to false. So
> I suppose we could do that for a while and if nobody reports horrible
> things we could just do this.
>
> Anybody?

I can't say what will happen on big machines, but I think the LB_BIAS test could be a way to see what happens. I'm not convinced that it won't lead to more task migrations since we will use the instantaneous cpu load (weighted_cpuload()) unfiltered.

Morten
Re: [PATCH] ARM: l2c: prima2: only call l2x0_of_init() on matching nodes
2014-04-28 22:52 GMT+08:00 Russell King - ARM Linux li...@arm.linux.org.uk: On Mon, Apr 28, 2014 at 10:37:09AM -0400, Matt Porter wrote: The fix is tested against bcm281xx and bcm21664 as that is what the l2c cleanup breaks in -next. As mentioned, I don't have the sirfsoc h/w so this first attempt at a fix also breaks their platform. It can be addressed by adding those platform specific compatibles back to the dts, of course. I'd much prefer that the sirfsoc folks fix this...it's going to break other platforms in a multi v7 build. Well, it's about time we got rid of this from platform specific code anyway, taking it away from platform maintainers to mess around with. So that's what I'm doing. It's worth noting that if you build a single zImage with exynos also enabled, then you also end up with an unconditional call from that code to l2x0_of_init() with it's own magic numbers - and that applies before my changes. So let's fix this properly and yank this crap from platform maintainers fingers. i mentioned dropping specific dts compatible prop will break non-csr platforms in the mail thread ARM: prima2: remove L2 cache size override and i said i was going to send v2. you said you need it before rc6. now it has been sent, but i am sorry it is not against next-20140424. -- FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly improving, and getting towards what was expected from it. -barry
Re: [PATCH v2] rwsem: Support optimistic spinning
On Mon, Apr 28, 2014 at 05:50:49PM -0700, Tim Chen wrote:
> On Mon, 2014-04-28 at 16:10 -0700, Paul E. McKenney wrote:
> > > +#ifdef CONFIG_SMP
> > > +static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > > +{
> > > +	int retval;
> > > +	struct task_struct *owner;
> > > +
> > > +	rcu_read_lock();
> > > +	owner = ACCESS_ONCE(sem->owner);
> >
> > OK, I'll bite... Why ACCESS_ONCE() instead of rcu_dereference()?
>
> We're using it as a speculative check on the sem->owner to see
> if the owner is running on the cpu. The rcu_read_lock
> is used for ensuring that the owner->on_cpu memory is
> still valid.

OK, so if we read complete garbage, all that happens is that we lose a bit of performance? If so, I am OK with it as long as there is a comment (which Davidlohr suggested later in this thread).

Thanx, Paul

> > (My first question was where is the update side, but this is covered
> > by task_struct allocation and deallocation.)
>
> Tim
Re: [PATCH] ARM: l2c: prima2: only call l2x0_of_init() on matching nodes
2014-04-28 21:40 GMT+08:00 Matt Porter <mpor...@linaro.org>:
> On Mon, Apr 28, 2014 at 10:15:33AM +0100, Russell King wrote:
> > On Sun, Apr 27, 2014 at 08:27:40PM -0400, Matt Porter wrote:
> > > l2x0_of_init() is executed unconditionally within the
> > > sirfsoc_l2x0_init() early initcall. In a multi v7 kernel this causes
> > > bcm281xx and bcm21664 platform to fail boot since they have their
> > > own pre l2x0 init sequence that is required. Fix this by checking
> > > that a matching OF ID is present before calling l2x0_of_init().
> > >
> > > Reported-by: Kevin Hilman <khil...@linaro.org>
> > > Signed-off-by: Matt Porter <mpor...@linaro.org>
> > > ---
> > > Applies against next-20140424 to fix the issue introduced in
> > > 50655e6 ARM: l2c: prima2: remove cache size override
> >
> > Err, this only fixes it because it effectively disables the L2 cache
> > _entirely_ - in the above commit, I kill your private compatible
> > strings.
> >
> > This doesn't make sense. If the cache is already enabled, then the
> > L2C code won't try to enable it again.
>
> Ok, please suggest an alternative. You merged this commit..it looks
> like it had no ack from the platform maintainer..and I don't have
> hardware. The commit is wrong, we can't have every platform executing
> sirfsoc's l2x0_of_init() call/parameters by having this stuff in an
> early initcall like that. It would be pretty straightforward to add
> those private compatibles back so the approach works. If not, we need
> to move this to .init_machine where it's guaranteed to only run on
> sirfsoc.

there has been one V1 patch at
http://permalink.gmane.org/gmane.linux.ports.arm.kernel/316312
my v2 has moved to init_irq() as Russell's suggestion.

> -Matt

-barry
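The "check for a matching OF ID before calling the init" idea from Matt's patch description can be sketched in plain C. Everything here is hypothetical: the compatible strings, `node_matches()`, `platform_l2_init()` and the `l2_init_calls` counter are illustrative stand-ins, not the sirfsoc code or the kernel's OF API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compatible strings owned by one platform. */
static const char *const platform_ids[] = {
	"vendor,soc-a-l2-cache",
	"vendor,soc-b-l2-cache",
};

/* Counts how often the stand-in platform init actually ran. */
static int l2_init_calls;

/* Return true if the node's compatible list names this platform. */
static bool node_matches(const char *const *compat, size_t n)
{
	size_t n_ids = sizeof(platform_ids) / sizeof(platform_ids[0]);

	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n_ids; j++)
			if (strcmp(compat[i], platform_ids[j]) == 0)
				return true;
	return false;
}

/* Early init hook: do nothing unless the node belongs to this platform. */
static int platform_l2_init(const char *const *compat, size_t n)
{
	if (!node_matches(compat, n))
		return 0;
	l2_init_calls++;	/* stands in for the real l2x0_of_init() call */
	return 1;
}
```

This also shows the objection raised in the thread: once the private compatible strings are gone from the device tree, a guard like this never matches, so the init is skipped on the platform itself as well.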
Re: [PATCH v2 0/7] Generic serial earlycon
On Tue, Apr 29, 2014 at 6:09 AM, Catalin Marinas <catalin.mari...@arm.com> wrote:
> On Fri, Apr 18, 2014 at 11:19:53PM +0100, Rob Herring wrote:
> > Rob Herring (7):
> >   x86: move FIX_EARLYCON_MEM kconfig into x86
> >   tty/serial: add generic serial earlycon
> >   tty/serial: convert 8250 to generic earlycon
> >   tty/serial: pl011: add generic earlycon support
> >   tty/serial: add arm/arm64 semihosting earlycon
> >   arm64: enable FIX_EARLYCON_MEM kconfig
> >   arm64: remove arch specific earlyprintk
>
> The series looks fine, you can add:
>
> Acked-by: Catalin Marinas <catalin.mari...@arm.com>

Thanks.

> BTW, are you merging all of them via some other tree or would prefer me
> to take the arm64-specific patches?

Greg has taken it, but there were a few issues, so it may get reposted.

Rob
RE: OFD (file private) locks and NFS
> On Tue, 29 Apr 2014 07:40:08 -0400 (EDT) Matt W. Benjamin <m...@linuxbox.com> wrote:
> > Hi Jeff,
> >
> > Something which came up on the last Ganesha conn call is that we have
> > a pretty strong need for some ability to wait on a set of locks, and
> > perhaps receive events. Frank Filz believed that you had made a
> > proposal which would cover this. Can you elaborate on that?
> >
> > Thanks,
> >
> > Matt
>
> No, there's no mechanism to wait on a set of locks from within the
> context of a single thread of execution or to receive events. Again,
> that would be a new API beyond what I've been proposing over the last
> several months.

Some kind of facility to enable one user space thread to wait on
multiple blocked locks would definitely be helpful to user space
servers.

Our current plan is to have a pool of threads, and dispatch blocking
locks to them. If that pool is exhausted, all further locks would be
dispatched to a single thread that would poll for locks.

Frank
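The dispatch plan Frank outlines (a bounded pool of blocking threads, with a single polling thread as overflow) can be sketched as a routing decision. This is a hypothetical illustration, not Ganesha code: `struct lock_pool`, `dispatch_blocking_lock()` and `release_pool_thread()` are invented names, and no real threads or locks are created.

```c
/* Where a blocking lock request is sent. */
enum lock_route {
	ROUTE_POOL_THREAD,	/* a dedicated thread blocks on the lock */
	ROUTE_POLLER,		/* the single fallback thread polls for it */
};

/* Hypothetical bookkeeping for the pool of blocking threads. */
struct lock_pool {
	unsigned int size;	/* number of threads in the pool */
	unsigned int busy;	/* threads currently blocked on a lock */
};

/* Use a pool thread while one is free, otherwise fall back to polling. */
static enum lock_route dispatch_blocking_lock(struct lock_pool *pool)
{
	if (pool->busy < pool->size) {
		pool->busy++;
		return ROUTE_POOL_THREAD;
	}
	return ROUTE_POLLER;
}

/* Called when a pool thread has acquired (or given up on) its lock. */
static void release_pool_thread(struct lock_pool *pool)
{
	if (pool->busy > 0)
		pool->busy--;
}
```

The trade-off is that requests routed to the poller are serviced with polling latency, which is the cost an interface for waiting on several locks at once would avoid.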
Re: [PATCH 1/2] nohz: move NOHZ code bits out of io_schedule{,_timeout} into a helper
On Fri, Apr 25, 2014 at 08:57:29PM +0200, Denys Vlasenko wrote:
> Signed-off-by: Denys Vlasenko <dvlas...@redhat.com>
> Cc: Frederic Weisbecker <fweis...@gmail.com>
> Cc: Hidetoshi Seto <seto.hideto...@jp.fujitsu.com>
> Cc: Fernando Luis Vazquez Cao <fernando...@lab.ntt.co.jp>
> Cc: Tetsuo Handa <penguin-ker...@i-love.sakura.ne.jp>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Ingo Molnar <mi...@kernel.org>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: Arjan van de Ven <ar...@linux.intel.com>
> Cc: Oleg Nesterov <o...@redhat.com>
> ---
>  kernel/sched/core.c | 33 +
>  1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ffea757..3137980 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4208,6 +4208,21 @@ EXPORT_SYMBOL_GPL(yield_to);
>   * This task is about to go to sleep on IO. Increment rq->nr_iowait so
>   * that process accounting knows that this is a task in IO wait state.
>   */
> +#ifdef CONFIG_NO_HZ_COMMON
> +static __sched void io_wait_end(struct rq *rq)
> +{
> +	if (atomic_dec_and_test(&rq->nr_iowait)) {
> +		if (raw_smp_processor_id() != cpu_of(rq))
> +			tick_nohz_iowait_to_idle(cpu_of(rq));
> +	}
> +}
> +#else
> +static inline void io_wait_end(struct rq *rq)
> +{
> +	atomic_dec(&rq->nr_iowait);
> +}
> +#endif
> +
>  void __sched io_schedule(void)
>  {
>  	struct rq *rq = raw_rq();
> @@ -4218,14 +4233,7 @@ void __sched io_schedule(void)
>  	current->in_iowait = 1;
>  	schedule();
>  	current->in_iowait = 0;
> -#ifdef CONFIG_NO_HZ_COMMON
> -	if (atomic_dec_and_test(&rq->nr_iowait)) {
> -		if (raw_smp_processor_id() != cpu_of(rq))
> -			tick_nohz_iowait_to_idle(cpu_of(rq));
> -	}
> -#else
> -	atomic_dec(&rq->nr_iowait);
> -#endif
> +	io_wait_end(rq);
>  	delayacct_blkio_end();

There is much more to unify than the iowait accounting between all the
io_schedule() declensions. Peterz I think you had a patch to unify that
a few months ago?
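The helper factored out by the patch hinges on a decrement-and-test: only the task that drops the counter to zero triggers the remote-CPU notification. A minimal userspace sketch with C11 atomics is given below. The `struct rq` here is a hypothetical stand-in, `this_cpu` is passed in rather than queried, and the `idle_kicks` counter replaces the call to tick_nohz_iowait_to_idle(), which exists only in the patch series under discussion.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified run queue. */
struct rq {
	atomic_int nr_iowait;	/* tasks of this CPU currently in IO wait */
	int cpu;		/* CPU that owns this run queue */
	int idle_kicks;		/* stand-in for tick_nohz_iowait_to_idle() */
};

/* Userspace equivalent of atomic_dec_and_test(): true if we hit zero. */
static bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Drop one IO waiter; notify the owning CPU if it was the last one. */
static void io_wait_end(struct rq *rq, int this_cpu)
{
	if (dec_and_test(&rq->nr_iowait)) {
		if (this_cpu != rq->cpu)
			rq->idle_kicks++;
	}
}
```

Having one helper means every io_schedule() variant gets the same accounting, which is the point of the refactoring.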
Re: [PATCH] ARM: l2c: prima2: only call l2x0_of_init() on matching nodes
On Tue, Apr 29, 2014 at 11:05:06PM +0800, Barry Song wrote:
> 2014-04-28 22:52 GMT+08:00 Russell King - ARM Linux <li...@arm.linux.org.uk>:
> > On Mon, Apr 28, 2014 at 10:37:09AM -0400, Matt Porter wrote:
> > > The fix is tested against bcm281xx and bcm21664 as that is what the
> > > l2c cleanup breaks in -next. As mentioned, I don't have the sirfsoc
> > > h/w so this first attempt at a fix also breaks their platform. It
> > > can be addressed by adding those platform specific compatibles back
> > > to the dts, of course. I'd much prefer that the sirfsoc folks fix
> > > this...it's going to break other platforms in a multi v7 build.
> >
> > Well, it's about time we got rid of this from platform specific code
> > anyway, taking it away from platform maintainers to mess around with.
> > So that's what I'm doing.
> >
> > It's worth noting that if you build a single zImage with exynos also
> > enabled, then you also end up with an unconditional call from that
> > code to l2x0_of_init() with it's own magic numbers - and that applies
> > before my changes.
> >
> > So let's fix this properly and yank this crap from platform
> > maintainers fingers.
>
> i mentioned dropping specific dts compatible prop will break non-csr
> platforms in the mail thread "ARM: prima2: remove L2 cache size
> override" and i said i was going to send v2. you said you need it
> before rc6. now it has been sent, but i am sorry it is not against
> next-20140424.

FFS. IT HASN'T BEEN SENT. All that I did was drop it into linux-next
so that more people would get off their fat backsides and test this
fscking patch set - something which hasn't happened because no one pays
attention to emails sent to mailing lists.

I also told you that this was what I was going to do. But... is it
really on to hold up such a large patch set which impacts virtually
everyone because _you_ don't have time to sort out your small special
requirements - no it is not, that's just fscking selfish.

Anyway, I've had it with dealing with platform maintainers, I've yanked
this patch set, and I'm no longer planning to do anything with it -
platform maintainers have destroyed my will to get any of this series
into the kernel.

So, the L2 cache code is going to remain in its current state, and it's
going to rot because it's _FAR_ too much effort dealing with slow people
like yourselves, or people who want the series split up, or people who
whinge that there aren't any acks there (WELL GET OFF YOUR FAT BACKSIDES
AND SEND ME SOME IF YOU CARE ABOUT THIS - no, don't, I'm no longer
pushing this series.)

This is the last time I'm going to ever try cleaning up any core ARM
code. Core ARM maintenance is impossible in this environment with
arm-soc split from core ARM stuff, because core ARM stuff /always/
impacts on SoC specific code. You can't get away from that.

My position in this community has been made impossible and obsolete by
Linaro. I'm at the point of walking away from this crap.

--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
Re: [PATCH v2 3/7] tty/serial: convert 8250 to generic earlycon
On Mon, Apr 28, 2014 at 9:56 PM, Yinghai Lu <ying...@kernel.org> wrote:
> On Mon, Apr 28, 2014 at 4:24 PM, Rob Herring <robherri...@gmail.com> wrote:
> > On Sat, Apr 26, 2014 at 1:29 AM, Yinghai Lu <ying...@kernel.org> wrote:
> >
> > Thanks for finding these. I missed them in my build tests. This should
> > fix them:
> >
> > diff --git a/drivers/tty/serial/8250/8250_early.c b/drivers/tty/serial/8250/8250_early.c
> > index e83c9db..2094c3b 100644
> > --- a/drivers/tty/serial/8250/8250_early.c
> > +++ b/drivers/tty/serial/8250/8250_early.c
> > @@ -156,6 +156,11 @@ static int __init early_serial8250_setup(struct earlycon_device *device,
> >  EARLYCON_DECLARE(uart8250, early_serial8250_setup);
> >  EARLYCON_DECLARE(uart, early_serial8250_setup);
> >
> > +int __init setup_early_serial8250_console(char *cmdline)
> > +{
> > +	return setup_earlycon(cmdline, "uart8250", early_serial8250_setup);
> > +}
> > +
> >  int serial8250_find_port_for_earlycon(void)
> >  {
> >  	struct earlycon_device *device = early_device;
>
> that only handle "uart8250,", may need to add more lines to handle "uart,"

That is on purpose because the only 2 users use uart8250. I consider
this a legacy interface and use of uart is horrible because there are
lots of uarts which are not 8250.

Rob

> +int __init setup_early_serial8250_console(char *cmdline)
> +{
> +	char *options;
> +	options = strstr(cmdline, "uart8250,");
> +	if (options)
> +		return setup_earlycon(cmdline, "uart8250", early_serial8250_setup);
> +
> +	options = strstr(cmdline, "uart,");
> +	if (options)
> +		return setup_earlycon(cmdline, "uart", early_serial8250_setup);
> +
> +	return 0;
> +}
> +
>
> Thanks
>
> Yinghai
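The two-prefix matching in Yinghai's proposal can be tried out in userspace with a small classifier. This is a sketch under assumptions: `classify_early_serial()` and the enum are hypothetical names, and the function only reports which prefix matched instead of calling setup_earlycon().

```c
#include <string.h>

/* Which legacy early-serial spelling, if any, appears in the string. */
enum earlycon_kind {
	EARLYCON_NONE,
	EARLYCON_UART8250,
	EARLYCON_UART,
};

/*
 * Check the more specific "uart8250," first, then the generic "uart,".
 * strstr() matches anywhere in the string, so an option that merely
 * ends in "uart," would also match; a real parser may want to anchor
 * the comparison at the start of the option.
 */
static enum earlycon_kind classify_early_serial(const char *cmdline)
{
	if (strstr(cmdline, "uart8250,"))
		return EARLYCON_UART8250;
	if (strstr(cmdline, "uart,"))
		return EARLYCON_UART;
	return EARLYCON_NONE;
}
```

Note that "uart8250," does not contain "uart," as a substring, so the order of the two checks affects readability more than correctness.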
[3.15-rc3] rtmutex-debug assertion.
Just hit this while fuzzing the futex() syscall.

WARNING: CPU: 2 PID: 6202 at kernel/locking/rtmutex-debug.c:151 debug_rt_mutex_proxy_unlock+0x4e/0x60()
DEBUG_LOCKS_WARN_ON(!rt_mutex_owner(lock))
Modules linked in: tun fuse ipt_ULOG nfnetlink bnep can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 ieee802154 phonet af_rxrpc can_raw can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 cfg80211 coretemp hwmon x86_pkg_temp_thermal kvm_intel kvm xfs libcrc32c btusb bluetooth snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e crct10dif_pclmul crc32c_intel ghash_clmulni_intel snd_timer snd microcode serio_raw pcspkr usb_debug 6lowpan_iphc rfkill shpchp ptp pps_core soundcore
CPU: 2 PID: 6202 Comm: trinity-c63 Not tainted 3.15.0-rc3+ #201
 0009 de725d52 880099befbd8 92746dad
 880099befc20 880099befc10 9206d46d 88020951c010
 88009d718000 88009d718000 c90011408680 c90011408688
Call Trace:
 [92746dad] dump_stack+0x4e/0x7a
 [9206d46d] warn_slowpath_common+0x7d/0xa0
 [9206d4ec] warn_slowpath_fmt+0x5c/0x80
 [920c533e] debug_rt_mutex_proxy_unlock+0x4e/0x60
 [920c4d77] rt_mutex_proxy_unlock+0x17/0x40
 [920ead7a] free_pi_state+0x6a/0xb0
 [920eade0] unqueue_me_pi+0x20/0x40
 [920ebfc2] futex_lock_pi.isra.18+0x262/0x3f0
 [92096910] ? hrtimer_get_res+0x50/0x50
 [920edb2c] do_futex+0x2ec/0xb60
 [92349897] ? debug_smp_processor_id+0x17/0x20
 [920bf3ee] ? put_lock_stats.isra.23+0xe/0x30
 [920bf756] ? lock_release_holdtime.part.24+0xe6/0x160
 [920a3cdd] ? get_parent_ip+0xd/0x50
 [9275698b] ? preempt_count_sub+0x6b/0xf0
 [92751f51] ? _raw_spin_unlock+0x31/0x50
 [920ee420] SyS_futex+0x80/0x180
 [9275b0e4] tracesys+0xdd/0xe2
Re: [PATCH v2 2/5] x86/PCI: Support additional MMIO range capabilities
On 4/29/2014 5:20 AM, Borislav Petkov wrote:
> On Tue, Apr 29, 2014 at 09:33:09AM +0200, Andreas Herrmann wrote:
> > I am sure, it's because some server systems had MMIO ECS access not
> > enabled in BIOS. I can't remember which systems were affected.

If you are referring to accessing PCI ECS ranges via 0xCF8, then yes,
the BIOS disables this, as described below in the BKDG:

    "The BIOS may use either configuration space access mechanism during
    boot. Before booting the OS, BIOS must disable IO access to ECS,
    enable MMIO configuration and build an ACPI defined MCFG table. BIOS
    ACPI code must use MMIO to access configuration space."

> Ok, now AMD people: what's the story with IO ECS, can we assume that on
> everything after F10h, BIOS has a sensible MCFG and we can limit this
> to F10h only? I like Bjorn's idea but we need to make sure a working
> MCFG is ubiquitous.
>
> Which begs the real question: Suravee, why are you even touching IO ECS
> provided F15h and later have a MCFG? Or, do they?

As I was trying to generalize the logic inside amd_bus.c, which seems to
be used mainly as a fallback mechanism, I tried to maintain the existing
code, which does many things:

  1. Setup numa_node information (if PXM doesn't exist)
  2. Probe NB for MMIO resources (if MCFG doesn't exist)
  3. Probe NB for IO resources
  4. Setup IO ECS

In the new code, the IO ECS was needed to retrieve the
AMD_NB_F1_MMIO_BASE_LIMIT_HI_REG (offset 0x180) during the early
initialization as part of the logic in (2). However, this register
exists only on the newer systems.

However, as you mentioned, for (2) we can assume that the MCFG exists
for most of the systems (family10h and later), and should be used
instead.

The main purpose of this patch set is to deal with the node information
(1). So, we might need to split these all up and handle them separately
as needed, where (2) and (3) will be used as fallback for older systems
where MCFG does not exist. I am not sure whether, or where, we need (4).

Suravee
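The four steps listed above can be restated as a small decision function, which makes the "fallback" nature of the code explicit. This is a hypothetical sketch only: `struct fw_info`, the step flags and `fallback_steps()` are invented for illustration and do not mirror amd_bus.c. Steps 3 and 4 are treated as unconditional because the list states no condition for them.

```c
#include <stdbool.h>

/* What the firmware tables already provide (hypothetical summary). */
struct fw_info {
	bool has_pxm;	/* ACPI proximity (NUMA node) information present */
	bool has_mcfg;	/* ACPI MCFG table present */
};

/* One flag per step in the list above. */
enum {
	STEP_NUMA_NODE  = 1 << 0,	/* 1. set up numa_node information */
	STEP_PROBE_MMIO = 1 << 1,	/* 2. probe NB for MMIO resources */
	STEP_PROBE_IO   = 1 << 2,	/* 3. probe NB for IO resources */
	STEP_IO_ECS     = 1 << 3,	/* 4. set up IO ECS */
};

/* Return the set of steps the existing code would perform. */
static unsigned int fallback_steps(const struct fw_info *fw)
{
	unsigned int steps = STEP_PROBE_IO | STEP_IO_ECS;

	if (!fw->has_pxm)
		steps |= STEP_NUMA_NODE;
	if (!fw->has_mcfg)
		steps |= STEP_PROBE_MMIO;
	return steps;
}
```

Splitting the steps as proposed would amount to giving each flag its own condition instead of running them as one block.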
Re: 64bit x86: NMI nesting still buggy?
On Tue, 29 Apr 2014, Steven Rostedt wrote:
> > According to 38.4 of [1], when SMM mode is entered while the CPU is
> > handling NMI, the end result might be that upon exit from SMM, NMIs
> > will be re-enabled and latched NMI delivered as nested [2].
>
> Note, if this were true, then the x86_64 hardware would be extremely
> buggy. That's because NMIs are not made to be nested. If SMM's come in
> during an NMI and re-enables the NMI, then *all* software would break.
> That would basically make NMIs useless.
>
> The only time I've ever witness problems (and I stress NMIs all the
> time), is when the NMI itself does a fault. Which my patch set handles
> properly.

Yes, it indeed does.

In the scenario I have outlined, the race window is extremely small,
plus NMIs don't happen that often, plus SMIs don't happen that often,
plus (hopefully) many BIOSes don't enable NMIs upon SMM exit.

The problem is, that Intel documentation is clear in this respect, and
explicitly states it can happen. And we are violating that, which makes
me rather nervous -- it'd be very nice to know what is the background of
38.4 section text in the Intel docs.

--
Jiri Kosina
SUSE Labs
Re: [PATCH] staging: line6: fix possible overrun
On Tue, Apr 29, 2014 at 04:47:11PM +0200, Takashi Iwai wrote:
> At Mon, 28 Apr 2014 01:44:25 +0300, Dan Carpenter wrote:
> > On Sun, Apr 27, 2014 at 10:00:43PM +0200, Mateusz Guzik wrote:
> > > > > and a WARN_ON + -EINVAL in line6_init_audio to catch future
> > > > > offenders.
> > > >
> > > > Returning -EINVAL is a bad idea because it would break the driver
> > > > completely and make it unusable.
> > >
> > > Well I would vote for returning the error anyway.
> >
> > I'm trying to be polite, but you are talking about adding regressions
> > deliberately... It's very rare for people to deliberately add
> > regressions to the kernel. I have only seen it one time before.
>
> I don't think Dan would be against returning -EINVAL if all the
> offender codes have been fixed (e.g. truncating strings to fit with the
> fixed arrays) at first. Then it'd be a good help to catch any future
> bugs.
>
> But, having -EINVAL without fixing the caller side means essentially
> that you're introducing the breakage intentionally although you know it
> certainly breaks, which is obviously bad.

We clearly have a serious miscommunication here (and apparently it
started with me not addressing the concern of complete driver breakage).

line6_init_audio consumers have to be fixed first, no doubt about that.

I was only commenting on catching *future* offenders, which I thought
would implicitly mean *afterwards*.

With that in mind it would seem we are in agreement after all. :-)

As far as getting this done goes, maybe the OP is interested.

--
Mateusz Guzik
Re: [PATCH v2 00/10] arm64: UEFI support
On Tue, 29 Apr, at 07:56:20AM, H. Peter Anvin wrote:
> I'm wondering if it would be better to organize it into a separate
> topic branch. We can still take it through tip, if you want, but it
> would be better than putting it all into one tree.

Sure, that makes sense. I'll do that.

--
Matt Fleming, Intel Open Source Technology Center
Re: [Patch V2 0/9] I2C ACPI operation region handler support
On Tuesday, April 29, 2014 09:54:46 AM Lan Tianyu wrote:
> On 2014年04月29日 06:51, Rafael J. Wysocki wrote:
> > On Monday, April 28, 2014 10:27:39 PM Lan Tianyu wrote:
> > > ACPI 5.0 spec(5.5.2.4.5) defines GenericSerialBus(i2c, spi, uart)
> > > operation region. It allows ACPI aml code able to access such kind
> > > of devices to implement some ACPI standard method.
> > >
> > > On the Asus T100TA, Bios use GenericSerialBus operation region to
> > > access i2c device to get battery info. So battery function depends
> > > on the I2C operation region support. Here is the bug link.
> > > https://bugzilla.kernel.org/show_bug.cgi?id=69011
> > >
> > > This patchset is to add I2C ACPI operation region handler support.
> > >
> > > Change Since V1:
> > >   Fix some code style and memory leak issues in Patch 7
> >
> > Is it the only patch that has changed from v1?
>
> I also remove a redundant semicolon in the PATCH 8.
>
> Sorry. I didn't notice these patches are already in your tree. I will
> produce divergence patches based on your bleeding-edge branch.

No need for that, I'll use the new versions.

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
[PATCH V4] Add support for flag status register on Micron chips.
Some new Micron flash chips require reading the flag status register to
determine when operations have completed.

Furthermore, chips with multi-die stacks of the 65nm 256Mb QSPI also
require reading the status register before reading the flag status
register.

This patch adds support for the flag status register in the n25q512ax3
and n25q00 Micron QSPI flash chips.

Signed-off-by: Graham Moore <grmo...@altera.com>
---
V4: Do not set nor->wait_till_ready if driver has already set it.
V3: Rebase to l2-mtd spinor branch.
V2: Remove leading underscore in function names. Remove type cast in
    dev_err call and use the proper format specifier instead.
---
 drivers/mtd/spi-nor/spi-nor.c | 52 +
 include/linux/mtd/spi-nor.h   |  4
 2 files changed, 56 insertions(+)

diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
index d6f44d5..7e2817e 100644
--- a/drivers/mtd/spi-nor/spi-nor.c
+++ b/drivers/mtd/spi-nor/spi-nor.c
@@ -48,6 +48,25 @@ static int read_sr(struct spi_nor *nor)
 }
 
 /*
+ * Read the flag status register, returning its value in the location
+ * Return the status register value.
+ * Returns negative if error occurred.
+ */
+static int read_fsr(struct spi_nor *nor)
+{
+	int ret;
+	u8 val;
+
+	ret = nor->read_reg(nor, SPINOR_OP_RDFSR, &val, 1);
+	if (ret < 0) {
+		pr_err("error %d reading FSR\n", ret);
+		return ret;
+	}
+
+	return val;
+}
+
+/*
  * Read configuration register, returning its value in the
  * location. Return the configuration register value.
  * Returns negative if error occured.
@@ -165,6 +184,32 @@ static int spi_nor_wait_till_ready(struct spi_nor *nor)
 	return -ETIMEDOUT;
 }
 
+static int spi_nor_wait_till_fsr_ready(struct spi_nor *nor)
+{
+	unsigned long deadline;
+	int sr;
+	int fsr;
+
+	deadline = jiffies + MAX_READY_WAIT_JIFFIES;
+
+	do {
+		cond_resched();
+
+		sr = read_sr(nor);
+		if (sr < 0) {
+			break;
+		} else if (!(sr & SR_WIP)) {
+			fsr = read_fsr(nor);
+			if (fsr < 0)
+				break;
+			if (fsr & FSR_READY)
+				return 0;
+		}
+	} while (!time_after_eq(jiffies, deadline));
+
+	return -ETIMEDOUT;
+}
+
 /*
  * Service routine to read status register until ready, or timeout occurs.
  * Returns non-zero if error.
@@ -402,6 +447,7 @@ struct flash_info {
 #define	SECT_4K_PMC		0x10	/* SPINOR_OP_BE_4K_PMC works uniformly */
 #define	SPI_NOR_DUAL_READ	0x20	/* Flash supports Dual Read */
 #define	SPI_NOR_QUAD_READ	0x40	/* Flash supports Quad Read */
+#define	USE_FSR			0x80	/* use flag status register */
 };
 
 #define INFO(_jedec_id, _ext_id, _sector_size, _n_sectors, _flags)	\
@@ -488,6 +534,8 @@ const struct spi_device_id spi_nor_ids[] = {
 	{ "n25q128a13",  INFO(0x20ba18, 0, 64 * 1024,  256, 0) },
 	{ "n25q256a",    INFO(0x20ba19, 0, 64 * 1024,  512, SECT_4K) },
 	{ "n25q512a",    INFO(0x20bb20, 0, 64 * 1024, 1024, SECT_4K) },
+	{ "n25q512ax3",  INFO(0x20ba20, 0, 64 * 1024, 1024, USE_FSR) },
+	{ "n25q00",      INFO(0x20ba21, 0, 64 * 1024, 2048, USE_FSR) },
 
 	/* PMC */
 	{ "pm25lv512",   INFO(0,        0, 32 * 1024,    2, SECT_4K_PMC) },
@@ -965,6 +1013,10 @@ int spi_nor_scan(struct spi_nor *nor, const struct spi_device_id *id,
 	else
 		mtd->_write = spi_nor_write;
 
+	if ((info->flags & USE_FSR) &&
+	    nor->wait_till_ready == spi_nor_wait_till_ready)
+		nor->wait_till_ready = spi_nor_wait_till_fsr_ready;
+
 	/* prefer small sector erase if possible */
 	if (info->flags & SECT_4K) {
 		nor->erase_opcode = SPINOR_OP_BE_4K;
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index 5324184..9e6294f 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -34,6 +34,7 @@
 #define SPINOR_OP_SE		0xd8	/* Sector erase (usually 64KiB) */
 #define SPINOR_OP_RDID		0x9f	/* Read JEDEC ID */
 #define SPINOR_OP_RDCR		0x35	/* Read configuration register */
+#define SPINOR_OP_RDFSR		0x70	/* Read flag status register */
 
 /* 4-byte address opcodes - used on Spansion and some Macronix flashes. */
 #define SPINOR_OP_READ4		0x13	/* Read data bytes (low frequency) */
@@ -66,6 +67,9 @@
 
 #define SR_QUAD_EN_MX		0x40	/* Macronix Quad I/O */
 
+/* Flag Status Register bits */
+#define FSR_READY		0x80
+
 /* Configuration Register bits. */
 #define CR_QUAD_EN_SPAN		0x2	/* Spansion Quad I/O */
 
--
1.7.9.5
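The two-stage wait in the patch (status register first, flag status register only once write-in-progress clears) can be exercised in userspace with a simulated chip. This is a sketch under assumptions, not the driver code: `struct flash_regs`, `struct fake_chip` and the `fake_read_*()` helpers are hypothetical, the deadline is a poll count instead of jiffies, SR_WIP is assumed to be bit 0, and, unlike the patch, a register read error is returned to the caller instead of falling through to -ETIMEDOUT.

```c
#include <errno.h>

#define SR_WIP		0x01	/* write in progress (assumed bit 0) */
#define FSR_READY	0x80	/* flag status register: operation done */

/* Hypothetical register accessors; negative return means I/O error. */
struct flash_regs {
	int (*read_sr)(void *ctx);
	int (*read_fsr)(void *ctx);
	void *ctx;
};

/* Simulated chip: busy for a number of SR reads, then of FSR reads. */
struct fake_chip {
	int sr_busy_reads;
	int fsr_busy_reads;
};

static int fake_read_sr(void *ctx)
{
	struct fake_chip *chip = ctx;

	if (chip->sr_busy_reads > 0) {
		chip->sr_busy_reads--;
		return SR_WIP;
	}
	return 0;
}

static int fake_read_fsr(void *ctx)
{
	struct fake_chip *chip = ctx;

	if (chip->fsr_busy_reads > 0) {
		chip->fsr_busy_reads--;
		return 0;
	}
	return FSR_READY;
}

/* Poll SR until WIP clears, then require FSR_READY as well. */
static int wait_till_fsr_ready(const struct flash_regs *regs,
			       unsigned int max_polls)
{
	while (max_polls--) {
		int sr = regs->read_sr(regs->ctx);

		if (sr < 0)
			return sr;
		if (sr & SR_WIP)
			continue;

		int fsr = regs->read_fsr(regs->ctx);

		if (fsr < 0)
			return fsr;
		if (fsr & FSR_READY)
			return 0;
	}
	return -ETIMEDOUT;
}
```

Reading SR before FSR on every iteration follows the ordering the commit message says multi-die parts require.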
Re: [PATCH 04/16] perf, mmap: Factor out perf_get_fd()
On 25.04.14 16:52:05, Peter Zijlstra wrote:
> But no, I don't think that helps, its still true that the moment you
> get a fd another thread can immediately close(). That would drop the
> last ref and free it, meanwhile perf_event_open() is happily poking at
> it.
>
> Now I think you could cure this by adding an extra ref before calling
> your perf_get_fd() and dropping that extra ref at the end, where we
> used to have fd_install().

Yes, right. I have a solution now which increments the event's ref
count before creating the file descriptor, using try_get_event()/
put_event(). The patch also does not remove get_unused_fd_flags() and
the err_fd error handler. I already have an updated, rebased version
but still need to test it.

Would it be ok to split the patch set and send, as a first step, only
the first 4 patches that refactor the perf mmap code?

Thanks,

-Robert
Re: [PATCH tip/core/rcu 2/3] documentation: Record rcu_dereference() value mishandling
On Tue, Apr 29, 2014 at 01:42:13AM -0400, Pranith Kumar wrote:
> Minor nits below: Other than that
> Acked-by: Pranith Kumar <bobby.pr...@gmail.com>
>
> On Tue, Apr 29, 2014 at 1:04 AM, Andev <debian...@gmail.com> wrote:
> > From: "Paul E. McKenney" <paul...@linux.vnet.ibm.com>
> >
> > Recent LKML discussings (see http://lwn.net/Articles/586838/ and
> > http://lwn.net/Articles/588300/ for the LWN writeups) brought out
> > some ways of misusing the return value from rcu_dereference() that
> > are not necessarily completely intuitive. This commit therefore
> > documents what can and cannot safely be done with these values.
> >
> > Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>
>
> <snip>
>
> > +
> > +	o	The pointer is never dereferenced after being compared.
> > +		Since there are no subsequent dereferences, the compiler
> > +		cannot use anything it learned from the comparison
> > +		to reorder the non-existent subsequent dereferences.
> > +		This sort of comparison occurs frequently when scanning
> > +		RCU-protected circular linked lists.
> > +
> > +	o	The comparison is against a pointer pointer that
>
> duplicate pointer, remove one

Good catch, fixed!

> > +		references memory that was initialized a long time ago.
> > +		The reason this is safe is that even if misordering
> > +		occurs, the misordering will not affect the accesses
> > +		that follow the comparison. So exactly how long ago is
> > +		a long time ago? Here are some possibilities:
>
> <snip>
>
> > +	o	All of the accesses following the comparison are stores,
> > +		so that a control dependency preserves the needed ordering.
> > +		That said, it is easy to get control dependencies wrong.
> > +		Please see the CONTROL DEPENDENCIES section of
> > +		Documentation/memory-barriers.txt for more details.
> > +
> > +	o	The pointers compared not-equal -and- the compiler does
>
> add in "are" - The pointers compared are not-equal...

Actually, "compared" is a verb here. But that use is a bit obscure,
so taking your suggestion as a bug report. I changed it to read:

	The pointers are not equal -and- the compiler does not have
	enough information to deduce the value of the pointer.

Fair enough?

> > +		not have enough information to deduce the value of the
> > +		pointer. Note that the volatile cast in rcu_dereference()
> > +		will normally prevent the compiler from knowing too much.
> > +
> > +o	Disable any value-speculation optimizations that your compiler
> > +	might provide, especially if you are making use of feedback-based
> > +	optimizations that take data collected from prior runs. Such
> > +	value-speculation optimizations reorder operations by design.
> > +
> > +	There is one exception to this rule: Value-speculation
> > +	optimizations that leverage the branch-prediction hardware are
> > +	safe on strongly ordered systems (such as x86), but not on weakly
> > +	ordered systems (such as ARM or Power). Choose your compiler
> > +	command-line options wisely!
> > +
> > +
> > +EXAMPLE OF AMPLIFIED RCU-USAGE BUG
> > +
> > +Because updaters can run concurrently with RCU readers, RCU readers can
> > +see stale and/or inconsistent values. If RCU readers need fresh or
> > +consistent values, which they sometimes do, they need to take proper
> > +precautions. To see this, consider the following code fragment:
> > +
> > +	struct foo {
> > +		int a;
> > +		int b;
> > +		int c;
> > +	};
> > +	struct foo *gp1;
> > +	struct foo *gp2;
> > +
> > +	void updater(void)
> > +	{
> > +		struct foo *p;
> > +
> > +		p = kmalloc(...);
> > +		if (p == NULL)
> > +			deal_with_it();
> > +		p->a = 42;  /* Each field in its own cache line. */
> > +		p->b = 43;
> > +		p->c = 44;
> > +		rcu_assign_pointer(gp1, p);
> > +		p->b = 143;
> > +		p->c = 144;
> > +		rcu_assign_pointer(gp2, p);
> > +	}
> > +
> > +	void reader(void)
> > +	{
> > +		struct foo *p;
> > +		struct foo *q;
> > +		int r1, r2;
> > +
> > +		p = rcu_dereference(gp2);
> > +		r1 = p->b;  /* Guaranteed to get 143. */
> > +		q = rcu_dereference(gp1);
> > +		if (p == q) {
> > +			/* The compiler decides that q->c is same as p->c. */
> > +			r2 = p->c; /* Could get 44 on weakly order system. */
> > +		}
> > +	}
> > +
> > +You might be surprised that the outcome (r1 == 143 && r2 == 44) is possible,
> > +but you should not be. After all, the updater might have been invoked
> > +a second time between the time reader() loaded into r1 and the time
> > +that it loaded into r2. The fact that this same
Re: [PATCH] ARM: l2c: prima2: only call l2x0_of_init() on matching nodes
2014-04-29 23:14 GMT+08:00 Russell King - ARM Linux <li...@arm.linux.org.uk>:
> On Tue, Apr 29, 2014 at 11:05:06PM +0800, Barry Song wrote:
> > 2014-04-28 22:52 GMT+08:00 Russell King - ARM Linux <li...@arm.linux.org.uk>:
> > > On Mon, Apr 28, 2014 at 10:37:09AM -0400, Matt Porter wrote:
> > > > The fix is tested against bcm281xx and bcm21664 as that is what
> > > > the l2c cleanup breaks in -next. As mentioned, I don't have the
> > > > sirfsoc h/w so this first attempt at a fix also breaks their
> > > > platform. It can be addressed by adding those platform specific
> > > > compatibles back to the dts, of course. I'd much prefer that the
> > > > sirfsoc folks fix this...it's going to break other platforms in a
> > > > multi v7 build.
> > >
> > > Well, it's about time we got rid of this from platform specific
> > > code anyway, taking it away from platform maintainers to mess
> > > around with. So that's what I'm doing.
> > >
> > > It's worth noting that if you build a single zImage with exynos
> > > also enabled, then you also end up with an unconditional call from
> > > that code to l2x0_of_init() with it's own magic numbers - and that
> > > applies before my changes.
> > >
> > > So let's fix this properly and yank this crap from platform
> > > maintainers fingers.
> >
> > i mentioned dropping specific dts compatible prop will break non-csr
> > platforms in the mail thread "ARM: prima2: remove L2 cache size
> > override" and i said i was going to send v2. you said you need it
> > before rc6. now it has been sent, but i am sorry it is not against
> > next-20140424.
>
> FFS. IT HASN'T BEEN SENT. All that I did was drop it into linux-next
> so that more people would get off their fat backsides and test this
> fscking patch set - something which hasn't happened because no one pays
> attention to emails sent to mailing lists.

so your point is people don't pay attention to your mails? or you are
ignored? i think that is 100% not real. i think your opinions and mails
are always respected as you are the chief arm linux expert.

> I also told you that this was what I was going to do. But... is it
> really on to hold up such a large patch set which impacts virtually
> everyone because _you_ don't have time to sort out your small special
> requirements - no it is not, that's just fscking selfish.
>
> Anyway, I've had it with dealing with platform maintainers, I've yanked
> this patch set, and I'm no longer planning to do anything with it -
> platform maintainers have destroyed my will to get any of this series
> into the kernel.

no, i am trying to follow your suggestion to make patch set merged and
l2 codes cleaned. i have been trying to follow your will until now, and
from the beginning.

> So, the L2 cache code is going to remain in its current state, and it's
> going to rot because it's _FAR_ too much effort dealing with slow
> people like yourselves, or people who want the series split up, or
> people who whinge that there aren't any acks there (WELL GET OFF YOUR
> FAT BACKSIDES AND SEND ME SOME IF YOU CARE ABOUT THIS - no, don't, I'm
> no longer pushing this series.)

people might be selfish, but people might have some reasons to response
slowly, like holiday or family issue. how about taking it easy? it
doesn't prove you are not respected by platform maintainers.

> This is the last time I'm going to ever try cleaning up any core ARM
> code. Core ARM maintanence is impossible in this environment with
> arm-soc split from core ARM stuff, because core ARM stuff /always/
> impacts on SoC specific code. You can't get away from that.
>
> My position in this community has been made impossible and obsolete by
> Linaro. I'm at the point of walking away from this crap.

just fix the relationship and communication, that is good enough. you
have done things so well, there is no reason to give up.

> --
> FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
> improving, and getting towards what was expected from it.

-barry
Re: [PATCH tip/core/rcu 0/3] Miscellaneous fixes for 3.16
On Mon, Apr 28, 2014 at 08:23:45PM -0700, Josh Triplett wrote: On Mon, Apr 28, 2014 at 04:56:00PM -0700, Paul E. McKenney wrote: Hello! This series provides miscellaneous fixes: 1. Apply ACCESS_ONCE() to unprotected -gp_flags accesses. 2. Fix typo in comment, courtesy of Liu Ping Fan. 3. Make RCU CPU stall warnings print grace-period numbers in signed format to improve readability of stall-warning output. 4. Make cpu_needs_another_gp() take future grace-period needs into account. 5. Remove unused -preemptible field from the rcu_data structure, courtesty of Iulia Manda. 6. Apply ACCESS_ONCE() to unprotected -jiffies_stall accesses, courtesty of Iulia Manda. 7. Make callers responsible for grace-period kthread wakeup in order to avoid potential silent grace-period stalls. 8. Remove extern from RCU function declarations, courtesy of Iulia Manda. 9. Apply ACCESS_ONCE() to additional -jiffies_stall accesses, courtesy of Himangi Saraogi. 10. Add event tracing to dyntick_save_progress_counter(), courtesy of Andreea-Cristina Bernat. 11. Make rcu_init_one() use nr_cpu_ids instead of NR_CPUS for data-structure setup limit check, courtesy of Himangi Saraogi. 12. Remove redundant kfree_call_rcu() definition by using the rcu_state pointer, courtesy of Andreea-Cristina Bernat. 13. Merge rcu_sched_force_quiescent_state() definition with rcu_force_quiescent_state() by using the rcu_state pointer, courtesy of Andreea-Cristina Bernat. 14. Document RCU_INIT_POINTER()'s lack of ordering guarantees. 15. Automatically bind RCU's grace-period kthreads to timekeeping CPU for NO_HZ_FULL builds. 16. Make large and small sysidle systems use equivalent state machine. 17. Remove duplicate resched_cpu() declaration, courtesy of Pranith Kumar. 18. Replace deprecated __this_cpu_ptr() uses with raw_cpu_ptr(), courtesy of Christoph Lameter. 19. Make softirq processing provide a quiescent state only once per full pass over all softirqs rather than once per action, courtesy of Eric Dumazet. 
For all 19: Reviewed-by: Josh Triplett j...@joshtriplett.org And thank you for these reviews as well, applied! Thanx, Paul
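Several items in the list above apply ACCESS_ONCE() to fields that are touched without holding the protecting lock. As a rough userspace sketch of what that buys (assuming GCC or Clang for `__typeof__`; `struct demo_state` and the helper names are made up for illustration and are not the RCU structures), the macro forces each marked access to be a single volatile load or store:

```c
/* Volatile-cast access, modelled on the kernel macro of that era. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for a structure with a locklessly read field. */
struct demo_state {
	unsigned long gp_flags;
};

/* Set a flag bit using exactly one marked load and one marked store. */
static void demo_set_flag(struct demo_state *s, unsigned long bit)
{
	ACCESS_ONCE(s->gp_flags) = ACCESS_ONCE(s->gp_flags) | bit;
}

/* Take one snapshot of the field and test it, rather than re-reading. */
static int demo_flag_is_set(struct demo_state *s, unsigned long bit)
{
	unsigned long snap = ACCESS_ONCE(s->gp_flags);

	return (snap & bit) != 0;
}
```

Note that this only keeps the compiler from refetching or splitting the access; it does not make the read-modify-write atomic and it implies no memory ordering.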
Re: [PATCH RT 0/3] Linux 3.2.57-rt84-rc1
28.04.2014 17:39, Steven Rostedt wrote:
On Mon, 28 Apr 2014 02:15:28 +0400 Pavel Vasilyev pa...@pavlinux.ru wrote:
27.04.2014 18:39, Steven Rostedt wrote:
Dear RT Folks, This is the RT stable review cycle of patch 3.2.57-rt84-rc1. Please scream at me if I messed something up. Please test the patches too.

For more than two years our thin clients (about 5000 machines, Intel Atom, x86_32) have been working with RCU_BOOST.
CONFIG_RCU_BOOST=y
CONFIG_RCU_BOOST_PRIO=80
CONFIG_RCU_BOOST_DELAY=400

Is this just a confirmation that having RCU_BOOST default y for PREEMPT_RT is a good thing?

Only 3.2-rt

-- Pavel.
Re: [PATCH] Input: implement managed polled input devices
On Tue, Apr 29, 2014 at 08:09:39AM +0200, David Herrmann wrote: Hi On Tue, Apr 29, 2014 at 5:23 AM, Dmitry Torokhov dmitry.torok...@gmail.com wrote: Managed resources are becoming more and more popular in drivers. Let's implement managed polled input devices, to complement managed regular input devices. Similarly to managed regular input devices only one new call devm_input_allocate_polled_device() is added and the rest of APIs is modified to work with both managed and non-managed devices. Signed-off-by: Dmitry Torokhov dmitry.torok...@gmail.com --- drivers/input/input-polldev.c | 113 +- include/linux/input-polldev.h | 3 ++ 2 files changed, 115 insertions(+), 1 deletion(-) diff --git a/drivers/input/input-polldev.c b/drivers/input/input-polldev.c index 4b19190..27961fc 100644 --- a/drivers/input/input-polldev.c +++ b/drivers/input/input-polldev.c @@ -176,6 +176,90 @@ struct input_polled_dev *input_allocate_polled_device(void) } EXPORT_SYMBOL(input_allocate_polled_device); +struct input_polled_devres { + struct input_polled_dev *polldev; +}; + +static int devm_input_polldev_match(struct device *dev, void *res, void *data) +{ + struct input_polled_devres *devres = res; + + return devres-polldev == data; +} + +static void devm_input_polldev_release(struct device *dev, void *res) +{ + struct input_polled_devres *devres = res; + struct input_polled_dev *polldev = devres-polldev; + + dev_dbg(dev, %s: dropping reference/freeing %s\n, + __func__, dev_name(polldev-input-dev)); + + input_put_device(polldev-input); + kfree(polldev); +} + +static void devm_input_polldev_unregister(struct device *dev, void *res) +{ + struct input_polled_devres *devres = res; + struct input_polled_dev *polldev = devres-polldev; + + dev_dbg(dev, %s: unregistering device %s\n, + __func__, dev_name(polldev-input-dev)); + input_unregister_device(polldev-input); + + /* +* Note that we are still holding extra reference to the input +* device so it will stick around until devm_input_polldev_release() 
+* is called. +*/ +} + +/** + * devm_input_allocate_polled_device - allocate managed polled device + * @dev: device owning the polled device being created + * + * Returns prepared struct input_polled_dev or %NULL. + * + * Managed polled input devices do not need to be explicitly unregistered + * or freed as it will be done automatically when owner device unbinds + * from * its driver (or binding fails). Once such managed polled device + * is allocated, it is ready to be set up and registered in the same + * fashion as regular polled input devices (using + * input_register_polled_device() function). + * + * If you want to manually unregister and free such managed polled devices, + * it can be still done by calling input_unregister_polled_device() and + * input_free_polled_device(), although it is rarely needed. + * + * NOTE: the owner device is set up as parent of input device and users + * should not override it. + */ +struct input_polled_dev *devm_input_allocate_polled_device(struct device *dev) +{ + struct input_polled_dev *polldev; + struct input_polled_devres *devres; + + devres = devres_alloc(devm_input_polldev_release, sizeof(*devres), + GFP_KERNEL); + if (!devres) + return NULL; + + polldev = input_allocate_polled_device(); + if (!polldev) { + devres_free(devres); + return NULL; + } + + polldev-input-dev.parent = dev; + polldev-devres_managed = true; + + devres-polldev = polldev; + devres_add(dev, devres); + + return polldev; +} + /** * input_free_polled_device - free memory allocated for polled device * @dev: device to free @@ -186,7 +270,12 @@ EXPORT_SYMBOL(input_allocate_polled_device); void input_free_polled_device(struct input_polled_dev *dev) { if (dev) { - input_free_device(dev-input); + if (dev-devres_managed) + WARN_ON(devres_destroy(dev-input-dev.parent, + devm_input_polldev_release, + devm_input_polldev_match, + dev)); + input_put_device(dev-input); kfree(dev); } } @@ -204,9 +293,19 @@ EXPORT_SYMBOL(input_free_polled_device); */ int 
input_register_polled_device(struct input_polled_dev *dev) { + struct input_polled_devres *devres = NULL; struct input_dev *input = dev-input;
Re: usermodehelper lock error at resume
On 4/29/2014 5:14 PM, Takashi Iwai wrote:
At Fri, 18 Apr 2014 10:28:05 +0200, Takashi Iwai wrote:
[my previous post didn't seem to go out by some reason, so I just resend this; please disregard if you already received it.]

Hmm, I still can't see this in LKML archives... Did you guys receive my previous post below?

I did, sorry for not responding, I'm buried under stuff at the moment.

Rafael

Hi,

we've received a bug report with 3.14.x kernel regarding the firmware loading of intel BT device at suspend/resume: https://bugzilla.novell.com/show_bug.cgi?id=873790

It's a WARN_ON() that was recently introduced. And, it turned out that the problem basically comes from a small window between the process resume and the clear of usermodehelper lock.

The request_firmware() function checks the UMH lock and gives up when it's in DISABLE state. This is for avoiding the invalid f/w loading during suspend/resume phase. The problem is that usermodehelper_enable() is called at the end of thaw_processes(). Thus, a thawed process in between can kick off the f/w loader code path (in this case, via btusb_setup_intel()) even before the call of usermodehelper_enable(). Then usermodehelper_read_trylock() returns an error and request_firmware() spews WARN_ON() in the end.

The oneliner patch below seems fixing the problem. But, I'm not quite sure whether it's the best; rather usermodehelper_enable() can be moved there, or better to define yet another state, e.g. UMH_THAWING, instead of reusing UMH_FREEZING? Suggestions? Once when we agree, I'll cook up a proper patch.

thanks,

Takashi

---
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 06ec8869dbf1..9c7552f092f2 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -181,6 +181,8 @@ void thaw_processes(void)
 	pm_nosig_freezing = false;
 
 	oom_killer_enable();
+	/* allow request_firmare() at this point */
+	__usermodehelper_set_disable_depth(UMH_FREEZING);
 
 	printk("Restarting tasks ... ");
 
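To make the window described above a bit more concrete, here is a small userspace model. It is only a sketch: the three state names come from the thread, but the `model_*` helpers, the `MODEL_*` return codes and the assumption that a caller in the FREEZING state waits rather than fails are my own simplification, not the kernel code.

```c
#include <errno.h>

/* The three usermodehelper states named in the thread. */
enum umh_state { UMH_ENABLED = 0, UMH_FREEZING, UMH_DISABLED };

/* Hypothetical outcome codes used only by this model. */
enum { MODEL_OK = 0, MODEL_WOULD_WAIT = 1 };

/*
 * Simplified view of the read-trylock decision: succeed when enabled,
 * give up with -EAGAIN when fully disabled, otherwise the caller would
 * sleep until the state changes (assumed behaviour for FREEZING).
 */
static int model_read_trylock(enum umh_state state)
{
	if (state == UMH_ENABLED)
		return MODEL_OK;
	if (state == UMH_DISABLED)
		return -EAGAIN;
	return MODEL_WOULD_WAIT;
}

/*
 * What a task thawed in the middle of thaw_processes() would observe.
 * Without the change the state is still DISABLED at that point; with
 * it, the state has already been lowered to FREEZING.
 */
static int model_thaw_window(int patched)
{
	enum umh_state state = patched ? UMH_FREEZING : UMH_DISABLED;

	return model_read_trylock(state);
}
```

In this model the unpatched window yields -EAGAIN (the case that ends in the WARN_ON()), while the patched one makes the early caller wait for usermodehelper_enable().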
[PATCH V2] bio: modify __bio_add_page() to accept pages that don't start a new segment
The original behaviour is to refuse to add a new page if the maximum number of segments has been reached, regardless of the fact the page we are going to add can be merged into the last segment or not. Unfortunately, when the system runs under heavy memory fragmentation conditions, a driver may try to add multiple pages to the last segment. The original code won't accept them and EBUSY will be reported to userspace. This patch modifies the function so it refuses to add a page only in case the latter starts a new segment and the maximum number of segments has already been reached. The bug can be easily reproduced with the st driver: 1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE to 16 2) modprobe st buffer_kbs=1024 3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10 dd: error writing ‘/dev/st0’: Device or resource busy Signed-off-by: Maurizio Lombardi mlomb...@redhat.com --- fs/bio.c | 51 +-- 1 file changed, 29 insertions(+), 22 deletions(-) diff --git a/fs/bio.c b/fs/bio.c index 6f0362b..a31e12b 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -699,6 +699,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page unsigned int max_sectors) { int retried_segments = 0; + unsigned int bi_phys_segments_orig; struct bio_vec *bvec; /* @@ -750,29 +751,32 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page return 0; /* -* we might lose a segment or two here, but rather that than -* make this too complex. +* setup the new entry, we might clear it again later if we +* cannot add the page +*/ + bvec = bio-bi_io_vec[bio-bi_vcnt]; + bvec-bv_page = page; + bvec-bv_len = len; + bvec-bv_offset = offset; + bio-bi_vcnt++; + bi_phys_segments_orig = bio-bi_phys_segments; + bio-bi_phys_segments++; + + /* +* Perform a recount if the number of segments is greater +* than queue_max_segments(q). 
*/ - while (bio-bi_phys_segments = queue_max_segments(q)) { + while (bio-bi_phys_segments queue_max_segments(q)) { if (retried_segments) - return 0; + goto failed; retried_segments = 1; blk_recount_segments(q, bio); } /* -* setup the new entry, we might clear it again later if we -* cannot add the page -*/ - bvec = bio-bi_io_vec[bio-bi_vcnt]; - bvec-bv_page = page; - bvec-bv_len = len; - bvec-bv_offset = offset; - - /* * if queue has other restrictions (eg varying max sector size * depending on offset), it can specify a merge_bvec_fn in the * queue to get further control @@ -789,23 +793,26 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page * merge_bvec_fn() returns number of bytes it can accept * at this offset */ - if (q-merge_bvec_fn(q, bvm, bvec) bvec-bv_len) { - bvec-bv_page = NULL; - bvec-bv_len = 0; - bvec-bv_offset = 0; - return 0; - } + if (q-merge_bvec_fn(q, bvm, bvec) bvec-bv_len) + goto failed; } /* If we may be able to merge these biovecs, force a recount */ - if (bio-bi_vcnt (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec))) + if (bio-bi_vcnt 1 (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec))) bio-bi_flags = ~(1 BIO_SEG_VALID); - bio-bi_vcnt++; - bio-bi_phys_segments++; done: bio-bi_iter.bi_size += len; return len; + + failed: + bvec-bv_page = NULL; + bvec-bv_len = 0; + bvec-bv_offset = 0; + bio-bi_vcnt--; + bio-bi_phys_segments = bi_phys_segments_orig; + + return 0; } /** -- Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
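For readers who want the intended policy without the diff noise, a toy model follows. It is a sketch under stated assumptions: `struct demo_bio`, `demo_add_page()` and `DEMO_MAX_SEGS` are hypothetical, "mergeable" is reduced to plain physical contiguity, and the real patch reaches the same decision differently (it adds the bvec tentatively, recounts segments and rolls back on failure).

```c
#define DEMO_MAX_SEGS 2	/* hypothetical stand-in for queue_max_segments(q) */

/* Toy bio: only tracks the end of the last segment and the segment count. */
struct demo_bio {
	unsigned long seg_end;	/* physical end address of the last segment */
	unsigned int nsegs;	/* segments currently in use */
};

/*
 * Returns len if the page was accepted, 0 if it was refused.
 * A page is refused only when it would start a new segment while the
 * segment limit has already been reached.
 */
static unsigned int demo_add_page(struct demo_bio *bio, unsigned long phys,
				  unsigned int len)
{
	int mergeable = bio->nsegs > 0 && bio->seg_end == phys;

	if (!mergeable) {
		if (bio->nsegs >= DEMO_MAX_SEGS)
			return 0;	/* new segment, but no room left */
		bio->nsegs++;
	}
	bio->seg_end = phys + len;
	return len;
}
```

With the limit already reached, a page contiguous with the last segment is still accepted, which is the case the original code rejected.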
Re: [PATCH v2 3/4] thermal: Added Bang-bang thermal governor
On Tue, Apr 29, 2014 at 10:17:56AM +0100, Peter Feuerer wrote:
The bang-bang thermal governor uses a hysteresis to switch abruptly on or off a cooling device. It is intended to control fans, which can not be throttled but just switched on or off. Bang-bang cannot be set as default governor as it is intended for special devices only. For those special devices the driver needs to explicitly request it.

I don't really understand why step-wise doesn't work for you (AIUI, this governor should be a subset of it). I'll let others comment on that, just a minor comment below.

[...]

diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
new file mode 100644
index 000..328dde0
--- /dev/null
+++ b/drivers/thermal/gov_bang_bang.c
@@ -0,0 +1,124 @@
+/*
+ *  gov_bang_bang.c - A simple thermal throttling governor using hysteresis
+ *
+ *  Copyright (C) 2014 Peter Feuerer pe...@piie.net
+ *
+ *  Based on step_wise.c with following Copyrights:
+ *  Copyright (C) 2012 Intel Corp
+ *  Copyright (C) 2012 Durgadoss R durgados...@intel.com
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
+ * the GNU General Public License for more details.
+ *
+ */
+
+#include <linux/thermal.h>
+
+#include "thermal_core.h"
+
+static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
+{
+	long trip_temp;
+	unsigned long trip_hyst;
+	struct thermal_instance *instance;
+
+	tz->ops->get_trip_temp(tz, trip, &trip_temp);
+	tz->ops->get_trip_hyst(tz, trip, &trip_hyst);
+
+	dev_dbg(&tz->device, "Trip%d[temp=%ld]:temp=%d:hyst=%ld\n",
+		trip, trip_temp, tz->temperature,
+		trip_hyst);
+
+	mutex_lock(&tz->lock);
+
+	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+		if (instance->trip != trip)
+			continue;
+
+		/* in case fan is neither on nor off set the fan to active */
+		if (instance->target != 0 && instance->target != 1)
+			instance->target = 1;

I think you should add a pr_warn() here to warn the user that the governor is being used with a cooling device that seems to support more than one cooling state.

Cheers, Javi
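As a quick illustration of the on/off-with-hysteresis behaviour the commit message describes, a minimal standalone sketch follows. The function name `bang_bang_next()` and its parameters are hypothetical; the exact comparison operators are my assumption of how such a governor would typically decide, and only the "neither on nor off means on" normalisation is taken from the quoted hunk.

```c
/*
 * Compute the next fan state (1 = on, 0 = off).
 * The fan turns on once the temperature reaches the trip point and
 * turns off only after it drops below (trip_temp - hyst); in between
 * it keeps its current state.
 */
static int bang_bang_next(int target, long temp, long trip_temp, long hyst)
{
	/* in case fan is neither on nor off, treat it as active */
	if (target != 0 && target != 1)
		target = 1;

	if (target == 0 && temp >= trip_temp)
		return 1;
	if (target == 1 && temp < trip_temp - hyst)
		return 0;
	return target;
}
```

With a trip point of 60 and a hysteresis of 5, a running fan stays on at 57 and switches off only below 55, which avoids rapid toggling around the trip point.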
Re: [PATCH] bio: modify __bio_add_page() to accept pages that don't start a new segment
Sorry I did a mistake in this patch: on failure I should restore the original value of bi_phys_segments. I'm going to send a new version. Maurizio Lombardi On Tue, Apr 29, 2014 at 04:58:18PM +0200, Maurizio Lombardi wrote: The original behaviour is to refuse to add a new page if the maximum number of segments has been reached, regardless of the fact the page we are going to add can be merged into the last segment or not. Unfortunately, when the system runs under heavy memory fragmentation conditions, a driver may try to add multiple pages to the last segment. The original code won't accept them and EBUSY will be reported to userspace. This patch modifies the function so it refuses to add a page only in case the latter starts a new segment and the maximum number of segments has already been reached. The bug can be easily reproduced with the st driver: 1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE to 16 2) modprobe st buffer_kbs=1024 3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10 dd: error writing ‘/dev/st0’: Device or resource busy Signed-off-by: Maurizio Lombardi mlomb...@redhat.com --- fs/bio.c | 50 -- 1 file changed, 28 insertions(+), 22 deletions(-) diff --git a/fs/bio.c b/fs/bio.c index 6f0362b..9a3a0b1 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -750,29 +750,31 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page return 0; /* - * we might lose a segment or two here, but rather that than - * make this too complex. + * setup the new entry, we might clear it again later if we + * cannot add the page + */ + bvec = bio-bi_io_vec[bio-bi_vcnt]; + bvec-bv_page = page; + bvec-bv_len = len; + bvec-bv_offset = offset; + bio-bi_vcnt++; + bio-bi_phys_segments++; + + /* + * Perform a recount if the number of segments is greater + * than queue_max_segments(q). 
*/ - while (bio-bi_phys_segments = queue_max_segments(q)) { + while (bio-bi_phys_segments queue_max_segments(q)) { if (retried_segments) - return 0; + goto failed; retried_segments = 1; blk_recount_segments(q, bio); } /* - * setup the new entry, we might clear it again later if we - * cannot add the page - */ - bvec = bio-bi_io_vec[bio-bi_vcnt]; - bvec-bv_page = page; - bvec-bv_len = len; - bvec-bv_offset = offset; - - /* * if queue has other restrictions (eg varying max sector size * depending on offset), it can specify a merge_bvec_fn in the * queue to get further control @@ -789,23 +791,27 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page * merge_bvec_fn() returns number of bytes it can accept * at this offset */ - if (q-merge_bvec_fn(q, bvm, bvec) bvec-bv_len) { - bvec-bv_page = NULL; - bvec-bv_len = 0; - bvec-bv_offset = 0; - return 0; - } + if (q-merge_bvec_fn(q, bvm, bvec) bvec-bv_len) + goto failed; } /* If we may be able to merge these biovecs, force a recount */ - if (bio-bi_vcnt (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec))) + if (bio-bi_vcnt 1 (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec))) bio-bi_flags = ~(1 BIO_SEG_VALID); - bio-bi_vcnt++; - bio-bi_phys_segments++; done: bio-bi_iter.bi_size += len; return len; + + failed: + bvec-bv_page = NULL; + bvec-bv_len = 0; + bvec-bv_offset = 0; + bio-bi_vcnt--; + if (!retried_segments) + bio-bi_phys_segments--; + + return 0; } /** -- Maurizio Lombardi -- To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4 00/24] input: Introduce ff-memless-next as an improved replacement for ff-memless
This patch series: 1) Adds ff-memless-next module [1] 2) Ports all hardware-specific drivers to MLNX's API [2-23] 3) Removes FFML and replaces it with MLNX [24] Signed-off-by: Michal Malý madcatxs...@devoid-pointer.net v4: - Add a summary of changes between MLNX and FFML to the last patch - Remove a stale empty line in hid-sony.c - Add Tested-by: Elias Vanderstuyft elias@gmail.com to hid-lg4ff patch. v3: - Rebase against latest linux-next. Fixes conflict in hid-sony.c and max8997_haptic.c - Updated documentation in ff-memless-next.h. The documentation now describes parameters of the callback function and specifically mentions that HW-specific drivers must not keep a reference to mlnx_effect_command struct to which a pointer is passed in the callback function. - Fix a minor brace inconsistency in hid-lgff I believe that all concerns regarding v2 have been resolved as false alarms. v2: - Add missing msecs to jiffies conversion in ff-memless-next - lgff: Properly convert force on Y axis from MLNX to device range Support periodic effects for joystick_ac device class - lg3ff: Properly convert forces from MLNX to device range - Very minor coding style issues fixed Hi all, I'd confirm that I build v2 and tested on a number of devices (1), and it appears to work OK. The only slight hiccup was with an older version (Xubuntu 12.10) 'ffcfstress' application which did not correctly detect the CF capabilities of my gaming wheel(s). This is believed to be a fault with the application not using correct bit-field testing and appears to have been fixed on later versions (Xubuntu 13.10). I also built v4, but have not yet had time/access to all the devices (other than DS4) to test. 
Cheers, Simon Tested-by: Simon Wood si...@mungewell.org (1) Devices: Logitech Momo Red, Momo Black, DFGT, WiiWheel, G27 Sony DS3-SA, DS4, Intec 3rd Party PS3 controller Nintendo Wii Remote
Re: usermodehelper lock error at resume
At Tue, 29 Apr 2014 17:34:32 +0200, Rafael J. Wysocki wrote: On 4/29/2014 5:14 PM, Takashi Iwai wrote: At Fri, 18 Apr 2014 10:28:05 +0200, Takashi Iwai wrote: [my previous post didn't seem to go out by some reason, so I just resend this; please disregard if you already received it.] Hmm, I still can't see this in LKML archives... Did you guys receive my previous post below? I did, sorry for not responding, I'm buried under stuff at the moment. Don't worry, this isn't any urgent issue. (And I've been off in the whole last week in anyway :) I just wondered why this didn't come up in LKML archive. But if the post went out actually, it's fine. thanks, Takashi Rafael Hi, we've received a bug report with 3.14.x kernel regarding the firmware loading of intel BT device at suspend/resume: https://bugzilla.novell.com/show_bug.cgi?id=873790 It's a WARN_ON() that was recently introduced. And, it turned out that the problem basically comes from a small window between the process resume and the clear of usermodehelper lock. The request_firmware() function checks the UMH lock and gives up when it's in DISABLE state. This is for avoiding the invalid f/w loading during suspend/resume phase. The problem is that usermodehelper_enable() is called at the end of thaw_processes(). Thus, a thawed process in between can kick off the f/w loader code path (in this case, via btusb_setup_intel()) even before the call of usermodehelper_enable(). Then usermodehelper_read_trylock() returns an error and request_firmware() spews WARN_ON() in the end. The oneliner patch below seems fixing the problem. But, I'm not quite sure whether it's the best; rather usermodehelper_enable() can be moved there, or better to define yet another state, e.g. UMH_THAWING, instead of reusing UMH_FREEZING? Suggestions? Once when we agree, I'll cook up a proper patch. 
thanks, Takashi --- diff --git a/kernel/power/process.c b/kernel/power/process.c index 06ec8869dbf1..9c7552f092f2 100644 --- a/kernel/power/process.c +++ b/kernel/power/process.c @@ -181,6 +181,8 @@ void thaw_processes(void) pm_nosig_freezing = false; oom_killer_enable(); + /* allow request_firmare() at this point */ + __usermodehelper_set_disable_depth(UMH_FREEZING); printk(Restarting tasks ... ); -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 4/4] acerhdf: Use bang-bang thermal governor
On Tue, Apr 29, 2014 at 10:17:57AM +0100, Peter Feuerer wrote: acerhdf has been doing an on-off fan control using hysteresis by post-manipulating the outcome of thermal subsystem trip point handling. This patch enables acerhdf to use the bang-bang governor, which is intended for on-off controlled fans. CC: Zhang Rui rui.zh...@intel.com Cc: Andreas Mohr a...@lisas.de Cc: Borislav Petkov b...@suse.de Signed-off-by: Peter Feuerer pe...@piie.net --- drivers/platform/x86/Kconfig | 2 +- drivers/platform/x86/acerhdf.c | 48 +++--- 2 files changed, 41 insertions(+), 9 deletions(-) diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig index 27df2c5..0c15d89 100644 --- a/drivers/platform/x86/Kconfig +++ b/drivers/platform/x86/Kconfig @@ -38,7 +38,7 @@ config ACER_WMI config ACERHDF tristate Acer Aspire One temperature and fan driver - depends on THERMAL ACPI + depends on ACPI THERMAL_GOV_BANG_BANG ---help--- This is a driver for Acer Aspire One netbooks. It allows to access the temperature sensor and to control the fan. 
diff --git a/drivers/platform/x86/acerhdf.c b/drivers/platform/x86/acerhdf.c index 176edbd..f3884f9 100644 --- a/drivers/platform/x86/acerhdf.c +++ b/drivers/platform/x86/acerhdf.c @@ -50,7 +50,7 @@ */ #undef START_IN_KERNEL_MODE -#define DRV_VER 0.5.30 +#define DRV_VER 0.5.31 /* * According to the Atom N270 datasheet, @@ -135,8 +135,8 @@ struct bios_settings_t { const char *vendor; const char *product; const char *version; - unsigned char fanreg; - unsigned char tempreg; + u8 fanreg; + u8 tempreg; struct fancmd cmd; int mcmd_enable; }; @@ -259,6 +259,17 @@ static const struct bios_settings_t bios_tbl[] = { static const struct bios_settings_t *bios_cfg __read_mostly; +/* + * this struct is used to instruct thermal layer to use bang_bang instead of + * default governor for acerhdf + */ +static struct thermal_zone_params acerhdf_zone_params = { + .governor_name = bang_bang, + .no_hwmon = 0, + .num_tbps = 0, + .tbp = 0, +}; You don't need to initialize statics to 0. checkpatch only considers it an error if it finds it in a variable, but I think it also applies to fields in struct. + static int acerhdf_get_temp(int *temp) { u8 read_temp; @@ -436,6 +447,17 @@ static int acerhdf_get_trip_type(struct thermal_zone_device *thermal, int trip, { if (trip == 0) *type = THERMAL_TRIP_ACTIVE; + if (trip == 1) + *type = THERMAL_TRIP_CRITICAL; This looks like an unrelated change that should be on a patch on its own. 
+ + return 0; +} + +static int acerhdf_get_trip_hyst(struct thermal_zone_device *thermal, int trip, + unsigned long *temp) +{ + if (trip == 0) + *temp = fanon - fanoff; return 0; } @@ -445,6 +467,8 @@ static int acerhdf_get_trip_temp(struct thermal_zone_device *thermal, int trip, { if (trip == 0) *temp = fanon; + else if (trip == 1) + *temp = ACERHDF_TEMP_CRIT; return 0; } @@ -464,8 +488,10 @@ static struct thermal_zone_device_ops acerhdf_dev_ops = { .get_mode = acerhdf_get_mode, .set_mode = acerhdf_set_mode, .get_trip_type = acerhdf_get_trip_type, + .get_trip_hyst = acerhdf_get_trip_hyst, .get_trip_temp = acerhdf_get_trip_temp, .get_crit_temp = acerhdf_get_crit_temp, + .notify = NULL, Same as before, no need to initialize static to NULL. Cheers, Javi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
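The review point about not initializing statics to 0 or NULL can be checked with a tiny standalone example. The struct below is a hypothetical stand-in that only mimics the shape of the one in the patch; the C rule it relies on is that members omitted from an initializer are zero-initialized.

```c
#include <stddef.h>

/* Hypothetical stand-in for the zone-params structure in the patch. */
struct demo_zone_params {
	const char *governor_name;
	int no_hwmon;
	int num_tbps;
	void *tbp;
};

/* Only the field that matters is spelled out; the rest default to zero. */
static struct demo_zone_params demo_params = {
	.governor_name = "bang_bang",
};

/* Returns 1 when every omitted member is zero or NULL. */
static int demo_rest_is_zero(void)
{
	return demo_params.no_hwmon == 0 &&
	       demo_params.num_tbps == 0 &&
	       demo_params.tbp == NULL;
}
```

The same holds for the `.notify = NULL` line in the ops structure: leaving the member out gives the same result.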
dcache shrink list corruption?
This was reported by IBM for 3.12, but if my analysis is right, it affects current kernel as well as older ones.

So the question is: does anything protect the shrink list from concurrent modification by one or more dput() instances?

E.g. two dentries are on the shrink list, for both dget(), d_drop() and dput() are called.

dput() -> dentry_kill() -> dentry_lru_del() -> d_shrink_del() -> list_del_init().

Unlike the LRU list this is only protected with d_lock on the individual dentries, which is not enough to prevent list corruption:

  list->next = a, list->prev = b
  a->next = b, a->prev = list
  b->next = list, b->prev = a

  CPU1: list_del_init(b)
          __list_del(a, list)
            a->next = list
  ...
  CPU2: list_del_init(a)
          __list_del(list, list)
            list->next = list
            list->prev = list
  CPU1: (continuing list_del_init(b))
            list->prev = a

Attached patch is just a starting point (untested). Not sure how to minimize contention without adding too much complexity.

Thanks,
Miklos

diff --git a/fs/dcache.c b/fs/dcache.c
index 40707d88a945..5e0719292e3e 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -357,10 +357,14 @@ static void d_lru_del(struct dentry *dentry)
 	WARN_ON_ONCE(!list_lru_del(&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
 }
 
+static __cacheline_aligned_in_smp DEFINE_SPINLOCK(dcache_shrink_lock);
+
 static void d_shrink_del(struct dentry *dentry)
 {
 	D_FLAG_VERIFY(dentry, DCACHE_SHRINK_LIST | DCACHE_LRU_LIST);
+	spin_lock(&dcache_shrink_lock);
 	list_del_init(&dentry->d_lru);
+	spin_unlock(&dcache_shrink_lock);
 	dentry->d_flags &= ~(DCACHE_SHRINK_LIST | DCACHE_LRU_LIST);
 	this_cpu_dec(nr_dentry_unused);
 }
@@ -368,7 +372,9 @@ static void d_shrink_del(struct dentry *dentry)
 static void d_shrink_add(struct dentry *dentry, struct list_head *list)
 {
 	D_FLAG_VERIFY(dentry, 0);
+	spin_lock(&dcache_shrink_lock);
 	list_add(&dentry->d_lru, list);
+	spin_unlock(&dcache_shrink_lock);
 	dentry->d_flags |= DCACHE_SHRINK_LIST | DCACHE_LRU_LIST;
 	this_cpu_inc(nr_dentry_unused);
 }
@@ -391,7 +397,9 @@ static void d_lru_shrink_move(struct dentry *dentry, struct list_head *list)
 {
 	D_FLAG_VERIFY(dentry, DCACHE_LRU_LIST);
 	dentry->d_flags |= DCACHE_SHRINK_LIST;
+	spin_lock(&dcache_shrink_lock);
 	list_move_tail(&dentry->d_lru, list);
+	spin_unlock(&dcache_shrink_lock);
 }
 
 /*
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
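The interleaving described in the message above can be replayed deterministically in user space. What follows is a minimal sketch, not kernel code: `struct node`, `node_add_tail()` and `replay_race()` are names invented for illustration, and the order of the individual stores is taken from the CPU1/CPU2 trace in the message rather than from any particular kernel version.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

static void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

/* Append n at the tail of the circular list headed by head. */
static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Replay the trace from the message on a single thread.
 * Returns true if the list head ends up inconsistent.
 */
static bool replay_race(void)
{
	struct node list, a, b;

	node_init(&list);
	node_add_tail(&a, &list);	/* list <-> a */
	node_add_tail(&b, &list);	/* list <-> a <-> b */
	assert(list.next == &a && list.prev == &b);

	/* CPU1: list_del_init(b) loads its neighbours ... */
	struct node *prev1 = b.prev;	/* a */
	struct node *next1 = b.next;	/* list */
	/* ... and performs only its first store before being delayed. */
	prev1->next = next1;		/* a->next = list */

	/* CPU2: list_del_init(a) runs to completion. */
	struct node *prev2 = a.prev;	/* list */
	struct node *next2 = a.next;	/* list, already rewritten by CPU1 */
	next2->prev = prev2;		/* list->prev = list */
	prev2->next = next2;		/* list->next = list */
	node_init(&a);

	/* CPU1: finishes using its stale neighbour pointers. */
	next1->prev = prev1;		/* list->prev = a */
	node_init(&b);

	/* A forward walk sees an empty list, a backward walk reaches removed 'a'. */
	return list.next == &list && list.prev != &list;
}
```

If each whole removal ran under one lock, as the patch attempts with `dcache_shrink_lock`, CPU2 could not observe the half-updated `a->next`, and both removals would leave an empty, consistent list.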
Re: [PATCH v5 0/8] Introduce new cpufreq helper macros
On 29/04/2014 07:17 πμ, Viresh Kumar wrote: On 26 April 2014 01:45, Stratos Karafotis strat...@semaphore.gr wrote: This patch set introduces two freq_table helper macros which can be used for iteration over cpufreq_frequency_table and makes the necessary changes to cpufreq core and drivers that use such an iteration procedure. The motivation was a usage of common procedure to iterate over cpufreq_frequency_table across all drivers and cpufreq core. This was tested on a x86_64 platform. Most files compiled successfully but unfortunately I was not able to compile sh_sir.c pasemi_cpufreq.c and ppc_cbe_cpufreq.c due to lack of cross compiler. Changelog v4 - v5 - Fix warnings in printk format specifier for 32 bit architectures in freq_table.c, longhaul, pasemi, ppc_cbe Doesn't look much has changed and so it stays as is: Acked-by: Viresh Kumar viresh.ku...@linaro.org Thank you very much! Stratos Karafotis -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4 23/24] hid: Port hid-lg4ff to ff-memless-next
Port hid-lg4ff to ff-memless-next

Signed-off-by: Michal Malý madcatxs...@devoid-pointer.net
Tested-by: Elias Vanderstuyft elias@gmail.com
Signed-off-by: Simon Wood si...@mungewell.org
Re: sched_{set,get}attr() manpage
On Tue, Apr 29, 2014 at 03:08:55PM +0200, Michael Kerrisk (man-pages) wrote:

Juri, Dario,

Can you have a look at the 2nd part; I'm not at all sure I got the activate/release the right way around. My current thinking was that we activate first, and then release it to go run. But googling the terms only confused me more. I suppose it's one of those things that's not actually _that_ well defined. And I hope the ASCII art actually clarifies things better than the terms used.

[1] A page describing the sched_setattr() and sched_getattr() APIs

NAME
       sched_setattr, sched_getattr - set and get scheduling policy/attributes

SYNOPSIS
       #include <sched.h>

       struct sched_attr {
               u32 size;
               u32 sched_policy;
               u64 sched_flags;

               /* SCHED_NORMAL, SCHED_BATCH */
               s32 sched_nice;
               /* SCHED_FIFO, SCHED_RR */
               u32 sched_priority;
               /* SCHED_DEADLINE */
               u64 sched_runtime;
               u64 sched_deadline;
               u64 sched_period;
       };

       int sched_setattr(pid_t pid, const struct sched_attr *attr, unsigned int flags);
       int sched_getattr(pid_t pid, struct sched_attr *attr, unsigned int size, unsigned int flags);

DESCRIPTION
       sched_setattr() sets both the scheduling policy and the associated attributes for the process whose ID is specified in pid.

       sched_setattr() replaces sched_setscheduler(), sched_setparam(), nice() and some of setpriority().

       If pid equals zero, the scheduling policy and attributes of the calling process will be set. The interpretation of the argument attr depends on the selected policy. Currently, Linux supports the following normal (i.e., non-real-time) scheduling policies:

       SCHED_OTHER   the standard fair time-sharing policy;
       SCHED_BATCH   for batch style execution of processes; and
       SCHED_IDLE    for running very low priority background jobs.

       The following real-time policies are also supported, for special time-critical applications that need precise control over the way in which runnable processes are selected for execution:

       SCHED_FIFO      a static priority first-in, first-out policy;
       SCHED_RR        a static priority round-robin policy; and
       SCHED_DEADLINE  a dynamic priority deadline policy.

       The semantics of each of these policies are detailed in sched(7).

       sched_attr::size must be set to the size of the structure, as in sizeof(struct sched_attr). If the provided structure is smaller than the kernel structure, any additional fields are assumed '0'. If the provided structure is larger than the kernel structure, the kernel verifies all additional fields are '0'; if not, the syscall will fail with -E2BIG.

       sched_attr::sched_policy the desired scheduling policy.

       sched_attr::sched_flags additional flags that can influence scheduling behaviour. Currently as per Linux kernel 3.14:

              SCHED_FLAG_RESET_ON_FORK - resets the scheduling policy to: (struct sched_attr){ .sched_policy = SCHED_OTHER, } on fork().

       is the only supported flag.

       sched_attr::sched_nice should only be set for SCHED_OTHER, SCHED_BATCH, the desired nice value [-20,19], see sched(7).

       sched_attr::sched_priority should only be set for SCHED_FIFO, SCHED_RR, the desired static priority [1,99], see sched(7).

       sched_attr::sched_runtime
       sched_attr::sched_deadline
       sched_attr::sched_period should only be set for SCHED_DEADLINE and are the traditional sporadic task model parameters, see sched(7).

       The flags argument should be 0.

       sched_getattr() queries the scheduling policy currently applied to the process identified by pid.

       Similar to sched_setattr(), sched_getattr() replaces sched_getscheduler(), sched_getparam() and some of getpriority().

       If pid equals zero, the policy of the calling process will be retrieved.

       The size argument should reflect the size of struct sched_attr as known to userspace. The kernel fills out sched_attr::size to the size of its sched_attr structure. If the user provided structure is larger, additional fields are not touched. If the user provided structure is smaller, but the kernel needs to return values outside the provided space, the syscall will fail with -E2BIG.

       The flags argument should be 0.

       The other sched_attr fields are filled out as described in sched_setattr().

RETURN VALUE
       On success, sched_setattr() and sched_getattr() return 0. On error, -1 is returned, and errno is set
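As a rough illustration of the size handshake and the per-policy fields described in this draft, the sketch below fills in a SCHED_DEADLINE request and wraps the raw system call. It rests on several assumptions: `struct demo_sched_attr`, `fill_deadline()` and `demo_sched_setattr()` are names invented here, the layout simply mirrors the SYNOPSIS, the policy number 6 for SCHED_DEADLINE is taken from the kernel's uapi headers, and `SYS_sched_setattr` is assumed to be provided by `<sys/syscall.h>` because the C library of that era shipped no wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Local mirror of the structure in the SYNOPSIS; the name is hypothetical. */
struct demo_sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;      /* SCHED_OTHER, SCHED_BATCH */
	uint32_t sched_priority;  /* SCHED_FIFO, SCHED_RR */
	uint64_t sched_runtime;   /* SCHED_DEADLINE, in nanoseconds */
	uint64_t sched_deadline;
	uint64_t sched_period;
};

#define DEMO_SCHED_DEADLINE 6	/* assumed numeric value of SCHED_DEADLINE */

/* Zero everything, then set only the fields that SCHED_DEADLINE uses. */
void fill_deadline(struct demo_sched_attr *attr, uint64_t runtime_ns,
		   uint64_t deadline_ns, uint64_t period_ns)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);	/* tells the kernel which layout we know */
	attr->sched_policy = DEMO_SCHED_DEADLINE;
	attr->sched_runtime = runtime_ns;
	attr->sched_deadline = deadline_ns;
	attr->sched_period = period_ns;
}

/* Thin wrapper around the raw syscall; returns 0, or -1 with errno set. */
int demo_sched_setattr(pid_t pid, const struct demo_sched_attr *attr,
		       unsigned int flags)
{
	return (int)syscall(SYS_sched_setattr, pid, attr, flags);
}
```

A task wanting 10 ms of runtime in every 30 ms period would call `fill_deadline(&attr, 10000000, 30000000, 30000000)` and then `demo_sched_setattr(0, &attr, 0)`; the second call needs suitable privileges and otherwise returns -1.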
Re: [PATCH v2] rwsem: Support optimistic spinning
On Tue, 2014-04-29 at 08:11 -0700, Paul E. McKenney wrote:
On Mon, Apr 28, 2014 at 05:50:49PM -0700, Tim Chen wrote:
On Mon, 2014-04-28 at 16:10 -0700, Paul E. McKenney wrote:

+#ifdef CONFIG_SMP
+static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
+{
+	int retval;
+	struct task_struct *owner;
+
+	rcu_read_lock();
+	owner = ACCESS_ONCE(sem->owner);

OK, I'll bite... Why ACCESS_ONCE() instead of rcu_dereference()?

We're using it as a speculative check on the sem->owner to see if the owner is running on the cpu. The rcu_read_lock is used for ensuring that the owner->on_cpu memory is still valid.

OK, so if we read complete garbage, all that happens is that we lose a bit of performance?

Correct.

If so, I am OK with it as long as there is a comment (which Davidlohr suggested later in this thread).

Yes, we should add some comments to clarify things.

Thanks.
Tim
Re: 64bit x86: NMI nesting still buggy?
On Tue, 29 Apr 2014 17:24:32 +0200 (CEST) Jiri Kosina jkos...@suse.cz wrote: On Tue, 29 Apr 2014, Steven Rostedt wrote: According to 38.4 of [1], when SMM mode is entered while the CPU is handling NMI, the end result might be that upon exit from SMM, NMIs will be re-enabled and latched NMI delivered as nested [2]. Note, if this were true, then the x86_64 hardware would be extremely buggy. That's because NMIs are not made to be nested. If SMM's come in during an NMI and re-enables the NMI, then *all* software would break. That would basically make NMIs useless. The only time I've ever witness problems (and I stress NMIs all the time), is when the NMI itself does a fault. Which my patch set handles properly. Yes, it indeed does. In the scenario I have outlined, the race window is extremely small, plus NMIs don't happen that often, plus SMIs don't happen that often, plus (hopefully) many BIOSes don't enable NMIs upon SMM exit. The problem is, that Intel documentation is clear in this respect, and explicitly states it can happen. And we are violating that, which makes me rather nervous -- it'd be very nice to know what is the background of 38.4 section text in the Intel docs. You keep saying 38.4, but I don't see any 38.4. Perhaps you meant 34.8? Which BTW is this: 34.8 NMI HANDLING WHILE IN SMM NMI interrupts are blocked upon entry to the SMI handler. If an NMI request occurs during the SMI handler, it is latched and serviced after the processor exits SMM. Only one NMI request will be latched during the SMI handler. If an NMI request is pending when the processor executes the RSM instruction, the NMI is serviced before the next instruction of the interrupted code sequence. This assumes that NMIs were not blocked before the SMI occurred. If NMIs were blocked before the SMI occurred, they are blocked after execution of RSM. Although NMI requests are blocked when the processor enters SMM, they may be enabled through software by executing an IRET instruction. 
If the SMI handler requires the use of NMI interrupts, it should invoke a dummy interrupt service routine for the purpose of executing an IRET instruction. Once an IRET instruction is executed, NMI interrupt requests are serviced in the same “real mode” manner in which they are handled outside of SMM. A special case can occur if an SMI handler nests inside an NMI handler and then another NMI occurs. During NMI interrupt handling, NMI interrupts are disabled, so normally NMI interrupts are serviced and completed with an IRET instruction one at a time. When the processor enters SMM while executing an NMI handler, the processor saves the SMRAM state save map but does not save the attribute to keep NMI interrupts disabled. Potentially, an NMI could be latched (while in SMM or upon exit) and serviced upon exit of SMM even though the previous NMI handler has still not completed. One or more NMIs could thus be nested inside the first NMI handler. The NMI interrupt handler should take this possibility into consideration. Also, for the Pentium processor, exceptions that invoke a trap or fault handler will enable NMI interrupts from inside of SMM. This behavior is implementation specific for the Pentium processor and is not part of the IA-32 architecture. Read the first paragraph. That sounds like normal operation. The SMM should use the RSM to return and that does not re-enable NMIs if the SMM triggered during an NMI. The above is just stating that the SMM can enable NMIs if it wants to by executing an IRET. Which to me sounds rather buggy to do. Now the third paragraph is rather ambiguous. It sounds like it's still talking about doing an IRET in the SMI handler. As the IRET will enable NMIs, and if the SMI happened while an NMI was happening, the new NMI will happen. In this case, the NMI handler needs to address this. But this really sounds like if you have control of both SMM handlers and NMI handlers, which the Linux kernel certainly does not. 
Again, I label this as a bug in the BIOS.

And again, if the SMM were to trigger a fault, it too would enable NMIs. That is something that the SMM handler should not do.

Can you reproduce your problem on different platforms, or is this just one box that exhibits this behavior? If it's only one box, I'm betting it has a BIOS doing nasty things.

Nowhere in the Intel text do I see that the operating system is to handle nested NMIs. It needs to handle it if you control the SMMs, which the operating system does not. Sounds like they are talking to the firmware folks.

-- Steve
Re: [PATCH] usb: dwc3: debugfs: add snapshot to dump requests trbs events
Hi, On Tue, Apr 29, 2014 at 05:21:42PM -0400, Zhuang Jin Can wrote: On Mon, Apr 28, 2014 at 10:55:36AM -0500, Felipe Balbi wrote: On Mon, Apr 28, 2014 at 04:49:23PM -0400, Zhuang Jin Can wrote: Adds a debugfs file snapshot to dump dwc3 requests, trbs and events. you need to explain what are you trying to provide to our users here. What problem are you trying to solve ? The interface enables users to easily peek into requests, trbs and events to know the current transfer state of each request. If an transfer is stuck, user can use the interface to check why it's stuck (e.g. Is it because a gadget doesn't queued the request? Or it's queued but it's not primed to the controller? Or It's primed to the controller but the TRBs and events indicate the transfer never completes?). User can immediately narrow down the issue without enabling verbose log or reproduce the issue again. It's helpful when we need to deal with some hard-to-reproduce bugs or timing sensitive bugs can't be reproduced with verbose log enabled. this should be part of the commit log in some shape or form. As ep0 requests are more complex than others. It's not included in this patch. For ep0, you could at least print the endpoint phase we are currently in and if we have requests in flight or not. Agree. Will add it in [PATCH v2]. tks + seq_puts(s, busy_slot--|\n); + seq_puts(s,\\\n); + } + if (i == (dep-free_slot DWC3_TRB_MASK)) { + seq_puts(s, free_slot--|\n); + seq_puts(s,\\\n); + } + seq_printf(s, trb[%02d](dma@0x%pad): %08x(bpl), %08x(bph), %08x(size), %08x(ctrl)\n, I'm not sure you need to print out the TRB address. bpl, bph, size and ctrl are desired though. printing out the TRB DMA address helps user to locate the start TRB of a request. I admit that we can achive the same purose using the start_slot of the request. I'll remove it in [PATCH v2]. thanks + i, dep-trb_pool_dma + i * sizeof(*trb), + trb-bpl, trb-bph, trb-size, trb-ctrl); this will be pretty difficult to parse by a human. 
I would rather see you creating one directory per TRB (and also one directory per endpoint) which holds the details for that entity, so that it looks like:

dwc3
|-- current_state (or perhaps a better name, but snapshot isn't very good either)

Actually, it's hard to find a perfect name. current_state or snapshot doesn't make too much difference to me. If current_state makes more sense to you, I can change to use this name. Or let me know if you have a better suggestion.

the name is important as we will have to deal with it for the next 50 years. We also need to think about someone starting out on dwc3 5 years from now or a QA engineer in whatever OEM trying to provide details of the failure for the development team. It needs to be well thought out. I don't have a better idea but snapshot gives me the idea that we will end up with a copy of everything which we can revisit at any time and that's not true. If we read this file twice there's no guarantee it'll contain the same information.

|-- ep2
|   |-- direction
|   |-- maxpacket
|   |-- number
|   |-- state
|   |-- stream_capable
|   |-- type
|   |-- trbs
|   |   |-- trb0
|   |   |   |-- bph
|   |   |   |-- bpl
|   |   |   |-- ctrl
|   |   |   |-- size
|   |   |-- trb1
|   |   |   |-- bph
|   |   |   |-- bpl
|   |   |   |-- ctrl
|   |   |   |-- size
|   |   |-- trb2
|   |   |   |-- bph
|   |   |   |-- bpl
|   |   |   |-- ctrl
|   |   |   |-- size
|   |   |-- trb3
|   |   |   |-- bph
|   |   |   |-- bpl
|   |   |   |-- ctrl
|   |   |   |-- size
.   .   .
.   .   .
.   .   .
|   |-- request0
|   |   |-- direction
|   |   |-- mapped
|   |   |-- queued
|   |   |-- trb0 (symlink to actual trb directory)
|   |   |-- ep2 (symlink to actual ep2 directory)
|   |   |-- usbrequest
|   |       |-- actual
|   |       |-- length
|   |       |-- no_interrupt
|   |       |-- num_mapped_sgs
|   |       |-- num_sgs
|   |       |-- short_not_ok
|   |       |-- status
|   |       |-- stream_id
|   |       |-- zero
|   |-- request1
|   |   |-- direction
|   |   |-- mapped
|   |   |-- queued
|   |   |-- trb1 (symlink to actual trb directory)
|   |   |-- ep2 (symlink to actual ep2 directory)
|   |   |-- usbrequest
|   |       |-- actual
|   |       |-- length
|   |       |-- no_interrupt
|   |       |-- num_mapped_sgs
|   |       |-- num_sgs
|   |       |-- short_not_ok
Re: 64bit x86: NMI nesting still buggy?
On Tue, 29 Apr 2014 12:09:08 -0400 Steven Rostedt rost...@goodmis.org wrote: Can you reproduce your problem on different platforms, or is this just one box that exhibits this behavior? If it's only one box, I'm betting it has a BIOS doing nasty things. This box probably crashes on all kernels too. My NMI nesting changes did not fix a bug (well, it did as a side effect, see below). It was done to allow NMIs to use IRET so that we could remove stopmachine from ftrace, and instead have it use breakpoints (which return with IRET). The bug that was fixed by this was the ability to do stack traces (sysrq-t) from NMI context. Stack traces can page fault, and when I was debugging hard lock ups and having the NMI do a stack dump of all tasks, another NMI would trigger and corrupt the stack of the NMI doing the dumps. But that was something that would only be seen while debugging, and not something seen in normal operation. I don't see a bug to fix in the kernel. I see a bug to fix in the vendor's BIOS. -- Steve -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V3 1/2] FS: Add generic data flush to fsync
On Mon 28-04-14 23:12:39, Fabian Frederick wrote: This patch issues a flush in generic_file_fsync. (Modern filesystems already do it) -Behaviour can be reversed using /sys/devices/.../cache_type -Filesystems can also call __generic_file_fsync with bool flush false The patch looks good. You can add: Reviewed-by: Jan Kara j...@suse.cz Honza Suggested-by: Jan Kara j...@suse.cz Suggested-by: Christoph Hellwig h...@infradead.org Cc: Jan Kara j...@suse.cz Cc: Christoph Hellwig h...@infradead.org Cc: Alexander Viro v...@zeniv.linux.org.uk Cc: Theodore Ts'o ty...@mit.edu Cc: Andrew Morton a...@linux-foundation.org Signed-off-by: Fabian Frederick f...@skynet.be --- V3: __generic_file_fsync = no flush V2: No flag V1: First version with MS_BARRIER flag fs/libfs.c | 36 +--- include/linux/fs.h | 1 + 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/fs/libfs.c b/fs/libfs.c index a184424..4877906 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -3,6 +3,7 @@ * Library for filesystems writers. */ +#include linux/blkdev.h #include linux/export.h #include linux/pagemap.h #include linux/slab.h @@ -923,16 +924,19 @@ struct dentry *generic_fh_to_parent(struct super_block *sb, struct fid *fid, EXPORT_SYMBOL_GPL(generic_fh_to_parent); /** - * generic_file_fsync - generic fsync implementation for simple filesystems + * __generic_file_fsync - generic fsync implementation for simple filesystems + * * @file:file to synchronize + * @start: start offset in bytes + * @end: end offset in bytes (inclusive) * @datasync:only synchronize essential metadata if true * * This is a generic implementation of the fsync method for simple * filesystems which track all non-inode metadata in the buffers list * hanging off the address_space structure. 
*/ -int generic_file_fsync(struct file *file, loff_t start, loff_t end, -int datasync) +int __generic_file_fsync(struct file *file, loff_t start, loff_t end, + int datasync) { struct inode *inode = file-f_mapping-host; int err; @@ -952,10 +956,36 @@ int generic_file_fsync(struct file *file, loff_t start, loff_t end, err = sync_inode_metadata(inode, 1); if (ret == 0) ret = err; + out: mutex_unlock(inode-i_mutex); return ret; } +EXPORT_SYMBOL(__generic_file_fsync); + +/** + * generic_file_fsync - generic fsync implementation for simple filesystems + * with flush + * @file:file to synchronize + * @start: start offset in bytes + * @end: end offset in bytes (inclusive) + * @datasync:only synchronize essential metadata if true + * + */ + +int generic_file_fsync(struct file *file, loff_t start, loff_t end, +int datasync) +{ + struct inode *inode = file-f_mapping-host; + int err; + + err = __generic_file_fsync(file, start, end, datasync); + if (err) + return err; + + return blkdev_issue_flush(inode-i_sb-s_bdev, GFP_KERNEL, NULL); + +} EXPORT_SYMBOL(generic_file_fsync); /** diff --git a/include/linux/fs.h b/include/linux/fs.h index 8780312..c3f46e4 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2590,6 +2590,7 @@ extern ssize_t simple_read_from_buffer(void __user *to, size_t count, extern ssize_t simple_write_to_buffer(void *to, size_t available, loff_t *ppos, const void __user *from, size_t count); +extern int __generic_file_fsync(struct file *, loff_t, loff_t, int); extern int generic_file_fsync(struct file *, loff_t, loff_t, int); extern int generic_check_addressable(unsigned, u64); -- 1.8.4.5 -- Jan Kara j...@suse.cz SUSE Labs, CR -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Cocci] [PATCH v2 1/1] scripts/coccinelle: use BIT macro if used
On 04/27/2014 12:50 PM, Javier Martinez Canillas wrote:

Using the BIT() macro instead of manually shifting bits makes the code less error prone. Whether it is more readable is a matter of taste, so only replace if the file is already using this macro.

Signed-off-by: Javier Martinez Canillas jav...@dowhile0.org

I don't think this should be enabled by default. It will generate a ton of false positives; not everything that is 1 shifted by something is a single-bit field. E.g. imagine a device with multi-bit fields:

#define FOOBAR_A (0 << FOOBAR_OFFSET)
#define FOOBAR_B (1 << FOOBAR_OFFSET)
#define FOOBAR_C (2 << FOOBAR_OFFSET)
#define FOOBAR_D (3 << FOOBAR_OFFSET)

The script will now suggest to replace FOOBAR_B (1 << FOOBAR_OFFSET) with FOOBAR_B BIT(FOOBAR_OFFSET). Which is technically correct, but not semantically.

- Lars

---
Changes since v1:
- Add a rule that checks if the file is already using this macro as suggested by Julia Lawall

 scripts/coccinelle/api/bit.cocci | 30 ++
 1 file changed, 30 insertions(+)
 create mode 100644 scripts/coccinelle/api/bit.cocci

diff --git a/scripts/coccinelle/api/bit.cocci b/scripts/coccinelle/api/bit.cocci
new file mode 100644
index 000..a02cfd3
--- /dev/null
+++ b/scripts/coccinelle/api/bit.cocci
@@ -0,0 +1,30 @@
+// Use the BIT() macro if is already used
+//
+// Confidence: High
+// Copyright (C) 2014 Javier Martinez Canillas. GPLv2.
+// URL: http://coccinelle.lip6.fr/
+// Options: --include-headers
+
+@hasbitops@
+@@
+
+#include <linux/bitops.h>
+
+@usesbit@
+@@
+
+BIT(...)
+
+@depends on hasbitops && usesbit@
+expression E;
+@@
+
+- 1 << E
++ BIT(E)
+
+@depends on hasbitops && usesbit@
+expression E;
+@@
+
+- BIT((E))
++ BIT(E)
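Lars's objection can be made concrete with a small user-space sketch. `BIT()` is redefined locally here, and the `FOOBAR_*` names, the offset of 4, the two-bit field width and the helper functions are all assumptions chosen for illustration, following the hypothetical device in the message.

```c
#include <assert.h>

/* Local stand-in for the kernel macro from <linux/bitops.h>. */
#define BIT(nr)		(1UL << (nr))

/* Hypothetical two-bit mode field at bits 5:4 of a register. */
#define FOOBAR_OFFSET	4
#define FOOBAR_MASK	(3UL << FOOBAR_OFFSET)
#define FOOBAR_A	(0UL << FOOBAR_OFFSET)
#define FOOBAR_B	(1UL << FOOBAR_OFFSET)	/* field value 1, not "bit 4" */
#define FOOBAR_C	(2UL << FOOBAR_OFFSET)
#define FOOBAR_D	(3UL << FOOBAR_OFFSET)

/* A genuine single-bit flag, where BIT() says what is meant. */
#define FOOBAR_ENABLE	BIT(0)

/* Replace the mode field in reg with one of the FOOBAR_x encodings. */
unsigned long foobar_set_mode(unsigned long reg, unsigned long mode)
{
	assert((mode & ~FOOBAR_MASK) == 0);
	return (reg & ~FOOBAR_MASK) | mode;
}

/* Extract the numeric mode (0..3) from a register value. */
unsigned int foobar_get_mode(unsigned long reg)
{
	return (unsigned int)((reg & FOOBAR_MASK) >> FOOBAR_OFFSET);
}
```

`FOOBAR_B` and `BIT(FOOBAR_OFFSET)` are numerically equal, which is why the semantic patch matches, but rewriting only `FOOBAR_B` would leave the four encodings of one field written in two different styles.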
Re: [PATCH 4/4] nohz: Fix iowait overcounting if iowait task migrates
On Thu, Apr 24, 2014 at 08:45:58PM +0200, Denys Vlasenko wrote: Before this change, if last IO-blocked task wakes up on a different CPU, the original CPU may stay idle for much longer, and the entire time it stays idle is accounted as iowait time. This change adds struct tick_sched::iowait_exittime member. On entry to idle, it is set to KTIME_MAX. Last IO-blocked task, if migrated, sets it to current time. Note that this can happen only once per each idle period: new iowaiting tasks can't magically appear on idle CPU's rq. If iowait_exittime is set, then (iowait_exittime - idle_entrytime) gets accounted as iowait, and the remaining (now - iowait_exittime) as true idle. Run-tested: /proc/stat counters no longer go backwards. Signed-off-by: Denys Vlasenko dvlas...@redhat.com Cc: Frederic Weisbecker fweis...@gmail.com Cc: Hidetoshi Seto seto.hideto...@jp.fujitsu.com Cc: Fernando Luis Vazquez Cao fernando...@lab.ntt.co.jp Cc: Tetsuo Handa penguin-ker...@i-love.sakura.ne.jp Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Andrew Morton a...@linux-foundation.org Cc: Arjan van de Ven ar...@linux.intel.com Cc: Oleg Nesterov o...@redhat.com --- include/linux/tick.h | 2 ++ kernel/sched/core.c | 14 +++ kernel/time/tick-sched.c | 64 3 files changed, 70 insertions(+), 10 deletions(-) diff --git a/include/linux/tick.h b/include/linux/tick.h index 4de1f9e..1bf653e 100644 --- a/include/linux/tick.h +++ b/include/linux/tick.h @@ -67,6 +67,7 @@ struct tick_sched { ktime_t idle_exittime; ktime_t idle_sleeptime; ktime_t iowait_sleeptime; + ktime_t iowait_exittime; seqcount_t idle_sleeptime_seq; ktime_t sleep_length; unsigned long last_jiffies; @@ -140,6 +141,7 @@ extern void tick_nohz_irq_exit(void); extern ktime_t tick_nohz_get_sleep_length(void); extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time); extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time); +extern void tick_nohz_iowait_to_idle(int 
cpu); # else /* !CONFIG_NO_HZ_COMMON */ static inline int tick_nohz_tick_stopped(void) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 268a45e..ffea757 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4218,7 +4218,14 @@ void __sched io_schedule(void) current-in_iowait = 1; schedule(); current-in_iowait = 0; +#ifdef CONFIG_NO_HZ_COMMON + if (atomic_dec_and_test(rq-nr_iowait)) { + if (raw_smp_processor_id() != cpu_of(rq)) + tick_nohz_iowait_to_idle(cpu_of(rq)); Note that even using seqlock doesn't alone help to fix the preemption issue when the above may overwrite the exittime of the next last iowait task from the old rq. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
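The accounting split proposed in the commit message can be sketched as plain arithmetic. This is an illustration only: `struct idle_split`, `split_idle_period()`, the `had_iowaiters` parameter and the use of `UINT64_MAX` in place of `KTIME_MAX` are assumptions, and timestamps are simple nanosecond counters rather than `ktime_t`. It does not address the preemption race raised in the reply.

```c
#include <assert.h>
#include <stdint.h>

#define NO_IOWAIT_EXIT	UINT64_MAX	/* stands in for KTIME_MAX */

struct idle_split {
	uint64_t iowait_ns;	/* time charged as iowait */
	uint64_t idle_ns;	/* time charged as true idle */
};

/*
 * Split one idle period [idle_entry, now) at iowait_exit.
 * had_iowaiters says whether tasks were blocked on I/O at idle entry.
 */
struct idle_split split_idle_period(uint64_t idle_entry, uint64_t iowait_exit,
				    uint64_t now, int had_iowaiters)
{
	struct idle_split s = { 0, 0 };

	assert(now >= idle_entry);
	if (!had_iowaiters) {
		/* Nobody was waiting on I/O: the whole period is idle. */
		s.idle_ns = now - idle_entry;
	} else if (iowait_exit == NO_IOWAIT_EXIT || iowait_exit > now) {
		/* The last I/O waiter has not left yet: all of it is iowait. */
		s.iowait_ns = now - idle_entry;
	} else {
		/* The last waiter migrated away at iowait_exit. */
		assert(iowait_exit >= idle_entry);
		s.iowait_ns = iowait_exit - idle_entry;
		s.idle_ns = now - iowait_exit;
	}
	return s;
}
```

For a CPU that went idle at t=100, lost its last I/O waiter at t=250 and is sampled at t=1000, the helper charges 150 units to iowait and 750 to idle instead of 900 to iowait.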
Re: [PATCH V3 2/2] fs/ext4/fsync.c: generic_file_fsync call based on barrier flag
On Mon 28-04-14 23:15:08, Fabian Frederick wrote: generic_file_fsync has been updated to issue a flush for older filesystems. This patch tests for barrier flag in ext4 mount flags and calls the right function. Suggested-by: Jan Kara j...@suse.cz Suggested-by: Christoph Hellwig h...@infradead.org Cc: Jan Kara j...@suse.cz Cc: Christoph Hellwig h...@infradead.org Cc: Alexander Viro v...@zeniv.linux.org.uk Cc: Theodore Ts'o ty...@mit.edu Cc: Andrew Morton a...@linux-foundation.org Signed-off-by: Fabian Frederick f...@skynet.be The patch looks good. You can add: Reviewed-by: Jan Kara j...@suse.cz Honza --- fs/ext4/fsync.c | 4 1 file changed, 4 insertions(+) diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c index a8bc47f..fa82c0a 100644 --- a/fs/ext4/fsync.c +++ b/fs/ext4/fsync.c @@ -108,6 +108,10 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync) if (!journal) { ret = generic_file_fsync(file, start, end, datasync); + if (test_opt(inode-i_sb, BARRIER)) + ret = generic_file_fsync(file, start, end, datasync); + else + ret = __generic_file_fsync(file, start, end, datasync); if (!ret !hlist_empty(inode-i_dentry)) ret = ext4_sync_parent(inode); goto out; -- 1.8.4.5 -- Jan Kara j...@suse.cz SUSE Labs, CR -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V3 1/2] FS: Add generic data flush to fsync
On Tue, 29 Apr 2014 18:19:07 +0200 Jan Kara j...@suse.cz wrote: On Mon 28-04-14 23:12:39, Fabian Frederick wrote: This patch issues a flush in generic_file_fsync. (Modern filesystems already do it) -Behaviour can be reversed using /sys/devices/.../cache_type -Filesystems can also call __generic_file_fsync with bool flush false The patch looks good. You can add: Reviewed-by: Jan Kara j...@suse.cz Honza Just noticed I forgot to remove with bool flush false in patch description above (-Filesystems can also call __generic_file_fsync with bool flush false) Tell me if I need to send the patch again. Fabian -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/5] powercap/rapl: change floor frequency for vallewview
On Tue, 29 Apr 2014 14:40:37 + R, Durgadoss durgados...@intel.com wrote: -Original Message- From: Jacob Pan [mailto:jacob.jun@linux.intel.com] Sent: Tuesday, April 29, 2014 6:33 PM To: R, Durgadoss Cc: Linux PM; Wysocki, Rafael J; LKML; David E. Box; Alan Cox; Accardi, Kristen C Subject: Re: [PATCH 5/5] powercap/rapl: change floor frequency for vallewview On Tue, 29 Apr 2014 02:45:22 + R, Durgadoss durgados...@intel.com wrote: Hi Jacob, -Original Message- From: Jacob Pan [mailto:jacob.jun@linux.intel.com] Sent: Monday, April 28, 2014 7:35 PM To: Linux PM; Wysocki, Rafael J; LKML Cc: David E. Box; Alan Cox; R, Durgadoss; Accardi, Kristen C; Jacob Pan Subject: [PATCH 5/5] powercap/rapl: change floor frequency for vallewview RAPL power limit reduce power by limiting CPU P-state and other techniques. On Valleyview, RAPL power limit cannot go to LFM (low frequency mode) if we don't set the floor frequency via IOSF mailbox. This patch enables setting of floor frquency such that RAPL power limit is more effective. Signed-off-by: Jacob Pan jacob.jun@linux.intel.com --- drivers/powercap/intel_rapl.c | 27 +++ 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c index b1cda6f..13e4776 100644 --- a/drivers/powercap/intel_rapl.c +++ b/drivers/powercap/intel_rapl.c @@ -32,6 +32,7 @@ #include asm/processor.h #include asm/cpu_device_id.h +#include asm/iosf_mbi.h /* bitmasks for RAPL MSRs, used by primitive access functions */ #define ENERGY_STATUS_MASK 0x @@ -336,11 +337,17 @@ static int find_nr_power_limit(struct rapl_domain *rd) return i; } +#define VLV_CPU_POWER_BUDGET_CTL (0x2) +static const struct x86_cpu_id valleyview_id[] = { + { X86_VENDOR_INTEL, 6, 0x37}, + {} +}; There are other platforms that have this FloorFreq register as well. And those addresses are not '0x02'. So, we need to have a cpu_id based table to define the address of the floor freq register as well. [This is not specific to valleyview.] 
Sounds like I need to add an abstraction to capture this. So far, there are only two exceptions so i was hesitate to do so. Thanks for the input. Yes, We at least have few platforms that need this. Also, is there a plan to expose this floor freq ratio through Sysfs for runtime configuration. ? May be through a standard thermal cooling device interface ? why would that be necessary? who will use it? floor freq only affects RAPL, AFAIK. In Linux there is no guaranteed freq anyway. My original patch to enable RAPL as cooling device was abandoned in favor of powercap framework, I am not sure if we should go back. There are user space thermal controls which change RAPL Power limits according to platform's thermal condition as you might be aware. The floor frequency is not used only to transition to LFM ratio. We can transition to any frequency ratio by adjusting this floor frequency (at least on VLV and couple more platforms) Hence while changing RAPL Power Limits, there is a need to adjust this also, to specify which ratio is our Floor (basically we will not go below that). That's why we need an interface for modifying this at run time (along with Power Limits). I understand. What I am proposing here is to have a single knob for user control power, instead of two knobs (power limit and floor freq) which may have conflicts. When thermal throttling is needed, user only cares about power limit, that is why I think it is better to set floor to LFM and let power limit be the only knob. It is simpler. In case freq is a constraint, user should use cpufreq interface. 
Thanks,
Durga

+
 static int set_domain_enable(struct powercap_zone *power_zone, bool mode)
 {
 	struct rapl_domain *rd = power_zone_to_rapl_domain(power_zone);
 	int nr_powerlimit;
-
+	u32 mdata = 0;
 	if (rd->state & DOMAIN_STATE_BIOS_LOCKED)
 		return -EACCES;
 	get_online_cpus();
@@ -350,7 +357,16 @@ static int set_domain_enable(struct powercap_zone *power_zone, bool mode)
 	/* always enable clamp such that p-state can go below OS requested
 	 * range. power capping priority over guaranteed frequency.
 	 */
-	rapl_write_data_raw(rd, PL1_CLAMP, mode);
+	if (x86_match_cpu(valleyview_id)) {
+		iosf_mbi_read(BT_MBI_UNIT_PMC, BT_MBI_PMC_READ,
+			VLV_CPU_POWER_BUDGET_CTL, &mdata);
+		mdata &= ~(0x7f << 8);
+		mdata |= 1 << 8;
+		iosf_mbi_write(BT_MBI_UNIT_PMC, BT_MBI_PMC_WRITE,
+			VLV_CPU_POWER_BUDGET_CTL,
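For readers following the floor frequency discussion, here is a minimal user-space sketch of the read-modify-write step in the hunk above. It assumes the patch clears a 7-bit field at bits 14:8 and then writes a ratio of 1 into it, which is how the mask and shift in the quoted code read; the helper name set_floor_ratio() and the macro names are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Assumed layout, inferred from the 0x7f mask shifted by 8 in the patch:
 * the floor ratio occupies bits 14:8 of the register value. */
#define FLOOR_RATIO_SHIFT 8
#define FLOOR_RATIO_MASK  (UINT32_C(0x7f) << FLOOR_RATIO_SHIFT)

/* Hypothetical helper: return reg with the floor ratio field replaced
 * by ratio, leaving every other bit untouched. */
uint32_t set_floor_ratio(uint32_t reg, uint32_t ratio)
{
	reg &= ~FLOOR_RATIO_MASK;                               /* clear the old ratio */
	reg |= (ratio << FLOOR_RATIO_SHIFT) & FLOOR_RATIO_MASK; /* insert the new one */
	return reg;
}
```

With a per-platform table as suggested in the review, only the register address would change; the field update itself could stay the same, provided the other platforms use the same field layout (not confirmed in this thread).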
Re: pid ns feature request
On Mon, Apr 28, 2014 at 6:39 AM, Serge Hallyn serge.hal...@ubuntu.com wrote:

Quoting Andy Lutomirski (l...@amacapital.net):

On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman ebied...@xmission.com wrote:

Andy Lutomirski l...@amacapital.net writes:

Unless I'm missing some trick, it's currently rather painful to mount a namespace /proc. You have to actually be in the pid namespace to mount the correct /proc instance, and you can't unmount the old /proc until you've mounted the new /proc. This means that you have to fork into the new pid namespace before you can finish setting it up.

Yes. You have to be inside just about all namespaces before you can finish setting them up. I don't know the context in which needing to be inside the pid namespace is a burden.

I'm trying to sandbox myself. I unshare everything, set up new mounts, pivot_root, umount the old stuff, fork, and wait around for the child to finish. This doesn't work: the parent can't mount the new /proc, and the child can't either because it's too late.

I'm probably not thinking it through enough... But can't the parent, before forking, do

mkdir -p /childproc/proc
mount --bind /childproc /childproc
mount --make-rshared /childproc

then the child mounts its proc under /childproc/proc and have that show up in the parent's tree?

Yes, and the --make-rshared /childproc isn't necessary. This is still a bit annoying, since the parent now needs to wait for the child to set up mounts if it wants to do anything that requires all the mounts to be fully set up.

This issue certainly isn't a show-stopper, but it might be nice to address if anyone ever adds options to proc to do other sensible namespacy things (e.g. turning off sysctls).

--Andy
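The shell steps suggested above can be sketched with mount(2) directly. This is only an illustration under assumptions: the function names parent_prepare() and child_mount_proc() are hypothetical, the caller is expected to have the privileges mount(2) requires, and the unshare/fork sequencing from the thread is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Parent side, before fork(): roughly "mkdir -p", "mount --bind" and
 * "mount --make-rshared" on base. Returns 0 or a negative errno. */
int parent_prepare(const char *base, const char *procdir)
{
	if (mkdir(base, 0755) && errno != EEXIST)
		return -errno;
	if (mkdir(procdir, 0755) && errno != EEXIST)
		return -errno;
	if (mount(base, base, NULL, MS_BIND, NULL))
		return -errno;
	/* Per the reply in the thread, this propagation step may not be needed. */
	if (mount(NULL, base, NULL, MS_REC | MS_SHARED, NULL))
		return -errno;
	return 0;
}

/* Child side, once it is running inside the new pid namespace. */
int child_mount_proc(const char *procdir)
{
	if (mount("proc", procdir, "proc", 0, NULL))
		return -errno;
	return 0;
}
```

The ordering problem raised in the thread remains: the parent still has to wait until the child has called something like child_mount_proc() before it can rely on the new /proc being present.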
Warning from kernel/printk/printk.c in linux-next
Jan,

I am running linux-next 20140429 on a mx6 board (ARM 32-bit) and after commit 5dc90cb49691755faa (printk: enable interrupts before calling console_trylock_for_printk()) I get the following warning:

[ INFO: possible recursive locking detected ]
3.15.0-rc3-next-20140429-1-gac246a5 #1074 Not tainted
-
swapper/0/0 is trying to acquire lock:
 (console_lock){+.+...}, at: [808c1358] con_init+0x14/0x29c

but task is already holding lock:
 (console_lock){+.+...}, at: [8006deac] vprintk_emit+0x194/0x514

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
  lock(console_lock);
  lock(console_lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

1 lock held by swapper/0/0:
 #0: (console_lock){+.+...}, at: [8006deac] vprintk_emit+0x194/0x514

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.0-rc3-next-20140429-1-gac2464
Backtrace:
[80011cbc] (dump_backtrace) from [80011e58] (show_stack+0x18/0x1c)
 r6:8114b4fc r5: r4: r3:
[80011e40] (show_stack) from [8065e65c] (dump_stack+0x88/0xa4)
[8065e5d4] (dump_stack) from [80065518] (__lock_acquire+0x1494/0x1c10)
 r5:808f3f70 r4:80a7d6c0
[80064084] (__lock_acquire) from [80066178] (lock_acquire+0x68/0x7c)
 r10: r9:412fc09a r8:8000406a r7:0001 r6:6153 r5:808e8000 r4:
[80066110] (lock_acquire) from [8006c91c] (console_lock+0x54/0x68)
 r7:befffbc0 r6:808d6db0 r5:808c1358 r4:0001
[8006c8c8] (console_lock) from [808c1358] (con_init+0x14/0x29c)
 r5:808e37a4 r4:808e37a4
[808c1344] (con_init) from [808c0ae4] (console_init+0x24/0x38)
 r6:808d6db0 r5:808e37a4 r4:808e37a4
[808c0ac0] (console_init) from [80894bb4] (start_kernel+0x26c/0x3a4)
 r5:8094d5c0 r4:
[80894948] (start_kernel) from [80008074] (0x80008074)
 r7:808f5644 r6:808d6dac r5:808f0928 r4:10c5387d

Reverting this commit causes the warning to go away. Any suggestions?
Regards, Fabio Estevam
Re: [RFC v3 1/9] sysrq: Implement __handle_sysrq_nolock to avoid recursive locking in kdb
On Tue, Apr 29, 2014 at 1:59 AM, Daniel Thompson daniel.thomp...@linaro.org wrote:

On 28/04/14 18:44, Colin Cross wrote:

Is that case documented somewhere in the code comments?

Perhaps not near enough to the _nolock but the primary bit of comment is here (and in same file as kdb_sr).

--- cut here ---
 * kdb_main_loop - After initial setup and assignment of the
 * controlling cpu, all cpus are in this loop. One cpu is in
 * control and will issue the kdb prompt, the others will spin
 * until 'go' or cpu switch.
--- cut here ---

The mechanism kgdb uses to quiesce other CPUs means other CPUs cannot be in irqsave critical sections.

One of the advantages of FIQ debugger is that it can be triggered from an FIQ (NMI for those in x86 land), and Jason and I have discussed using FIQs for kgdb to allow interrupting cpus stuck in critical sections. If that gets implemented the above assumption will no longer be correct.

Reviewing this I realized I missed one of the most critical points in the above. Today kdb, even if triggered by FIQ/NMI, would still be likely to wedge waiting for the IPI interrupts to be delivered to other processors. Did you and Jason discuss getting the active CPU to quiesce the other processors using FIQ/NMI, or to allow the active CPU to timeout while waiting for them to stop?

Daniel.

Yes, all cpus would have to get an FIQ/NMI.
Re: [PATCH] cpufreq: intel_pstate: Change the calculation of next pstate
On 29/04/2014 07:58 AM, Viresh Kumar wrote:

Cc'd Dirk,

On 28 April 2014 03:42, Stratos Karafotis strat...@semaphore.gr wrote:

Currently the driver calculates the next pstate proportional to core_busy factor and inversely proportional to current pstate. Change the above method and calculate the next pstate independently of current pstate.

We must mention why the change is required.

Hi Viresh,

Actually, I can't say that it's required. :) I just believe that the calculation of the next p-state should be independent of the current one. In my opinion we can't scale the load across different p-states, because it's not always equivalent. For example, suppose a load of 100% because of a tight for loop in the current p-state. It will also be a 100% load in any other p-state. It would be wrong to scale the load in the calculation formula according to the current p-state.

I included the test results in the change log to point out an improvement because of this patch. I will expand the change log as you suggested.

Thanks,
Stratos Karafotis
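The tight-loop argument above can be made concrete with a toy model. Neither function below is the intel_pstate driver's actual formula: busy_scaled_by_pstate() is a hypothetical stand-in for "proportional to busy, inversely proportional to current p-state", and next_pstate_from_busy() is a hypothetical stand-in for a calculation that ignores the current p-state. The p-state numbers in the usage note are made up for illustration.

```c
/* Hypothetical model: the busy figure is scaled by max/current, so the
 * same measured load yields a different value at each p-state. */
int busy_scaled_by_pstate(int busy_pct, int cur_pstate, int max_pstate)
{
	return busy_pct * max_pstate / cur_pstate;
}

/* Hypothetical alternative: the target depends only on the measured load,
 * mapped linearly onto the [min_pstate, max_pstate] range. */
int next_pstate_from_busy(int busy_pct, int min_pstate, int max_pstate)
{
	return min_pstate + (max_pstate - min_pstate) * busy_pct / 100;
}
```

For a loop that is 100% busy, the scaled model reports 300 at p-state 8 but 100 at p-state 24 (with a maximum of 24), although the workload is identical; the unscaled mapping returns 24 in both cases.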
Re: [PATCH v2 00/10] arm64: UEFI support
On Tue, Apr 29, 2014 at 03:47:13PM +0100, Matt Fleming wrote:

On Tue, 29 Apr, at 02:47:28PM, Catalin Marinas wrote:

Given that Leif's series contains both generic efi and arm64 patches, what's your preference for merging them? I'm happy to add my ack and they go via your tree (or the other way around).

I'm happy either way, though if I take them through my tree (and subsequently through tip) you won't have to worry about the merge window rigmarole, which is a plus. So, everyone happy for me to take these with Catalin's Acked-by?

Fine by me. Just in case I haven't stated it explicitly for this series:

Acked-by: Catalin Marinas catalin.mari...@arm.com

Thanks.

--
Catalin
Re: mmotm 2014-04-24-13-07 uploaded
On Tue, Apr 29, 2014 at 08:07:24AM -0400, Rik van Riel wrote:

On 04/28/2014 05:15 PM, Paul E. McKenney wrote:

On Mon, Apr 28, 2014 at 01:32:38PM -0700, Randy Dunlap wrote:

On 04/28/14 13:06, Paul E. McKenney wrote:

Please see below for a patch against next-20140428 that makes this build for me. This is derived from Rik's patch, my patch, and is consistent with Arnd's patch.

Thanx, Paul

Thnx, works for me. Finally.

Good! Rik, how would you like to proceed with this?

I guess this fix should go into -mm?

Andrew dropped the original patch, so a consolidated patch is needed. I believe that something like the following is what we ended up with. Does this look right to you?

Thanx, Paul

sysrq,rcu: suppress RCU stall warnings while sysrq runs

Some sysrq handlers can run for a long time, because they dump a lot of data onto a serial console. Having RCU stall warnings pop up in the middle of them only makes the problem worse.

This patch temporarily disables RCU stall warnings while a sysrq request is handled.

Signed-off-by: Rik van Riel r...@redhat.com
[ paulmck: Fix build bugs for obscure config options. ]
Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
Cc: Jiri Kosina jkos...@suse.cz
Cc: Jiri Slaby jsl...@suse.cz
Cc: Joern Engel jo...@logfs.org
Cc: Peter Zijlstra pet...@infradead.org
Cc: Madper Xie c...@redhat.com
Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Signed-off-by: Andrew Morton a...@linux-foundation.org

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index fc67a89..38d5f9a 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -46,6 +46,7 @@
 #include <linux/jiffies.h>
 #include <linux/syscalls.h>
 #include <linux/of.h>
+#include <linux/rcupdate.h>

 #include <asm/ptrace.h>
 #include <asm/irq_regs.h>
@@ -511,6 +512,7 @@ void __handle_sysrq(int key, bool check_mask)
 	int orig_log_level;
 	int i;

+	rcu_sysrq_start();
 	rcu_read_lock();
 	/*
 	 * Raise the apparent loglevel to maximum so that the sysrq header
@@ -554,6 +556,7 @@ void __handle_sysrq(int key, bool check_mask)
 		console_loglevel = orig_log_level;
 	}
 	rcu_read_unlock();
+	rcu_sysrq_end();
 }

 void handle_sysrq(int key)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 00a7fd6..f3a672c 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -227,6 +227,17 @@ void rcu_idle_enter(void);
 void rcu_idle_exit(void);
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
+#ifdef CONFIG_RCU_STALL_COMMON
+void rcu_sysrq_start(void);
+void rcu_sysrq_end(void);
+#else /* #ifdef CONFIG_RCU_STALL_COMMON */
+static inline void rcu_sysrq_start(void)
+{
+}
+static inline void rcu_sysrq_end(void)
+{
+}
+#endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */

 #ifdef CONFIG_RCU_USER_QS
 void rcu_user_enter(void);
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 4c0a9b0..d22309c 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -320,6 +320,18 @@ int rcu_jiffies_till_stall_check(void)
 	return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
 }

+void rcu_sysrq_start(void)
+{
+	if (!rcu_cpu_stall_suppress)
+		rcu_cpu_stall_suppress = 2;
+}
+
+void rcu_sysrq_end(void)
+{
+	if (rcu_cpu_stall_suppress == 2)
+		rcu_cpu_stall_suppress = 0;
+}
+
 static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
 {
 	rcu_cpu_stall_suppress = 1;
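The reason the patch uses the value 2 rather than 1 can be seen in a small user-space model of the flag. This is a sketch, not kernel code: the names stall_suppress, model_sysrq_start() and model_sysrq_end() are hypothetical, and the reading of 1 as "suppressed by someone else" is inferred from rcu_panic() in the quoted diff.

```c
/* Model of the suppress flag: 0 = warnings enabled, 1 = suppressed by
 * someone else (for example the panic notifier), 2 = suppressed only
 * for the duration of a sysrq request. */
int stall_suppress;

/* Take the flag only if nobody else has suppressed warnings already. */
void model_sysrq_start(void)
{
	if (!stall_suppress)
		stall_suppress = 2;
}

/* Release the flag only if this sysrq request was the one that set it. */
void model_sysrq_end(void)
{
	if (stall_suppress == 2)
		stall_suppress = 0;
}
```

Because the end function only clears the value 2, a suppression that was already in force before the sysrq request survives it unchanged.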
Re: [PATCH] ARM: OMAP5: Switch to THUMB mode if needed on secondary CPU
On Apr 29, 2014, at 2:17 AM, Dave Martin dave.mar...@arm.com wrote:

On Mon, Apr 28, 2014 at 06:21:49PM +0100, Joel Fernandes wrote:

On 04/28/2014 12:20 PM, Joel Fernandes wrote:

On 04/28/2014 11:43 AM, Dave Martin wrote:

On Tue, Apr 22, 2014 at 01:31:46PM -0500, Joel Fernandes wrote:

On my DRA7 system, when the kernel is built in THUMB mode, the secondary CPU (Cortex A15) fails to come up causing SMP boot on second CPU to timeout. This seems to be because the CPU is in ARM mode once the ROM hands over control to the kernel. Switch to THUMB mode if required once the kernel is in control of secondary CPU. On OMAP4 on the other hand, it appears to be in THUMB mode on entry so this is not required and SMP boot works as is.

Cc: Santosh Shilimkar santosh.shilim...@ti.com
Cc: Russell King li...@arm.linux.org.uk
Cc: Nishanth Menon n...@ti.com
Cc: Tony Lindgren t...@atomide.com
Signed-off-by: Joel Fernandes jo...@ti.com
---
 arch/arm/mach-omap2/omap-headsmp.S |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mach-omap2/omap-headsmp.S b/arch/arm/mach-omap2/omap-headsmp.S
index 75e9295..1809dce 100644
--- a/arch/arm/mach-omap2/omap-headsmp.S
+++ b/arch/arm/mach-omap2/omap-headsmp.S
@@ -1,7 +1,7 @@
 /*
  * Secondary CPU startup routine source file.
  *
- * Copyright (C) 2009 Texas Instruments, Inc.
+ * Copyright (C) 2014 Texas Instruments, Inc.
  *
  * Author:
  * Santosh Shilimkar santosh.shilim...@ti.com
@@ -28,9 +28,13 @@
  * code. This routine also provides a holding flag into which
  * secondary core is held until we're ready for it to initialise.
  * The primary core will update this flag using a hardware
-+ * register AuxCoreBoot0.
+ * register AuxCoreBoot0.
  */
 ENTRY(omap5_secondary_startup)

Are you sure this problem is not caused by the missing ENDPROC() for omap5_secondary_startup? You have END() instead (which may have been accidental).
Without ENDPROC(), the symbol is not marked as a function and so the Thumb bit won't be set when taking a pointer -- so the kernel is actually telling the firmware to enter in ARM state. Try changing END() to ENDPROC() without this patch, and see if it makes a difference. If it still doesn't work, then the firmware either doesn't support entering in ARM, or is buggy.

Thanks for the suggestion. I'm guessing what you mean is that with ENDPROC, interworking code uses bx instead of bl to set thumb mode. But ROM/firmware doesn't have access to the symbol table, so how would it know the type of the symbol to be ARM or THUMB before it branches?

Sorry, what I meant is, say it's of type function. What tells the firmware to switch to THUMB?

What's typically done is a boot address register is written by the kernel, and the firmware jumps to it after WFE.

Using ENTRY(x) ... ENDPROC(x) causes the symbol seen by the linker for x to have the Thumb bit set if the code is Thumb. This means that any reference the linker fixes up for that symbol will have the Thumb bit set appropriately. This applies to any kind of reference, so code in another file that takes the address of the symbol and then passes that address to the firmware should result in the firmware getting an address with the Thumb bit.

From the firmware's point of view it just gets a raw address, but the Thumb bit will now be set. The firmware still needs to handle this correctly when jumping, but from the look of the code this may already work on omap3/4. It would be interesting to know whether it works on omap5.

Thanks a lot for the explanation. That makes perfect sense. I will try it and let you know if it works on OMAP5.
Regards,
-Joel

Cheers
---Dave

___
linux-arm-kernel mailing list
linux-arm-ker...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
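The "Thumb bit" discussed in this thread is bit 0 of the entry address: an interworking branch such as BX uses it to select the instruction set and fetches from the address with that bit cleared. A minimal sketch of how a raw boot address would be interpreted, with hypothetical helper names and made-up example addresses:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of an interworking branch target selects Thumb state. */
bool entry_is_thumb(uint32_t entry)
{
	return (entry & 1u) != 0;
}

/* The address instructions are fetched from has bit 0 cleared. */
uint32_t entry_fetch_addr(uint32_t entry)
{
	return entry & ~UINT32_C(1);
}
```

So if the kernel writes an address with bit 0 set into the boot address register, firmware that branches with an interworking instruction enters in Thumb state; with bit 0 clear it enters in ARM state. Whether the OMAP5 firmware does this is exactly what the thread leaves open.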
[PATCH 1/3] of: Add vendor prefix for Linear Technology Corporation
Add Linear Technology Corporation to the list of device tree vendor prefixes.

Signed-off-by: Philipp Zabel p.za...@pengutronix.de
---
 Documentation/devicetree/bindings/vendor-prefixes.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 0f01c9b..3d27991 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -65,6 +65,7 @@ lantiq	Lantiq Semiconductor
 lg	LG Corporation
 linux	Linux-specific binding
 lsi	LSI Corp. (LSI Logic)
+ltc	Linear Technology Corporation
 marvell	Marvell Technology Group Ltd.
 maxim	Maxim Integrated Products
 microchip	Microchip Technology Inc.
--
1.9.1
[PATCH 3/3] regulator: Add LTC3589 support
This patch adds support for the Linear Technology LTC3589, LTC3589-1, and LTC3589-2 8-output I2C voltage regulator ICs.

Signed-off-by: Philipp Zabel p.za...@pengutronix.de
---
 drivers/regulator/Kconfig   |   6 +
 drivers/regulator/Makefile  |   1 +
 drivers/regulator/ltc3589.c | 564
 3 files changed, 571 insertions(+)
 create mode 100644 drivers/regulator/ltc3589.c

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 903eb37..5599a61 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -265,6 +265,12 @@ config REGULATOR_LP8788
 	help
 	  This driver supports LP8788 voltage regulator chip.

+config REGULATOR_LTC3589
+	bool "LTC3589 8-output voltage regulator"
+	help
+	  This enables support for the LTC3589, LTC3589-1, and LTC3589-2
+	  8-output regulators controlled via I2C.
+
 config REGULATOR_MAX14577
 	tristate "Maxim 14577 regulator"
 	depends on MFD_MAX14577
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index 12ef277..16d429b 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -37,6 +37,7 @@ obj-$(CONFIG_REGULATOR_LP872X) += lp872x.o
 obj-$(CONFIG_REGULATOR_LP8788) += lp8788-buck.o
 obj-$(CONFIG_REGULATOR_LP8788) += lp8788-ldo.o
 obj-$(CONFIG_REGULATOR_LP8755) += lp8755.o
+obj-$(CONFIG_REGULATOR_LTC3589) += ltc3589.o
 obj-$(CONFIG_REGULATOR_MAX14577) += max14577.o
 obj-$(CONFIG_REGULATOR_MAX1586) += max1586.o
 obj-$(CONFIG_REGULATOR_MAX8649) += max8649.o
diff --git a/drivers/regulator/ltc3589.c b/drivers/regulator/ltc3589.c
new file mode 100644
index 0000000..37e18dc
--- /dev/null
+++ b/drivers/regulator/ltc3589.c
@@ -0,0 +1,564 @@
+/*
+ * Linear Technology LTC3589,LTC3589-1 regulator support
+ *
+ * Copyright (c) 2014 Philipp Zabel p.za...@pengutronix.de, Pengutronix
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+#include <linux/i2c.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/of.h>
+#include <linux/regmap.h>
+#include <linux/regulator/driver.h>
+#include <linux/regulator/of_regulator.h>
+
+#define DRIVER_NAME		"ltc3589"
+
+#define LTC3589_IRQSTAT		0x02
+#define LTC3589_SCR1		0x07
+#define LTC3589_OVEN		0x10
+#define LTC3589_SCR2		0x12
+#define LTC3589_PGSTAT		0x13
+#define LTC3589_VCCR		0x20
+#define LTC3589_CLIRQ		0x21
+#define LTC3589_B1DTV1		0x23
+#define LTC3589_B1DTV2		0x24
+#define LTC3589_VRRCR		0x25
+#define LTC3589_B2DTV1		0x26
+#define LTC3589_B2DTV2		0x27
+#define LTC3589_B3DTV1		0x29
+#define LTC3589_B3DTV2		0x2a
+#define LTC3589_L2DTV1		0x32
+#define LTC3589_L2DTV2		0x33
+
+#define LTC3589_IRQSTAT_PGOOD_TIMEOUT	BIT(3)
+#define LTC3589_IRQSTAT_UNDERVOLT_WARN	BIT(4)
+#define LTC3589_IRQSTAT_UNDERVOLT_FAULT	BIT(5)
+#define LTC3589_IRQSTAT_THERMAL_WARN	BIT(6)
+#define LTC3589_IRQSTAT_THERMAL_FAULT	BIT(7)
+
+#define LTC3589_OVEN_SW1		BIT(0)
+#define LTC3589_OVEN_SW2		BIT(1)
+#define LTC3589_OVEN_SW3		BIT(2)
+#define LTC3589_OVEN_BB_OUT		BIT(3)
+#define LTC3589_OVEN_LDO2		BIT(4)
+#define LTC3589_OVEN_LDO3		BIT(5)
+#define LTC3589_OVEN_LDO4		BIT(6)
+#define LTC3589_OVEN_SW_CTRL		BIT(7)
+
+#define LTC3589_VCCR_SW1_GO		BIT(0)
+#define LTC3589_VCCR_SW2_GO		BIT(2)
+#define LTC3589_VCCR_SW3_GO		BIT(4)
+#define LTC3589_VCCR_LDO2_GO		BIT(6)
+
+enum ltc3589_variant {
+	LTC3589,
+	LTC3589_1,
+	LTC3589_2,
+};
+
+enum ltc3589_reg {
+	LTC3589_SW1,
+	LTC3589_SW2,
+	LTC3589_SW3,
+	LTC3589_BB_OUT,
+	LTC3589_LDO1,
+	LTC3589_LDO2,
+	LTC3589_LDO3,
+	LTC3589_LDO4,
+	LTC3589_NUM_REGULATORS,
+};
+
+struct ltc3589_regulator {
+	struct regulator_desc desc;
+
+	/* External feedback voltage divider */
+	unsigned int r1;
+	unsigned int r2;
+};
+
+struct ltc3589 {
+	struct regmap *regmap;
+	struct device *dev;
+	enum ltc3589_variant variant;
+	struct ltc3589_regulator regulator_descs[LTC3589_NUM_REGULATORS];
+	struct regulator_dev *regulators[LTC3589_NUM_REGULATORS];
+};
+
+static const int ltc3589_ldo4[] = {
+	2800000, 2500000, 1800000, 3300000,
+};
+
+static const int
[PATCH 2/3] regulator: ltc3589: Add DT binding documentation
This patch adds the device tree binding documentation for Linear Technology LTC3589, LTC3589-1, and LTC3589-2 8-port regulators.

Signed-off-by: Philipp Zabel p.za...@pengutronix.de
---
 .../devicetree/bindings/regulator/ltc3589.txt | 99 ++
 1 file changed, 99 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/regulator/ltc3589.txt

diff --git a/Documentation/devicetree/bindings/regulator/ltc3589.txt b/Documentation/devicetree/bindings/regulator/ltc3589.txt
new file mode 100644
index 0000000..4229a2e
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/ltc3589.txt
@@ -0,0 +1,99 @@
+Linear Technology LTC3589, LTC3589-1, and LTC3589-2 8-output regulators
+
+Required properties:
+- compatible: "ltc,ltc3589", "ltc,ltc3589-1" or "ltc,ltc3589-2"
+- reg: I2C slave address
+
+Required child node:
+- regulators: Contains eight regulator child nodes sw1, sw2, sw3, bb-out,
+  ldo1, ldo2, ldo3, and ldo4, specifying the initialization data as
+  documented in Documentation/devicetree/bindings/regulator/regulator.txt.
+
+Each regulator is defined using the standard binding for regulators. The
+nodes for sw1, sw2, sw3, bb-out, ldo1, and ldo2 additionally need to specify
+the resistor values of their external feedback voltage dividers:
+
+Required properties (not on ldo3, ldo4):
+- ltc,fb-voltage-divider: An array of two integers containing the resistor
+  values R1 and R2 of the feedback voltage divider in ohms.
+
+Regulators sw1, sw2, sw3, and ldo2 can regulate the feedback reference from
+0.3625 V to 0.75 V in 12.5 mV steps. The output voltage thus ranges between
+0.3625 * (1 + R1/R2) V and 0.75 * (1 + R1/R2) V. Regulators bb-out and ldo1
+have a fixed 0.8 V reference and thus output 0.8 * (1 + R1/R2) V. The ldo3
+regulator is fixed to 1.8 V on LTC3589 and to 2.8 V on LTC3589-1,2. The ldo4
+regulator can output between 1.8 V and 3.3 V on LTC3589 and between 1.2 V
+and 3.2 V on LTC3589-1,2 in four steps. The ldo1 standby regulator cannot
+be disabled and thus should have the regulator-always-on property set.
+
+Example:
+
+	ltc3589: pmic@34 {
+		compatible = "ltc,ltc3589-1";
+		reg = <0x34>;
+
+		regulators {
+			sw1_reg: sw1 {
+				regulator-min-microvolt = <591930>;
+				regulator-max-microvolt = <1224671>;
+				ltc,fb-voltage-divider = <100000 158000>;
+				regulator-ramp-delay = <7000>;
+				regulator-boot-on;
+				regulator-always-on;
+			};
+
+			sw2_reg: sw2 {
+				regulator-min-microvolt = <704123>;
+				regulator-max-microvolt = <1456803>;
+				ltc,fb-voltage-divider = <180000 191000>;
+				regulator-ramp-delay = <7000>;
+				regulator-boot-on;
+				regulator-always-on;
+			};
+
+			sw3_reg: sw3 {
+				regulator-min-microvolt = <1341250>;
+				regulator-max-microvolt = <2775000>;
+				ltc,fb-voltage-divider = <270000 100000>;
+				regulator-ramp-delay = <7000>;
+				regulator-boot-on;
+				regulator-always-on;
+			};
+
+			bb_out_reg: bb-out {
+				regulator-min-microvolt = <3387341>;
+				regulator-max-microvolt = <3387341>;
+				ltc,fb-voltage-divider = <511000 158000>;
+				regulator-boot-on;
+				regulator-always-on;
+			};
+
+			ldo1_reg: ldo1 {
+				regulator-min-microvolt = <1306329>;
+				regulator-max-microvolt = <1306329>;
+				ltc,fb-voltage-divider = <100000 158000>;
+				regulator-boot-on;
+				regulator-always-on;
+			};
+
+			ldo2_reg: ldo2 {
+				regulator-min-microvolt = <704123>;
+				regulator-max-microvolt = <1456806>;
+				ltc,fb-voltage-divider = <180000 191000>;
+				regulator-ramp-delay = <7000>;
+				regulator-boot-on;
+				regulator-always-on;
+			};
+
+			ldo3_reg: ldo3 {
+				regulator-min-microvolt = <2800000>;
+				regulator-max-microvolt = <2800000>;
+				regulator-boot-on;
+			};
+
+
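The output voltage relation in the binding text, Vout = Vref * (1 + R1/R2), can be checked against the example values with integer arithmetic in microvolts. This is a sketch for checking the numbers, not code from the driver; the function name fb_divider_vout_uv() is hypothetical, and the sw1 divider is assumed to be 100000 and 158000 ohms, which is the pair that reproduces the 591930 microvolt minimum in the example.

```c
#include <stdint.h>

/* Hypothetical helper: output voltage in microvolts for a feedback
 * reference vref_uv and divider resistors r1, r2 in ohms, computed as
 * vref * (r1 + r2) / r2 and truncated to an integer. */
uint32_t fb_divider_vout_uv(uint32_t vref_uv, uint32_t r1, uint32_t r2)
{
	if (r2 == 0)
		return 0;	/* no valid divider, avoid dividing by zero */
	return (uint32_t)((uint64_t)vref_uv * (r1 + r2) / r2);
}
```

With the 0.3625 V reference and the assumed sw1 divider this gives 591930; with the fixed 0.8 V reference and the bb-out divider of 511000 and 158000 ohms it gives 3387341, matching the bb-out node in the example.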
Re: [PATCH v2 00/10] arm64: UEFI support
On Tue, Apr 29, 2014 at 04:27:07PM +0100, Matt Fleming wrote:

I'm wondering if it would be better to organize it into a separate topic branch. We can still take it through tip, if you want, but it would be better than putting it all into one tree.

Sure, that makes sense. I'll do that.

I've set up a new topic branch, uefi-for-3.16 on
git://git.linaro.org/people/leif.lindholm/linux.git

Based off tip/x86/efi.

/
    Leif