[PATCH 0/2] Tegra124 clock fixes
A few fixes for Tegra124 clocks. These patches will be squashed into the pull request I will send out later today, but I wanted to post them here as well so the changes are known.

Peter De Schrijver (2):
  clk: tegra: fix vi clk for Tegra124
  clk: tegra: fix pllcx pdiv for Tegra124

 drivers/clk/tegra/clk-id.h           |    1 +
 drivers/clk/tegra/clk-tegra-periph.c |    8 ++++++++
 drivers/clk/tegra/clk-tegra124.c     |    5 ++++-
 3 files changed, 13 insertions(+), 1 deletions(-)

-- 
1.7.7.rc0.72.g4b5ea.dirty

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 31/78] Staging: bcm: info leak in ioctl
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Dan Carpenter

commit 8d1e72250c847fa96498ec029891de4dc638a5ba upstream.

The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel
information to user space.

Reported-by: Nico Golde
Reported-by: Fabian Yamaguchi
Signed-off-by: Dan Carpenter
Signed-off-by: Linus Torvalds
Signed-off-by: Luis Henriques
---
 drivers/staging/bcm/Bcmchar.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/bcm/Bcmchar.c b/drivers/staging/bcm/Bcmchar.c
index cf30592..c0d612f 100644
--- a/drivers/staging/bcm/Bcmchar.c
+++ b/drivers/staging/bcm/Bcmchar.c
@@ -1957,6 +1957,7 @@ cntrlEnd:
 		BCM_DEBUG_PRINT(Adapter, DBG_TYPE_OTHERS, OSAL_DBG, DBG_LVL_ALL, "Called IOCTL_BCM_GET_DEVICE_DRIVER_INFO\n");
+		memset(&DevInfo, 0, sizeof(DevInfo));
 		DevInfo.MaxRDMBufferSize = BUFFER_4K;
 		DevInfo.u32DSDStartOffset = EEPROM_CALPARAM_START;
 		DevInfo.u32RxAlignmentCorrection = 0;
-- 
1.8.3.2
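The pattern behind this fix can be sketched in user space: clear the whole structure before assigning individual fields, so that reserved members never carry stale stack contents out to the caller. The struct layout, the names `dev_info` and `fill_dev_info`, and the constants below are hypothetical and only loosely mirror the driver's DevInfo.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's device-info structure. */
struct dev_info {
	uint32_t max_rdm_buffer_size;
	uint32_t dsd_start_offset;
	uint32_t rx_alignment_correction;
	uint32_t reserved[4];	/* never assigned field by field */
};

/* Clear everything first, then set only the fields that carry data. */
static void fill_dev_info(struct dev_info *info)
{
	memset(info, 0, sizeof(*info));
	info->max_rdm_buffer_size = 4096;	/* placeholder value */
	info->dsd_start_offset = 0x200;		/* placeholder value */
	info->rx_alignment_correction = 0;
}
```

Without the memset(), reserved[] would keep whatever bytes happened to be in that memory before the call.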
[PATCH 3.5 41/78] usb: Disable USB 2.0 Link PM before device reset.
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Sarah Sharp

commit dcc01c0864823f91c3bf3ffca6613e2351702b87 upstream.

Before the USB core resets a device, we need to disable the L1 timeout
for the roothub, if USB 2.0 Link PM is enabled. Otherwise the port may
transition into L1 in between descriptor fetches, before we know if the
USB device descriptors changed. LPM will be re-enabled after the
full device descriptors are fetched, and we can confirm the device still
supports USB 2.0 LPM after the reset.

We don't need to wait for the USB device to exit L1 before resetting the
device, since the xHCI roothub port diagrams show a transition to the
Reset state from any of the Ux states (see Figure 34 in the 2012-08-14
xHCI specification update).

This patch should be backported to kernels as old as 3.2, that contain
the commit 65580b4321eb36f16ae8b5987bfa1bb948fc5112 "xHCI: set USB2
hardware LPM". That was the first commit to enable USB 2.0
hardware-driven Link Power Management.

Signed-off-by: Sarah Sharp
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques
---
 drivers/usb/core/hub.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 7be4e11..86c7421 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -4824,6 +4824,12 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
 	}
 	parent_hub = hdev_to_hub(parent_hdev);
 
+	/* Disable USB2 hardware LPM.
+	 * It will be re-enabled by the enumeration process.
+	 */
+	if (udev->usb2_hw_lpm_enabled == 1)
+		usb_set_usb2_hardware_lpm(udev, 0);
+
 	/* Disable LPM while we reset the device and reinstall the alt settings.
 	 * Device-initiated LPM settings, and system exit latency settings are
 	 * cleared when the device is reset, so we have to set them up again.
-- 
1.8.3.2
[PATCH 2/2] clk: tegra: fix pllcx pdiv for Tegra124
The post divider field for pllcx on Tegra124 allows more values than the
one on Tegra114. Fix the code to reflect this.

Signed-off-by: Peter De Schrijver
---
 drivers/clk/tegra/clk-tegra124.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/clk/tegra/clk-tegra124.c b/drivers/clk/tegra/clk-tegra124.c
index 54af043..863c38b 100644
--- a/drivers/clk/tegra/clk-tegra124.c
+++ b/drivers/clk/tegra/clk-tegra124.c
@@ -263,8 +263,11 @@ static struct div_nmp pllcx_nmp = {
 static struct pdiv_map pllc_p[] = {
 	{ .pdiv = 1, .hw_val = 0 },
 	{ .pdiv = 2, .hw_val = 1 },
+	{ .pdiv = 3, .hw_val = 2 },
 	{ .pdiv = 4, .hw_val = 3 },
+	{ .pdiv = 6, .hw_val = 4 },
 	{ .pdiv = 8, .hw_val = 5 },
+	{ .pdiv = 12, .hw_val = 6 },
 	{ .pdiv = 16, .hw_val = 7 },
 	{ .pdiv = 0, .hw_val = 0 },
 };
-- 
1.7.7.rc0.72.g4b5ea.dirty
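To illustrate why the missing entries matter, here is a minimal stand-alone sketch of how such a table might be searched. The table contents are copied from the patch; the struct definition and the helper `pdiv_to_hw` are assumptions made for this example, not the clock driver's actual lookup code.

```c
/* Assumed layout; the driver's own definition may differ. */
struct pdiv_map {
	unsigned int pdiv;	/* post divider requested by the caller */
	unsigned int hw_val;	/* encoding written to the register field */
};

/* Table as it looks after the patch; { 0, 0 } terminates the list. */
static const struct pdiv_map pllc_p[] = {
	{ .pdiv = 1, .hw_val = 0 },
	{ .pdiv = 2, .hw_val = 1 },
	{ .pdiv = 3, .hw_val = 2 },
	{ .pdiv = 4, .hw_val = 3 },
	{ .pdiv = 6, .hw_val = 4 },
	{ .pdiv = 8, .hw_val = 5 },
	{ .pdiv = 12, .hw_val = 6 },
	{ .pdiv = 16, .hw_val = 7 },
	{ .pdiv = 0, .hw_val = 0 },
};

/* Hypothetical helper: hardware encoding for pdiv, or -1 if not allowed. */
static int pdiv_to_hw(const struct pdiv_map *map, unsigned int pdiv)
{
	for (; map->pdiv != 0; map++)
		if (map->pdiv == pdiv)
			return (int)map->hw_val;
	return -1;
}
```

With the pre-patch table, a request for a divider of 3, 6 or 12 would fall through to the terminator and be rejected.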
[PATCH 3.5 40/78] USB: mos7840: fix tiocmget error handling
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Johan Hovold

commit a91ccd26e75235d86248d018fe3779732bcafd8d upstream.

Make sure to return errors from tiocmget rather than rely on
uninitialised stack data.

Signed-off-by: Johan Hovold
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Luis Henriques
---
 drivers/usb/serial/mos7840.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/serial/mos7840.c b/drivers/usb/serial/mos7840.c
index d9368be..08aad01 100644
--- a/drivers/usb/serial/mos7840.c
+++ b/drivers/usb/serial/mos7840.c
@@ -1707,7 +1707,11 @@ static int mos7840_tiocmget(struct tty_struct *tty)
 		return -ENODEV;
 
 	status = mos7840_get_uart_reg(port, MODEM_STATUS_REGISTER, &msr);
+	if (status != 1)
+		return -EIO;
 	status = mos7840_get_uart_reg(port, MODEM_CONTROL_REGISTER, &mcr);
+	if (status != 1)
+		return -EIO;
 	result = ((mcr & MCR_DTR) ? TIOCM_DTR : 0)
 	    | ((mcr & MCR_RTS) ? TIOCM_RTS : 0)
 	    | ((mcr & MCR_LOOPBACK) ? TIOCM_LOOP : 0)
-- 
1.8.3.2
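The same idea in a stand-alone form: check the status of every register read before using the value it was supposed to fill in. All names below (`read_reg_fn`, `get_modem_bits`, `fake_ok`, `fake_fail`) are hypothetical; only the convention that a successful read returns 1 is taken from the patch.

```c
#include <errno.h>

/* Hypothetical register reader: returns 1 on success, as in the patch. */
typedef int (*read_reg_fn)(int reg, unsigned char *val);

/* Read both registers; on any failure return -EIO without touching results. */
static int get_modem_bits(read_reg_fn read_reg,
			  unsigned char *msr, unsigned char *mcr)
{
	unsigned char s, c;

	if (read_reg(0, &s) != 1)
		return -EIO;
	if (read_reg(1, &c) != 1)
		return -EIO;
	*msr = s;
	*mcr = c;
	return 0;
}

/* Fake readers used only to exercise the sketch. */
static int fake_ok(int reg, unsigned char *val)
{
	*val = (unsigned char)(0x10 + reg);
	return 1;
}

static int fake_fail(int reg, unsigned char *val)
{
	(void)reg;
	(void)val;	/* deliberately left unwritten, like a failed transfer */
	return 0;
}
```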
Re: [PATCH v2 0/6] Update Davinci watchdog driver
On 11/25/2013 03:06 PM, Sekhar Nori wrote:
> On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
>> These patches are intended to update Davinci watchdog to use WDT core
>> and reuse driver for keystone arch, because Keystone uses the similar
>> IP like Davinci.
>
> This series causes a regression on all DaVinci platforms because after
> the series is applied the watchdog device does not get registered at
> all. Since you changed the device name please include a patch to fix the
> platform references too.
>
> Thanks,
> Sekhar

Ok, I will replace

-- 
Regards,
Ivan Khoronzhuk
[PATCH 3.5 43/78] rt2400pci: fix RSSI read
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Stanislaw Gruszka

commit 2bf127a5cc372b9319afcbae10b090663b621c8b upstream.

RSSI value is provided on word3 not on word2.

Signed-off-by: Stanislaw Gruszka
Signed-off-by: John W. Linville
Signed-off-by: Luis Henriques
---
 drivers/net/wireless/rt2x00/rt2400pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/rt2x00/rt2400pci.c b/drivers/net/wireless/rt2x00/rt2400pci.c
index d8594a2..dd2160c 100644
--- a/drivers/net/wireless/rt2x00/rt2400pci.c
+++ b/drivers/net/wireless/rt2x00/rt2400pci.c
@@ -1253,7 +1253,7 @@ static void rt2400pci_fill_rxdone(struct queue_entry *entry,
 	 */
 	rxdesc->timestamp = ((u64)rx_high << 32) | rx_low;
 	rxdesc->signal = rt2x00_get_field32(word2, RXD_W2_SIGNAL) & ~0x08;
-	rxdesc->rssi = rt2x00_get_field32(word2, RXD_W3_RSSI) -
+	rxdesc->rssi = rt2x00_get_field32(word3, RXD_W3_RSSI) -
 	    entry->queue->rt2x00dev->rssi_offset;
 	rxdesc->size = rt2x00_get_field32(word0, RXD_W0_DATABYTE_COUNT);
-- 
1.8.3.2
[PATCH 3.5 22/78] perf: Fix perf ring buffer memory ordering
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Peter Zijlstra

commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky
Tested-by: Victor Kaplansky
Signed-off-by: Peter Zijlstra
Cc: Mathieu Desnoyers
Cc: mich...@ellerman.id.au
Cc: Paul McKenney
Cc: Michael Neuling
Cc: Frederic Weisbecker
Cc: an...@samba.org
Cc: b...@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.gg19...@laptop.lan
Signed-off-by: Ingo Molnar
[ luis: backported to 3.5:
  - file rename: include/uapi/linux/perf_event.h -> include/linux/perf_event.h ]
Signed-off-by: Luis Henriques
---
 include/linux/perf_event.h  | 12 +++++++-----
 kernel/events/ring_buffer.c | 31 +++++++++++++++++++++++++++----
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3faf0d4..7e72637 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -393,13 +393,15 @@ struct perf_event_mmap_page {
 	/*
 	 * Control data for the mmap() data buffer.
 	 *
-	 * User-space reading the @data_head value should issue an rmb(), on
-	 * SMP capable platforms, after reading this value -- see
-	 * perf_event_wakeup().
+	 * User-space reading the @data_head value should issue an smp_rmb(),
+	 * after reading this value.
 	 *
 	 * When the mapping is PROT_WRITE the @data_tail value should be
-	 * written by userspace to reflect the last read data. In this case
-	 * the kernel will not over-write unread data.
+	 * written by userspace to reflect the last read data, after issueing
+	 * an smp_mb() to separate the data read from the ->data_tail store.
+	 * In this case the kernel will not over-write unread data.
+	 *
+	 * See perf_output_put_handle() for the data ordering.
 	 */
 	__u64	data_head;		/* head in the data section */
 	__u64	data_tail;		/* user-space written tail */
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 6ddaba4..4636ecc 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -75,10 +75,31 @@ again:
 		goto out;
 
 	/*
-	 * Publish the known good head. Rely on the full barrier implied
-	 * by atomic_dec_and_test() order the rb->head read and this
-	 * write.
+	 * Since the mmap() consumer (userspace) can run on a different CPU:
+	 *
+	 *   kernel				user
+	 *
+	 *   READ ->data_tail			READ ->data_head
+	 *   smp_mb()	(A)			smp_rmb()	(C)
+	 *   WRITE $data			READ $data
+	 *   smp_wmb()	(B)			smp_mb()	(D)
+	 *   STORE ->data_head			WRITE ->data_tail
+	 *
+	 * Where A pairs with D, and B pairs with C.
+	 *
+	 * I don't think A needs to be a full barrier because we won't in fact
+	 * write data until we see the store from userspace. So we simply don't
+	 * issue the data WRITE until we observe it. Be conservative for now.
+	 *
+	 * OTOH, D needs to be a full barrier since it separates the data READ
+	 * from the tail WRITE.
+	 *
+	 * For B a WMB is sufficient since it separates two WRITEs, and for C
+	 * an RMB is sufficient since it separates two READs.
+	 *
+	 * See perf_output_begin().
 	 */
+	smp_wmb();
 	rb->user_page->data_head = head;
 
 	/*
@@ -142,9 +163,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 		 * Userspace could choose to issue a mb() before updating the
 		 * tail pointer. So that all reads will be completed before the
 		 * write is issued.
+		 *
+		 * See perf_output_put_handle().
 		 */
 		tail = ACCESS_ONCE(rb->user_page->data_tail);
-		smp_rmb();
+		smp_mb();
 		offset = head = local_read(&rb->head);
 		head += size;
 		if (unlikely(!perf_output_space(rb, tail, offset, head)))
-- 
1.8.3.2
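The user-space half of the ordering described in the comment (barriers C and D) can be sketched with C11 atomics. This is only an approximation under stated assumptions: `struct ring` and `ring_consume` are hypothetical, the buffer is a toy 64-byte array rather than the real mmap()ed perf pages, and the C11 fences stand in for the kernel's smp_rmb() and smp_mb().

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the mmap()ed control page. */
struct ring {
	_Atomic uint64_t data_head;	/* written by the producer (kernel) */
	_Atomic uint64_t data_tail;	/* written by the consumer (user) */
	unsigned char data[64];		/* size is a power of two */
};

/*
 * Copy up to len unread bytes into out and publish the new tail.
 * Returns the number of bytes consumed.
 */
static size_t ring_consume(struct ring *r, unsigned char *out, size_t len)
{
	uint64_t tail = atomic_load_explicit(&r->data_tail,
					     memory_order_relaxed);
	uint64_t head = atomic_load_explicit(&r->data_head,
					     memory_order_relaxed);
	size_t n = 0;

	/* (C): keep the data reads after the READ of ->data_head. */
	atomic_thread_fence(memory_order_acquire);

	while (tail != head && n < len) {
		out[n++] = r->data[tail % sizeof(r->data)];
		tail++;
	}

	/* (D): full fence between the data READs and the tail WRITE. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&r->data_tail, tail, memory_order_relaxed);
	return n;
}
```

The fence at (D) is what lets the producer trust that everything below the published tail has really been read before it overwrites that space.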
[PATCH 3.5 44/78] rt2x00: check if device is still available on rt2x00mac_flush()
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Stanislaw Gruszka commit 5671ab05cf2a579218985ef56595387932d78ee4 upstream. Fix random kernel panic with below messages when remove dongle. [ 2212.355447] BUG: unable to handle kernel NULL pointer dereference at 0250 [ 2212.355527] IP: [] rt2x00usb_kick_tx_entry+0x12/0x160 [rt2x00usb] [ 2212.355599] PGD 0 [ 2212.355626] Oops: [#1] SMP [ 2212.355664] Modules linked in: rt2800usb rt2x00usb rt2800lib crc_ccitt rt2x00lib mac80211 cfg80211 tun arc4 fuse rfcomm bnep snd_hda_codec_realtek snd_hda_intel snd_hda_codec btusb uvcvideo bluetooth snd_hwdep x86_pkg_temp_thermal snd_seq coretemp aesni_intel aes_x86_64 snd_seq_device glue_helper snd_pcm ablk_helper videobuf2_vmalloc sdhci_pci videobuf2_memops videobuf2_core sdhci videodev mmc_core serio_raw snd_page_alloc microcode i2c_i801 snd_timer hid_multitouch thinkpad_acpi lpc_ich mfd_core snd tpm_tis wmi tpm tpm_bios soundcore acpi_cpufreq i915 i2c_algo_bit drm_kms_helper drm i2c_core video [last unloaded: cfg80211] [ 2212.356224] CPU: 0 PID: 34 Comm: khubd Not tainted 3.12.0-rc3-wl+ #3 [ 2212.356268] Hardware name: LENOVO 3444CUU/3444CUU, BIOS G6ET93WW (2.53 ) 02/04/2013 [ 2212.356319] task: 880212f687c0 ti: 880212f66000 task.ti: 880212f66000 [ 2212.356392] RIP: 0010:[] [] rt2x00usb_kick_tx_entry+0x12/0x160 [rt2x00usb] [ 2212.356481] RSP: 0018:880212f67750 EFLAGS: 00010202 [ 2212.356519] RAX: 000c RBX: 000c RCX: 0293 [ 2212.356568] RDX: 8801f4dc219a RSI: RDI: 0240 [ 2212.356617] RBP: 880212f67778 R08: a02667e0 R09: 0002 [ 2212.356665] R10: 0001f95254ab4b40 R11: 880212f675be R12: 8801f4dc2150 [ 2212.356712] R13: R14: a02667e0 R15: 000d [ 2212.356761] FS: () GS:88021e20() knlGS: [ 2212.356813] CS: 0010 DS: ES: CR0: 80050033 [ 2212.356852] CR2: 0250 CR3: 01a0c000 CR4: 001407f0 [ 2212.356899] Stack: [ 2212.356917] 000c 8801f4dc2150 a02667e0 [ 2212.356980] 000d 880212f677b8 a03a31ad 8801f4dc219a [ 2212.357038] 8801f4dc2150 8800b93217a0 
8801f49bc800 [ 2212.357099] Call Trace: [ 2212.357122] [] ? rt2x00usb_interrupt_txdone+0x90/0x90 [rt2x00usb] [ 2212.357174] [] rt2x00queue_for_each_entry+0xed/0x170 [rt2x00lib] [ 2212.357244] [] rt2x00usb_kick_queue+0x5c/0x60 [rt2x00usb] [ 2212.357314] [] rt2x00queue_flush_queue+0x62/0xa0 [rt2x00lib] [ 2212.357386] [] rt2x00mac_flush+0x30/0x70 [rt2x00lib] [ 2212.357470] [] ieee80211_flush_queues+0xbd/0x140 [mac80211] [ 2212.357555] [] ieee80211_set_disassoc+0x2d2/0x3d0 [mac80211] [ 2212.357645] [] ieee80211_mgd_deauth+0x1d3/0x240 [mac80211] [ 2212.357718] [] ? try_to_wake_up+0xec/0x290 [ 2212.357788] [] ieee80211_deauth+0x18/0x20 [mac80211] [ 2212.357872] [] cfg80211_mlme_deauth+0x9c/0x140 [cfg80211] [ 2212.357913] [] cfg80211_mlme_down+0x5c/0x60 [cfg80211] [ 2212.357962] [] cfg80211_disconnect+0x188/0x1a0 [cfg80211] [ 2212.358014] [] ? __cfg80211_stop_sched_scan+0x1c/0x130 [cfg80211] [ 2212.358067] [] cfg80211_leave+0xc4/0xe0 [cfg80211] [ 2212.358124] [] cfg80211_netdev_notifier_call+0x3ab/0x5e0 [cfg80211] [ 2212.358177] [] ? inetdev_event+0x38/0x510 [ 2212.358217] [] ? 
__wake_up+0x44/0x50 [ 2212.358254] [] notifier_call_chain+0x4c/0x70 [ 2212.358293] [] raw_notifier_call_chain+0x16/0x20 [ 2212.358361] [] call_netdevice_notifiers_info+0x35/0x60 [ 2212.358429] [] __dev_close_many+0x49/0xd0 [ 2212.358487] [] dev_close_many+0x88/0x100 [ 2212.358546] [] rollback_registered_many+0xb0/0x220 [ 2212.358612] [] unregister_netdevice_many+0x19/0x60 [ 2212.358694] [] ieee80211_remove_interfaces+0x112/0x190 [mac80211] [ 2212.358791] [] ieee80211_unregister_hw+0x4f/0x100 [mac80211] [ 2212.361994] [] rt2x00lib_remove_dev+0x161/0x1a0 [rt2x00lib] [ 2212.365240] [] rt2x00usb_disconnect+0x2e/0x70 [rt2x00usb] [ 2212.368470] [] usb_unbind_interface+0x64/0x1c0 [ 2212.371734] [] __device_release_driver+0x7f/0xf0 [ 2212.374999] [] device_release_driver+0x23/0x30 [ 2212.378131] [] bus_remove_device+0x108/0x180 [ 2212.381358] [] device_del+0x135/0x1d0 [ 2212.384454] [] usb_disable_device+0xb0/0x270 [ 2212.387451] [] usb_disconnect+0xad/0x1d0 [ 2212.390294] [] hub_thread+0x63d/0x1660 [ 2212.393034] [] ? wake_up_atomic_t+0x30/0x30 [ 2212.395728] [] ? hub_port_debounce+0x130/0x130 [ 2212.398412] [] kthread+0xc0/0xd0 [ 2212.401058] [] ? insert_kthread_work+0x40/0x40 [ 2212.403639] [] ret_from_fork+0x7c/0xb0 [ 2212.406193] [] ? insert_kthread_work+0x40/0x40 [ 2212.408732] Code: 24 58 08 00 00 bf 80 00 00 00 e8 3a c3 e0 e0 5b 41 5c 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 <48> 8b 47 10 48
Re: [PATCH v2 0/6] Update Davinci watchdog driver
On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
> These patches are intended to update Davinci watchdog to use WDT core
> and reuse driver for keystone arch, because Keystone uses the similar
> IP like Davinci.

This series causes a regression on all DaVinci platforms because after
the series is applied the watchdog device does not get registered at
all. Since you changed the device name please include a patch to fix the
platform references too.

Thanks,
Sekhar
[PATCH 3.5 48/78] USB:add new zte 3g-dongle's pid to option.c
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Rui li

commit 0636fc507a976cdc40f21bdbcce6f0b98ff1dfe9 upstream.

Signed-off-by: Rui li
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Luis Henriques
---
 drivers/usb/serial/option.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index c4b313f..dbc6919 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1391,6 +1391,23 @@ static const struct usb_device_id option_ids[] = {
 		.driver_info = (kernel_ulong_t)&net_intf2_blacklist },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1426, 0xff, 0xff, 0xff), /* ZTE MF91 */
 		.driver_info = (kernel_ulong_t)&net_intf2_blacklist },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1533, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1534, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1535, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1545, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1546, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1547, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1565, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1566, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1567, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1589, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1590, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1591, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1592, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1594, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1596, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1598, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1600, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x2002, 0xff, 0xff, 0xff),
 	  .driver_info = (kernel_ulong_t)&zte_k3765_z_blacklist },
 	{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x2003, 0xff, 0xff, 0xff) },
-- 
1.8.3.2
[PATCH 3.5 52/78] ALSA: 6fire: Fix probe of multiple cards
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Takashi Iwai

commit 9b389a8a022110b4bc055a19b888283544d9eba6 upstream.

The probe code of snd-usb-6fire driver overrides the devices[] pointer
wrongly without checking whether it's already occupied or not. This
would screw up the device disconnection later.

Spotted by coverity CID 141423.

Signed-off-by: Takashi Iwai
Signed-off-by: Luis Henriques
---
 sound/usb/6fire/chip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/usb/6fire/chip.c b/sound/usb/6fire/chip.c
index fc8cc82..f803348 100644
--- a/sound/usb/6fire/chip.c
+++ b/sound/usb/6fire/chip.c
@@ -101,7 +101,7 @@ static int __devinit usb6fire_chip_probe(struct usb_interface *intf,
 			usb_set_intfdata(intf, chips[i]);
 			mutex_unlock(&register_mutex);
 			return 0;
-		} else if (regidx < 0)
+		} else if (!devices[i] && regidx < 0)
 			regidx = i;
 	}
 	if (regidx < 0) {
-- 
1.8.3.2
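The slot selection that the one-line fix corrects can be sketched on its own. `MAX_CARDS` and `pick_free_slot` are hypothetical names for this example; the loop keeps only the part of the probe logic that searches for a free index.

```c
#include <stddef.h>

#define MAX_CARDS 4	/* hypothetical table size */

/*
 * Return the first index whose slot is still empty, or -1 when every
 * slot is taken. Without the !devices[i] test the loop would always
 * settle on index 0, even if a card is already registered there.
 */
static int pick_free_slot(void *const devices[MAX_CARDS])
{
	int regidx = -1;

	for (int i = 0; i < MAX_CARDS; i++) {
		if (!devices[i] && regidx < 0)
			regidx = i;
	}
	return regidx;
}
```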
[PATCH 3.5 33/78] lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: Ming Lei

commit 3d77b50c5874b7e923be946ba793644f82336b75 upstream.

Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper
functions") introduces two sg buffer copy helpers, and calls
flush_kernel_dcache_page() on pages in SG list after these pages are
written to.

Unfortunately, the commit may introduce a potential bug:

 - Before sending some SCSI commands, kmalloc() buffer may be passed to
   block layer, so flush_kernel_dcache_page() can see a slab page
   finally

 - According to cachetlb.txt, flush_kernel_dcache_page() is only called
   on "a user page", which surely can't be a slab page.

 - ARCH's implementation of flush_kernel_dcache_page() may use page
   mapping information to do optimization so page_mapping() will see the
   slab page, then VM_BUG_ON() is triggered.

Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
before calling flush_kernel_dcache_page().

Signed-off-by: Ming Lei
Reported-by: Aaro Koskinen
Tested-by: Simon Baatz
Cc: Russell King - ARM Linux
Cc: Will Deacon
Cc: Aaro Koskinen
Acked-by: Catalin Marinas
Cc: FUJITA Tomonori
Cc: Tejun Heo
Cc: "James E.J. Bottomley"
Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Luis Henriques
---
 lib/scatterlist.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 6096e89..8c2f278 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -419,7 +419,8 @@ void sg_miter_stop(struct sg_mapping_iter *miter)
 	if (miter->addr) {
 		miter->__offset += miter->consumed;
 
-		if (miter->__flags & SG_MITER_TO_SG)
+		if ((miter->__flags & SG_MITER_TO_SG) &&
+		    !PageSlab(miter->page))
 			flush_kernel_dcache_page(miter->page);
 
 		if (miter->__flags & SG_MITER_ATOMIC) {
-- 
1.8.3.2
Re: [i915] BUG: Bad page state in process Xorg
Hi,

It turns out that this seems to be a bug in udl DRM driver.

I bisected the problem to this patch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/udl?id=5dc9e1e87229cb786a5bb58ddd0d60fee6eb4641

With kind regards
Thomas

On 22.11.2013 17:18, Daniel Vetter wrote:
>
> On Fri, Nov 22, 2013 at 4:54 PM, Thomas Meyer wrote:
> >> On 22.11.2013 at 11:55, Daniel Vetter wrote:
> >>
> >> On Fri, Nov 22, 2013 at 11:36 AM, Dave Airlie wrote:
> >>>> Hi,
> >>>
> >>> cc'ing mailing list,
> >>>
> >>> Daniel any ideas?
> >>
> >> Nope, not really :( And no ideas how to triage this further - if it
> >> takes 9 days to hit it eventually we'll have a real hard time. Or does
> >> this happen even after just a short X run?
> >
> > Seems to happen every time while stopping the x server. Also after a short
> > run time.
> >
> > The current fedora 3.11 kernel doesn't show this bug. I'm using fedora 19,
> > with a self compiled kernel.
> >
> > I did turn on config-debug-pagealloc but this didn't show any wrongness.
>
> In that case I think the bisect is the fastest way to insight - atm
> I'm really at loss what could be wrong here.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
[PATCH 3.5 50/78] ahci: disabled FBS prior to issuing software reset
3.5.7.26 -stable review patch. If anyone has any objections, please let me know.

--

From: xiangliang yu

commit 89dafa20f3daab5b3e0c13d0068a28e8e64e2102 upstream.

Tested with Marvell 88se9125, attached with one port multiplier (5 ports)
and one disk, we will get following boot log messages if using current
code:

  ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
  ata8.15: Port Multiplier 1.2, 0x1b4b:0x9715 r160, 5 ports, feat 0x1/0x1f
  ahci :03:00.0: FBS is enabled
  ata8.00: hard resetting link
  ata8.00: SATA link down (SStatus 0 SControl 330)
  ata8.01: hard resetting link
  ata8.01: SATA link down (SStatus 0 SControl 330)
  ata8.02: hard resetting link
  ata8.02: SATA link down (SStatus 0 SControl 330)
  ata8.03: hard resetting link
  ata8.03: SATA link up 6.0 Gbps (SStatus 133 SControl 133)
  ata8.04: hard resetting link
  ata8.04: failed to resume link (SControl 133)
  ata8.04: failed to read SCR 0 (Emask=0x40)
  ata8.04: failed to read SCR 0 (Emask=0x40)
  ata8.04: failed to read SCR 1 (Emask=0x40)
  ata8.04: failed to read SCR 0 (Emask=0x40)
  ata8.03: native sectors (2) is smaller than sectors (976773168)
  ata8.03: ATA-8: ST3500413AS, JC4B, max UDMA/133
  ata8.03: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
  ata8.03: configured for UDMA/133
  ata8.04: failed to IDENTIFY (I/O error, err_mask=0x100)
  ata8.15: hard resetting link
  ata8.15: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
  ata8.15: Port Multiplier vendor mismatch '0x1b4b' != '0x133'
  ata8.15: PMP revalidation failed (errno=-19)
  ata8.15: hard resetting link
  ata8.15: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
  ata8.15: Port Multiplier vendor mismatch '0x1b4b' != '0x133'
  ata8.15: PMP revalidation failed (errno=-19)
  ata8.15: limiting SATA link speed to 3.0 Gbps
  ata8.15: hard resetting link
  ata8.15: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
  ata8.15: Port Multiplier vendor mismatch '0x1b4b' != '0x133'
  ata8.15: PMP revalidation failed (errno=-19)
  ata8.15: failed to recover PMP after 5 tries, giving up
  ata8.15: Port Multiplier detaching
  ata8.03: disabled
  ata8.00: disabled
  ata8: EH complete

The reason is that current detection code doesn't follow AHCI spec:

First, the port multiplier detection process looks like this:

	ahci_hardreset(link, class, deadline)
	if (class == ATA_DEV_PMP) {
		sata_pmp_attach(dev)		/* will enable FBS */
		sata_pmp_init_links(ap, nr_ports);
		ata_for_each_link(link, ap, EDGE) {
			sata_std_hardreset(link, class, deadline);
			if (link_is_online)	/* do soft reset */
				ahci_softreset(link, class, deadline);
		}
	}

But, according to chapter 9.3.9 in AHCI spec: Prior to issuing software
reset, software shall clear PxCMD.ST to '0' and then clear PxFBS.EN to
'0'.

The patch tested OK with kernel 3.11.1.

tj: Patch white space contaminated, applied manually with trivial
    updates.

Signed-off-by: Xiangliang Yu
Signed-off-by: Tejun Heo
Signed-off-by: Luis Henriques
---
 drivers/ata/libahci.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index 47a1fb8..60f41cd 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -1249,9 +1249,11 @@ int ahci_do_softreset(struct ata_link *link, unsigned int *class,
 {
 	struct ata_port *ap = link->ap;
 	struct ahci_host_priv *hpriv = ap->host->private_data;
+	struct ahci_port_priv *pp = ap->private_data;
 	const char *reason = NULL;
 	unsigned long now, msecs;
 	struct ata_taskfile tf;
+	bool fbs_disabled = false;
 	int rc;
 
 	DPRINTK("ENTER\n");
@@ -1261,6 +1263,16 @@ int ahci_do_softreset(struct ata_link *link, unsigned int *class,
 	if (rc && rc != -EOPNOTSUPP)
 		ata_link_warn(link, "failed to reset engine (errno=%d)\n", rc);
 
+	/*
+	 * According to AHCI-1.2 9.3.9: if FBS is enable, software shall
+	 * clear PxFBS.EN to '0' prior to issuing software reset to devices
+	 * that is attached to port multiplier.
+	 */
+	if (!ata_is_host_link(link) && pp->fbs_enabled) {
+		ahci_disable_fbs(ap);
+		fbs_disabled = true;
+	}
+
 	ata_tf_init(link->device, &tf);
 
 	/* issue the first D2H Register FIS */
@@ -1301,6 +1313,10 @@ int ahci_do_softreset(struct ata_link *link, unsigned int *class,
 	} else
 		*class = ahci_dev_classify(ap);
 
+	/* re-enable FBS if disabled before */
+	if (fbs_disabled)
+		ahci_enable_fbs(ap);
+
 	DPRINTK("EXIT, class=%u\n", *class);
 	return 0;
 
-- 
1.8.3.2
[PATCH 3.5 49/78] libata: Fix display of sata speed
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Gwendal Grignou commit 3e85c3ecbc520751324a191d23bb94873ed01b10 upstream. 6.0 Gbps link speed was not decoded properly: speed was reported at 3.0 Gbps only. Tested: On a machine where libata reports 6.0 Gbps in /var/log/messages: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Before: cat /sys/class/ata_link/link1/sata_spd 3.0 Gbps After: cat /sys/class/ata_link/link1/sata_spd 6.0 Gbps Signed-off-by: Gwendal Grignou Signed-off-by: Tejun Heo Signed-off-by: Luis Henriques --- drivers/ata/libata-transport.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c index c341904..9215677 100644 --- a/drivers/ata/libata-transport.c +++ b/drivers/ata/libata-transport.c @@ -319,25 +319,25 @@ int ata_tport_add(struct device *parent, /* * ATA link attributes */ +static int noop(int x) { return x; } - -#define ata_link_show_linkspeed(field) \ +#define ata_link_show_linkspeed(field, format) \ static ssize_t \ show_ata_link_##field(struct device *dev, \ struct device_attribute *attr, char *buf) \ { \ struct ata_link *link = transport_class_to_link(dev); \ \ - return sprintf(buf,"%s\n", sata_spd_string(fls(link->field))); \ + return sprintf(buf, "%s\n", sata_spd_string(format(link->field))); \ } -#define ata_link_linkspeed_attr(field) \ - ata_link_show_linkspeed(field) \ +#define ata_link_linkspeed_attr(field, format) \ + ata_link_show_linkspeed(field, format) \ static DEVICE_ATTR(field, S_IRUGO, show_ata_link_##field, NULL) -ata_link_linkspeed_attr(hw_sata_spd_limit); -ata_link_linkspeed_attr(sata_spd_limit); -ata_link_linkspeed_attr(sata_spd); +ata_link_linkspeed_attr(hw_sata_spd_limit, fls); +ata_link_linkspeed_attr(sata_spd_limit, fls); +ata_link_linkspeed_attr(sata_spd, noop); static DECLARE_TRANSPORT_CLASS(ata_link_class, -- 1.8.3.2
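A minimal userspace sketch of why the old decoding went wrong, assuming (as the patch implies) that sata_spd already holds a generation number such as 3, while the two *_limit fields hold bitmasks of allowed generations. fls_sketch() and spd_string_sketch() are hypothetical stand-ins written for this note, not the kernel's fls() or sata_spd_string().

```c
#include <stddef.h>

/* Stand-in for fls(): 1-based index of the highest set bit, 0 for x == 0. */
static int fls_sketch(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Simplified stand-in for sata_spd_string(): maps a generation to text. */
static const char *spd_string_sketch(unsigned int spd)
{
	static const char *const tbl[] = { "1.5 Gbps", "3.0 Gbps", "6.0 Gbps" };

	if (spd == 0 || spd - 1 >= sizeof(tbl) / sizeof(tbl[0]))
		return "<unknown>";
	return tbl[spd - 1];
}
```

With these helpers, a limit mask of 0x7 decodes correctly through fls_sketch(), but a current speed of 3 passed through fls_sketch() yields 2 and is reported as 3.0 Gbps. Passing the value through unchanged, as noop() does in the patch, gives 6.0 Gbps.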
[PATCH 3.5 51/78] drivers/libata: Set max sector to 65535 for Slimtype DVD A DS8A9SH drive
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Shan Hai commit 0523f037f65dba10191b0fa9c51266f90ba64630 upstream. The "Slimtype DVD A DS8A9SH" drive locks up with following backtrace when the max sector is smaller than 65535 bytes, fix it by adding a quirk to set the max sector to 65535 bytes. INFO: task flush-11:0:663 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. flush-11:0D 5ceb 0 663 2 0x 88026d3b1710 0046 0001 88026f2530c0 88026d365860 88026d3b16e0 812ffd52 88026d4fd3d0 00010001 88026d3b16f0 88026d3b1fd8 Call Trace: [] ? cfq_may_queue+0x52/0xf0 [] schedule+0x18/0x30 [] io_schedule+0x42/0x60 [] get_request_wait+0xeb/0x1f0 [] ? autoremove_wake_function+0x0/0x40 [] ? elv_merge+0x42/0x210 [] __make_request+0x8e/0x4e0 [] generic_make_request+0x21e/0x5e0 [] submit_bio+0x5d/0xd0 [] submit_bh+0xf2/0x130 [] __block_write_full_page+0x1dc/0x3a0 [] ? end_buffer_async_write+0x0/0x120 [] ? blkdev_get_block+0x0/0x70 [] ? blkdev_get_block+0x0/0x70 [] ? end_buffer_async_write+0x0/0x120 [] block_write_full_page_endio+0xde/0x100 [] block_write_full_page+0x10/0x20 [] blkdev_writepage+0x13/0x20 [] __writepage+0x15/0x40 [] write_cache_pages+0x1cf/0x3e0 [] ? __writepage+0x0/0x40 [] generic_writepages+0x22/0x30 [] do_writepages+0x1f/0x40 [] writeback_single_inode+0xe7/0x3b0 [] writeback_sb_inodes+0x184/0x280 [] writeback_inodes_wb+0x6b/0x1a0 [] wb_writeback+0x23b/0x2a0 [] wb_do_writeback+0x17d/0x190 [] bdi_writeback_task+0x4b/0xe0 [] ? bdi_start_fn+0x0/0x100 [] bdi_start_fn+0x81/0x100 [] ? bdi_start_fn+0x0/0x100 [] kthread+0x8e/0xa0 [] ? finish_task_switch+0x54/0xc0 [] kernel_thread_helper+0x4/0x10 [] ? kthread+0x0/0xa0 [] ? 
kernel_thread_helper+0x0/0x10 The above trace was triggered by "dd if=/dev/zero of=/dev/sr0 bs=2048 count=32768" Signed-off-by: Shan Hai Signed-off-by: Tejun Heo Signed-off-by: Luis Henriques --- drivers/ata/libata-core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 9e47300..705658d 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4075,6 +4075,7 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = { { "TORiSAN DVD-ROM DRD-N216", NULL, ATA_HORKAGE_MAX_SEC_128 }, { "QUANTUM DATDAT72-000", NULL, ATA_HORKAGE_ATAPI_MOD16_DMA }, { "Slimtype DVD A DS8A8SH", NULL, ATA_HORKAGE_MAX_SEC_LBA48 }, + { "Slimtype DVD A DS8A9SH", NULL, ATA_HORKAGE_MAX_SEC_LBA48 }, /* Devices we expect to fail diagnostics */ -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 55/78] powerpc/vio: use strcpy in modalias_show
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Prarit Bhargava commit 411cabf79e684171669ad29a0628c400b4431e95 upstream. Commit e82b89a6f19bae73fb064d1b3dd91fcefbb478f4 used strcat instead of strcpy which can result in an overflow of newlines on the buffer. Signed-off-by: Prarit Bhargava Cc: b...@kernel.crashing.org Cc: b...@decadent.org.uk Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Luis Henriques --- arch/powerpc/kernel/vio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c index b161bae..4869c4e 100644 --- a/arch/powerpc/kernel/vio.c +++ b/arch/powerpc/kernel/vio.c @@ -1521,12 +1521,12 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *attr, dn = dev->of_node; if (!dn) { - strcat(buf, "\n"); + strcpy(buf, "\n"); return strlen(buf); } cp = of_get_property(dn, "compatible", NULL); if (!cp) { - strcat(buf, "\n"); + strcpy(buf, "\n"); return strlen(buf); } -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
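The difference between the two calls can be sketched in plain C, assuming the output buffer may still hold text from an earlier use. Both function names below are hypothetical and only mirror the early-return paths touched by the diff.

```c
#include <string.h>

/* Old behaviour: appends a newline after whatever is already in buf. */
static size_t show_with_strcat(char *buf)
{
	strcat(buf, "\n");
	return strlen(buf);
}

/* Patched behaviour: overwrites buf, so the result is always one byte. */
static size_t show_with_strcpy(char *buf)
{
	strcpy(buf, "\n");
	return strlen(buf);
}
```

Starting from a buffer that already contains "stale", the strcat variant returns 6 and keeps growing on every further call, while the strcpy variant returns 1 each time.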
[PATCH 3.5 59/78] powerpc/powernv: Add PE to its own PELTV
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Gavin Shan commit 631ad691b5818291d89af9be607d2fe40be0886e upstream. We need add PE to its own PELTV. Otherwise, the errors originated from the PE might contribute to other PEs. In the result, we can't clear up the error successfully even we're checking and clearing errors during access to PCI config space. Reported-by: kalsh...@in.ibm.com Signed-off-by: Gavin Shan Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Luis Henriques --- arch/powerpc/platforms/powernv/pci-ioda.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index fbdd74d..5da8e8d 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -613,13 +613,23 @@ static int __devinit pnv_ioda_configure_pe(struct pnv_phb *phb, rid_end = pe->rid + 1; } - /* Associate PE in PELT */ + /* +* Associate PE in PELT. We need add the PE into the +* corresponding PELT-V as well. Otherwise, the error +* originated from the PE might contribute to other +* PEs. +*/ rc = opal_pci_set_pe(phb->opal_id, pe->pe_number, pe->rid, bcomp, dcomp, fcomp, OPAL_MAP_PE); if (rc) { pe_err(pe, "OPAL error %ld trying to setup PELT table\n", rc); return -ENXIO; } + + rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number, + pe->pe_number, OPAL_ADD_PE_TO_DOMAIN); + if (rc) + pe_warn(pe, "OPAL error %d adding self to PELTV\n", rc); opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number, OPAL_EEH_ACTION_CLEAR_FREEZE_ALL); -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 47/78] ARM: OMAP2+: irq, AM33XX add missing register check
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Markus Pargmann commit 0bebda684857f76548ea48c8886785198701d8d3 upstream. am33xx has a INTC_PENDING_IRQ3 register that is not checked for pending interrupts. This patch adds AM33XX to the ifdef of SOCs that have to check this register. Signed-off-by: Markus Pargmann Signed-off-by: Tony Lindgren Signed-off-by: Luis Henriques --- arch/arm/mach-omap2/irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/mach-omap2/irq.c b/arch/arm/mach-omap2/irq.c index 6038a8c..4137499 100644 --- a/arch/arm/mach-omap2/irq.c +++ b/arch/arm/mach-omap2/irq.c @@ -232,7 +232,7 @@ static inline void omap_intc_handle_irq(void __iomem *base_addr, struct pt_regs goto out; irqnr = readl_relaxed(base_addr + 0xd8); -#ifdef CONFIG_SOC_TI81XX +#if IS_ENABLED(CONFIG_SOC_TI81XX) || IS_ENABLED(CONFIG_SOC_AM33XX) if (irqnr) goto out; irqnr = readl_relaxed(base_addr + 0xf8); -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Add memory barrier when waiting on futex
We encountered following panic several times: [ 74.671982] BUG: unable to handle kernel NULL pointer dereference at 0008 [ 74.672101] IP: [] wake_futex+0x47/0x80 [ 74.672185] *pdpt = 10108001 *pde = [ 74.672278] Oops: 0002 [#1] PREEMPT SMP [ 74.672403] Modules linked in: atomisp_css2400b0_v2 atomisp_css2400_v2 dfrgx bcm_bt_lpm videobuf_vmalloc videobuf_core hdmi_audio tngdisp bcm4335 kct_daemon(O) cfg80211 [ 74.672815] CPU: 0 PID: 1477 Comm: zygote Tainted: GW O 3.10.1-259934-g0bfb86e #1 [ 74.672855] Hardware name: Intel Corporation Merrifield/SALT BAY, BIOS 404 2013.10.09:15.29.48 [ 74.672894] task: d4c97220 ti: cfaa8000 task.ti: cfaa8000 [ 74.672933] EIP: 0060:[] EFLAGS: 00210246 CPU: 0 [ 74.672975] EIP is at wake_futex+0x47/0x80 [ 74.673012] EAX: EBX: ECX: EDX: [ 74.673049] ESI: def4de5c EDI: EBP: cfaa9eb4 ESP: cfaa9ea0 [ 74.673086] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 74.673123] CR0: 8005003b CR2: 0008 CR3: 10109000 CR4: 001007f0 [ 74.673160] DR0: DR1: DR2: DR3: [ 74.673196] DR6: 0ff0 DR7: 0400 [ 74.673229] Stack: [ 74.673260] 0001 def4de5c c225eb50 cfaa9ee4 c129bc29 [ 74.673536] 7fff c225eb30 b4f38000 ec1a4b40 0f90 7fff 0001 [ 74.673814] b4f38f90 cfaa9f58 c129da0b cfaa9f10 c195d835 0001 [ 74.674092] Call Trace: [ 74.674144] [] futex_wake+0xc9/0x110 [ 74.674195] [] do_futex+0xeb/0x950 [ 74.674246] [] ? sub_preempt_count+0x55/0xe0 [ 74.674293] [] ? wake_up_new_task+0xee/0x190 [ 74.674341] [] ? _raw_spin_unlock_irqrestore+0x3b/0x70 [ 74.674388] [] ? wake_up_new_task+0xee/0x190 [ 74.674436] [] ? do_fork+0xec/0x350 [ 74.674484] [] SyS_futex+0x9b/0x140 [ 74.674533] [] ? SyS_mprotect+0x188/0x1e0 [ 74.674582] [] syscall_call+0x7/0xb On smp systems, setting current task to q->task in queue_me() may not visible immediately to another cpu, some times this will cause panic in wake_futex(). Adding memory barrier to avoid this. 
Signed-off-by: Leon Ma Signed-off-by: xiaobing tu --- kernel/futex.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index 80ba086..792cd41 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1529,6 +1529,7 @@ static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb) plist_node_init(&q->list, prio); plist_add(&q->list, &hb->chain); q->task = current; + smp_mb(); spin_unlock(&hb->lock); } -- 1.7.4.1
[PATCH 3.5 57/78] ASoC: ak4642: prevent un-necessary changes to SG_SL1
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Phil Edworthy commit 7b5bfb82882b9b1c8423ce0ed6852ca3762d967a upstream. If you record the sound during playback, the playback sound becomes silent. Modify so that the codec driver does not clear SG_SL1::DACL bit which is controlled under widget Signed-off-by: Phil Edworthy Signed-off-by: Kuninori Morimoto Signed-off-by: Mark Brown Signed-off-by: Luis Henriques --- sound/soc/codecs/ak4642.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/ak4642.c b/sound/soc/codecs/ak4642.c index b3e24f2..7e4245f 100644 --- a/sound/soc/codecs/ak4642.c +++ b/sound/soc/codecs/ak4642.c @@ -262,7 +262,7 @@ static int ak4642_dai_startup(struct snd_pcm_substream *substream, * This operation came from example code of * "ASAHI KASEI AK4642" (japanese) manual p94. */ - snd_soc_write(codec, SG_SL1, PMMP | MGAIN0); + snd_soc_update_bits(codec, SG_SL1, PMMP | MGAIN0, PMMP | MGAIN0); snd_soc_write(codec, TIMER, ZTM(0x3) | WTM(0x3)); snd_soc_write(codec, ALC_CTL1, ALC | LMTH0); snd_soc_update_bits(codec, PW_MGMT1, PMADL, PMADL); -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
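The effect of switching from a plain register write to an update-bits call can be sketched as follows. This is a userspace model, not the ASoC API: reg_write() and reg_update_bits() are hypothetical helpers, and the SK_* bit positions are illustrative values chosen for this note rather than taken from the AK4642 datasheet.

```c
#include <stdint.h>

/* Illustrative bit positions (assumptions, not datasheet values). */
enum {
	SK_MGAIN0 = 1 << 0,
	SK_PMMP   = 1 << 2,
	SK_DACL   = 1 << 4,
};

/* Plain write: the register becomes exactly `value`, other bits are lost. */
static uint8_t reg_write(uint8_t old, uint8_t value)
{
	(void)old;
	return value;
}

/* Read-modify-write: only the bits selected by `mask` are changed. */
static uint8_t reg_update_bits(uint8_t old, uint8_t mask, uint8_t value)
{
	return (uint8_t)((old & ~mask) | (value & mask));
}
```

If DACL is already set because playback is running, the plain write clears it, whereas the masked update leaves it alone and only sets PMMP and MGAIN0.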
[PATCH 3.5 53/78] usb: wusbcore: set the RPIPE wMaxPacketSize value correctly
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Thomas Pugliese commit 7b6bc07ab554e929c85d51b3d5b26cf7f12c6a3b upstream. For isochronous endpoints, set the RPIPE wMaxPacketSize value using wOverTheAirPacketSize from the endpoint companion descriptor instead of wMaxPacketSize from the normal endpoint descriptor. Signed-off-by: Thomas Pugliese Signed-off-by: Greg Kroah-Hartman [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- drivers/usb/wusbcore/wa-rpipe.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/usb/wusbcore/wa-rpipe.c b/drivers/usb/wusbcore/wa-rpipe.c index f0d546c..ca1031b 100644 --- a/drivers/usb/wusbcore/wa-rpipe.c +++ b/drivers/usb/wusbcore/wa-rpipe.c @@ -332,7 +332,10 @@ static int rpipe_aim(struct wa_rpipe *rpipe, struct wahc *wa, /* FIXME: compute so seg_size > ep->maxpktsize */ rpipe->descr.wBlocks = cpu_to_le16(16); /* given */ /* ep0 maxpktsize is 0x200 (WUSB1.0[4.8.1]) */ - rpipe->descr.wMaxPacketSize = cpu_to_le16(ep->desc.wMaxPacketSize); + if (usb_endpoint_xfer_isoc(&ep->desc)) + rpipe->descr.wMaxPacketSize = epcd->wOverTheAirPacketSize; + else + rpipe->descr.wMaxPacketSize = ep->desc.wMaxPacketSize; rpipe->descr.bHSHubAddress = 0; /* reserved: zero */ rpipe->descr.bHSHubPort = wusb_port_no_to_idx(urb->dev->portnum); /* FIXME: use maximum speed as supported or recommended by device */ -- 1.8.3.2
[PATCH 3.5 54/78] usb: wusbcore: change WA_SEGS_MAX to a legal value
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Thomas Pugliese commit f74b75e7f920c700636a669c7d16d12e9202 upstream. change WA_SEGS_MAX to a number that is legal according to the WUSB spec. Signed-off-by: Thomas Pugliese Signed-off-by: Greg Kroah-Hartman [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- drivers/usb/wusbcore/wa-xfer.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/usb/wusbcore/wa-xfer.c b/drivers/usb/wusbcore/wa-xfer.c index 1ebc17e..8cf9003 100644 --- a/drivers/usb/wusbcore/wa-xfer.c +++ b/drivers/usb/wusbcore/wa-xfer.c @@ -90,7 +90,8 @@ #include "wusbhc.h" enum { - WA_SEGS_MAX = 255, + /* [WUSB] section 8.3.3 allocates 7 bits for the segment index. */ + WA_SEGS_MAX = 128, }; enum wa_seg_status { @@ -444,7 +445,7 @@ static ssize_t __wa_xfer_setup_sizes(struct wa_xfer *xfer, xfer->seg_size = (xfer->seg_size / maxpktsize) * maxpktsize; xfer->segs = (urb->transfer_buffer_length + xfer->seg_size - 1) / xfer->seg_size; - if (xfer->segs >= WA_SEGS_MAX) { + if (xfer->segs > WA_SEGS_MAX) { dev_err(dev, "BUG? ops, number of segments %d bigger than %d\n", (int)(urb->transfer_buffer_length / xfer->seg_size), WA_SEGS_MAX); -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
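The arithmetic behind the new limit can be sketched like this, assuming only what the added comment states: the segment index is 7 bits wide, so indices 0 to 127 are representable and at most 128 segments fit. The names below are hypothetical and not part of wa-xfer.c.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
	SEG_INDEX_BITS = 7,                   /* width of the segment index */
	SEGS_MAX_SKETCH = 1 << SEG_INDEX_BITS /* 128 distinct indices */
};

/* Number of segments needed for `len` bytes, rounding up. */
static size_t segs_needed(size_t len, size_t seg_size)
{
	return (len + seg_size - 1) / seg_size;
}

/* A transfer fits when it needs no more than SEGS_MAX_SKETCH segments. */
static bool segs_fit(size_t segs)
{
	return segs <= SEGS_MAX_SKETCH;
}
```

This mirrors the change of the comparison from ">=" to ">": a transfer needing exactly 128 segments is legal, and 129 is the first count that has to be rejected.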
[PATCH 3.5 62/78] qeth: avoid buffer overflow in snmp ioctl
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Ursula Braun commit 6fb392b1a63ae36c31f62bc3fc8630b49d602b62 upstream. Check user-defined length in snmp ioctl request and allow request only if it fits into a qeth command buffer. Signed-off-by: Ursula Braun Signed-off-by: Frank Blaschka Reviewed-by: Heiko Carstens Reported-by: Nico Golde Reported-by: Fabian Yamaguchi Signed-off-by: David S. Miller Signed-off-by: Luis Henriques --- drivers/s390/net/qeth_core_main.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index e118e1e..c3121f7 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -4355,7 +4355,7 @@ int qeth_snmp_command(struct qeth_card *card, char __user *udata) struct qeth_cmd_buffer *iob; struct qeth_ipa_cmd *cmd; struct qeth_snmp_ureq *ureq; - int req_len; + unsigned int req_len; struct qeth_arp_query_info qinfo = {0, }; int rc = 0; @@ -4371,6 +4371,10 @@ int qeth_snmp_command(struct qeth_card *card, char __user *udata) /* skip 4 bytes (data_len struct member) to get req_len */ if (copy_from_user(&req_len, udata + sizeof(int), sizeof(int))) return -EFAULT; + if (req_len > (QETH_BUFSIZE - IPA_PDU_HEADER_SIZE - + sizeof(struct qeth_ipacmd_hdr) - + sizeof(struct qeth_ipacmd_setadpparms_hdr))) + return -EINVAL; ureq = memdup_user(udata, req_len + sizeof(struct qeth_snmp_ureq_hdr)); if (IS_ERR(ureq)) { QETH_CARD_TEXT(card, 2, "snmpnome"); -- 1.8.3.2
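A small sketch of why the length is made unsigned together with the new bound check. BUF_LIMIT_SKETCH is a hypothetical stand-in for the expression built from QETH_BUFSIZE and the header sizes, and both helpers are illustrative only.

```c
#include <stdbool.h>

#define BUF_LIMIT_SKETCH 4096 /* hypothetical limit, not the qeth value */

/* Signed comparison: a negative user-supplied length slips through. */
static bool len_ok_signed(int req_len)
{
	return req_len <= BUF_LIMIT_SKETCH;
}

/* Unsigned comparison: the same bit pattern is a huge value and is rejected. */
static bool len_ok_unsigned(unsigned int req_len)
{
	return req_len <= (unsigned int)BUF_LIMIT_SKETCH;
}
```

With a signed type, a value such as -1 passes an upper-bound check and would only become a very large size once converted for the copy. Treating the field as unsigned lets one comparison cover both oversized and "negative" requests.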
[PATCH 3.5 63/78] cris: media platform drivers: fix build
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Mauro Carvalho Chehab commit 72a0c5571351f5184195754d23db3e14495b2080 upstream. On cris arch, the functions below aren't defined: drivers/media/platform/sh_veu.c: In function 'sh_veu_reg_read': drivers/media/platform/sh_veu.c:228:2: error: implicit declaration of function 'ioread32' [-Werror=implicit-function-declaration] drivers/media/platform/sh_veu.c: In function 'sh_veu_reg_write': drivers/media/platform/sh_veu.c:234:2: error: implicit declaration of function 'iowrite32' [-Werror=implicit-function-declaration] drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_read': drivers/media/platform/vsp1/vsp1.h:66:2: error: implicit declaration of function 'ioread32' [-Werror=implicit-function-declaration] drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_write': drivers/media/platform/vsp1/vsp1.h:71:2: error: implicit declaration of function 'iowrite32' [-Werror=implicit-function-declaration] drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_read': drivers/media/platform/vsp1/vsp1.h:66:2: error: implicit declaration of function 'ioread32' [-Werror=implicit-function-declaration] drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_write': drivers/media/platform/vsp1/vsp1.h:71:2: error: implicit declaration of function 'iowrite32' [-Werror=implicit-function-declaration] drivers/media/platform/soc_camera/rcar_vin.c: In function 'rcar_vin_setup': drivers/media/platform/soc_camera/rcar_vin.c:284:3: error: implicit declaration of function 'iowrite32' [-Werror=implicit-function-declaration] drivers/media/platform/soc_camera/rcar_vin.c: In function 'rcar_vin_request_capture_stop': drivers/media/platform/soc_camera/rcar_vin.c:353:2: error: implicit declaration of function 'ioread32' [-Werror=implicit-function-declaration] Yet, they're available, as CONFIG_GENERIC_IOMAP is defined. What happens is that asm/io.h was not including asm-generic/iomap.h. 
Suggested-by: Ben Hutchings Signed-off-by: Mauro Carvalho Chehab Cc: Mikael Starvik Cc: Jesper Nilsson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Luis Henriques --- arch/cris/include/asm/io.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/cris/include/asm/io.h b/arch/cris/include/asm/io.h index ac12ae2..db9a16c 100644 --- a/arch/cris/include/asm/io.h +++ b/arch/cris/include/asm/io.h @@ -3,6 +3,7 @@ #include <asm/page.h> /* for __va, __pa */ #include <arch/io.h> +#include <asm-generic/iomap.h> #include <linux/kernel.h> struct cris_io_operations -- 1.8.3.2
[PATCH 3.5 70/78] tracing: Fix potential out-of-bounds in trace_get_user()
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Steven Rostedt commit 057db8488b53d5e4faa0cedb2f39d4ae75dfbdbb upstream. Andrey reported the following report: ERROR: AddressSanitizer: heap-buffer-overflow on address 8800359c99f3 8800359c99f3 is located 0 bytes to the right of 243-byte region [8800359c9900, 8800359c99f3) Accessed by thread T13003: #0 810dd2da (asan_report_error+0x32a/0x440) #1 810dc6b0 (asan_check_region+0x30/0x40) #2 810dd4d3 (__tsan_write1+0x13/0x20) #3 811cd19e (ftrace_regex_release+0x1be/0x260) #4 812a1065 (__fput+0x155/0x360) #5 812a12de (fput+0x1e/0x30) #6 8111708d (task_work_run+0x10d/0x140) #7 810ea043 (do_exit+0x433/0x11f0) #8 810eaee4 (do_group_exit+0x84/0x130) #9 810eafb1 (SyS_exit_group+0x21/0x30) #10 81928782 (system_call_fastpath+0x16/0x1b) Allocated by thread T5167: #0 810dc778 (asan_slab_alloc+0x48/0xc0) #1 8128337c (__kmalloc+0xbc/0x500) #2 811d9d54 (trace_parser_get_init+0x34/0x90) #3 811cd7b3 (ftrace_regex_open+0x83/0x2e0) #4 811cda7d (ftrace_filter_open+0x2d/0x40) #5 8129b4ff (do_dentry_open+0x32f/0x430) #6 8129b668 (finish_open+0x68/0xa0) #7 812b66ac (do_last+0xb8c/0x1710) #8 812b7350 (path_openat+0x120/0xb50) #9 812b8884 (do_filp_open+0x54/0xb0) #10 8129d36c (do_sys_open+0x1ac/0x2c0) #11 8129d4b7 (SyS_open+0x37/0x50) #12 81928782 (system_call_fastpath+0x16/0x1b) Shadow bytes around the buggy address: 8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa 8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb 8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 8800359c9b80: 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap redzone: fa Heap kmalloc redzone: fb Freed heap region: fd Shadow gap:fe The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;' Although the crash happened in ftrace_regex_open() the real bug occurred in trace_get_user() where there's an incrementation to parser->idx without a check against the size. The way it is triggered is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop that reads the last character stores it and then breaks out because there is no more characters. Then the last character is read to determine what to do next, and the index is incremented without checking size. Then the caller of trace_get_user() usually nulls out the last character with a zero, but since the index is equal to the size, it writes a nul character after the allocated space, which can corrupt memory. Luckily, only root user has write access to this file. 
Link: http://lkml.kernel.org/r/20131009222323.04fd1...@gandalf.local.home Reported-by: Andrey Konovalov Signed-off-by: Steven Rostedt Signed-off-by: Luis Henriques --- kernel/trace/trace.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 09739c6..d570df8 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -578,9 +578,12 @@ int trace_get_user(struct trace_parser *parser, const char __user *ubuf, if (isspace(ch)) { parser->buffer[parser->idx] = 0; parser->cont = false; - } else { + } else if (parser->idx < parser->size - 1) { parser->cont = true; parser->buffer[parser->idx++] = ch; + } else { + ret = -EINVAL; + goto out; } *ppos += read; -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
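The bounds rule the fix enforces can be sketched in userspace as follows. struct parser_sketch and both helpers are hypothetical and much simpler than struct trace_parser; the point is only that one slot is always kept free for the terminating NUL.

```c
#include <stddef.h>

struct parser_sketch {
	char *buffer;
	size_t idx;  /* next free position */
	size_t size; /* total bytes in buffer, assumed >= 1 */
};

/* Store one character; fail instead of consuming the slot kept for NUL. */
static int parser_put(struct parser_sketch *p, char ch)
{
	if (p->idx < p->size - 1) {
		p->buffer[p->idx++] = ch;
		return 0;
	}
	return -1; /* the kernel patch returns -EINVAL at this point */
}

/* Safe only because parser_put() never lets idx reach size. */
static void parser_terminate(struct parser_sketch *p)
{
	p->buffer[p->idx] = '\0';
}
```

Without the check in parser_put(), idx could end up equal to size, and the terminating write would land one byte past the allocation, which is the access the sanitizer report above flags.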
[PATCH 3.5 71/78] ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Ivan Djelic commit 455bd4c430b0c0a361f38e8658a0d6cb469942b5 upstream. Recent GCC versions (e.g. GCC-4.7.2) perform optimizations based on assumptions about the implementation of memset and similar functions. The current ARM optimized memset code does not return the value of its first argument, as is usually expected from standard implementations. For instance in the following function: void debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waiter) { memset(waiter, MUTEX_DEBUG_INIT, sizeof(*waiter)); waiter->magic = waiter; INIT_LIST_HEAD(>list); } compiled as: 800554d0 : 800554d0: e92d4008push{r3, lr} 800554d4: e1a1mov r0, r1 800554d8: e3a02010mov r2, #16 ; 0x10 800554dc: e3a01011mov r1, #17 ; 0x11 800554e0: eb04426ebl 80165ea0 800554e4: e1a03000mov r3, r0 800554e8: e583000cstr r0, [r3, #12] 800554ec: e583str r0, [r3] 800554f0: e5830004str r0, [r3, #4] 800554f4: e8bd8008pop {r3, pc} GCC assumes memset returns the value of pointer 'waiter' in register r0; causing register/memory corruptions. This patch fixes the return value of the assembly version of memset. It adds a 'mov' instruction and merges an additional load+store into existing load/store instructions. For ease of review, here is a breakdown of the patch into 4 simple steps: Step 1 == Perform the following substitutions: ip -> r8, then r0 -> ip, and insert 'mov ip, r0' as the first statement of the function. At this point, we have a memset() implementation returning the proper result, but corrupting r8 on some paths (the ones that were using ip). Step 2 == Make sure r8 is saved and restored when (! CALGN(1)+0) == 1: save r8: - str lr, [sp, #-4]! + stmfd sp!, {r8, lr} and restore r8 on both exit paths: - ldmeqfd sp!, {pc} @ Now <64 bytes to go. + ldmeqfd sp!, {r8, pc} @ Now <64 bytes to go. (...) 
tst r2, #16 stmneia ip!, {r1, r3, r8, lr} - ldr lr, [sp], #4 + ldmfd sp!, {r8, lr} Step 3 == Make sure r8 is saved and restored when (! CALGN(1)+0) == 0: save r8: - stmfd sp!, {r4-r7, lr} + stmfd sp!, {r4-r8, lr} and restore r8 on both exit paths: bgt 3b - ldmeqfd sp!, {r4-r7, pc} + ldmeqfd sp!, {r4-r8, pc} (...) tst r2, #16 stmneia ip!, {r4-r7} - ldmfd sp!, {r4-r7, lr} + ldmfd sp!, {r4-r8, lr} Step 4 == Rewrite register list "r4-r7, r8" as "r4-r8". Signed-off-by: Ivan Djelic Reviewed-by: Nicolas Pitre Signed-off-by: Dirk Behme Signed-off-by: Russell King Cc: Eric Bénard Signed-off-by: Luis Henriques --- arch/arm/lib/memset.S | 85 ++- 1 file changed, 44 insertions(+), 41 deletions(-) diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S index 650d592..d912e73 100644 --- a/arch/arm/lib/memset.S +++ b/arch/arm/lib/memset.S @@ -19,9 +19,9 @@ 1: subsr2, r2, #4 @ 1 do we have enough blt 5f @ 1 bytes to align with? cmp r3, #2 @ 1 - strltb r1, [r0], #1@ 1 - strleb r1, [r0], #1@ 1 - strbr1, [r0], #1@ 1 + strltb r1, [ip], #1@ 1 + strleb r1, [ip], #1@ 1 + strbr1, [ip], #1@ 1 add r2, r2, r3 @ 1 (r2 = r2 - (4 - r3)) /* * The pointer is now aligned and the length is adjusted. Try doing the @@ -29,10 +29,14 @@ */ ENTRY(memset) - andsr3, r0, #3 @ 1 unaligned? +/* + * Preserve the contents of r0 for the return value. + */ + mov ip, r0 + andsr3, ip, #3 @ 1 unaligned? bne 1b @ 1 /* - * we know that the pointer in r0 is aligned to a word boundary. + * we know that the pointer in ip is aligned to a word boundary. */ orr r1, r1, r1, lsl #8 orr r1, r1, r1, lsl #16 @@ -43,29 +47,28 @@ ENTRY(memset) #if ! CALGN(1)+0 /* - * We need an extra register for this loop - save the return address and - * use the LR + * We need 2 extra registers for this loop - use r8 and the LR */ - str lr, [sp, #-4]! - mov ip, r1 + stmfd sp!, {r8, lr} + mov r8, r1 mov lr, r1 2: subsr2, r2, #64 - stmgeia r0!, {r1, r3, ip, lr} @ 64 bytes at a time. 
- stmgeia r0!, {r1, r3, ip, lr} - stmgeia r0!, {r1, r3, ip, lr} - stmgeia r0!, {r1, r3, ip, lr} + stmgeia ip!, {r1, r3, r8, lr} @ 64 bytes at a time. + stmgeia ip!, {r1, r3, r8, lr} + stmgeia ip!, {r1, r3, r8, lr}
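The contract the assembly has to honour can be shown with a portable C sketch. memset_sketch() is a hypothetical, unoptimized equivalent written for this note; it advances a working copy of the pointer, much as the patch moves the working pointer from r0 to ip, so that the original destination is still available as the return value.

```c
#include <stddef.h>

/* Fill n bytes at dst with c and return dst, as callers may rely on. */
static void *memset_sketch(void *dst, int c, size_t n)
{
	unsigned char *p = dst; /* working pointer, advanced by the loop */

	while (n--)
		*p++ = (unsigned char)c;

	return dst; /* original pointer, untouched by the loop */
}
```

A compiler that knows this contract may reuse the returned value instead of reloading the pointer, which is what the debug_mutex_lock_common() listing in the commit message shows.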
[PATCH 3.5 58/78] ahci: Add Device IDs for Intel Wildcat Point-LP
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: James Ralston commit 9f961a5f6efc87a79571d7166257b36af28ffcfe upstream. This patch adds the AHCI-mode SATA Device IDs for the Intel Wildcat Point-LP PCH. Signed-off-by: James Ralston Signed-off-by: Tejun Heo Signed-off-by: Luis Henriques --- drivers/ata/ahci.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 9270f35..d0f8a93 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -301,6 +301,10 @@ static const struct pci_device_id ahci_pci_tbl[] = { { PCI_VDEVICE(INTEL, 0x8d66), board_ahci }, /* Wellsburg RAID */ { PCI_VDEVICE(INTEL, 0x8d6e), board_ahci }, /* Wellsburg RAID */ { PCI_VDEVICE(INTEL, 0x23a3), board_ahci }, /* Coleto Creek AHCI */ + { PCI_VDEVICE(INTEL, 0x9c83), board_ahci }, /* Wildcat Point-LP AHCI */ + { PCI_VDEVICE(INTEL, 0x9c85), board_ahci }, /* Wildcat Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x9c87), board_ahci }, /* Wildcat Point-LP RAID */ + { PCI_VDEVICE(INTEL, 0x9c8f), board_ahci }, /* Wildcat Point-LP RAID */ /* JMicron 360/1/3/5/6, match class to avoid IDE function */ { PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 73/78] usb: fix cleanup after failure in hub_configure()
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Krzysztof Mazur commit d0308d4b6b02597f39fc31a9bddf7bb3faad5622 upstream. If the hub_configure() fails after setting the hdev->maxchild the hub->ports might be NULL or point to uninitialized kzallocated memory causing NULL pointer dereference in hub_quiesce() during cleanup. Now after such error the hdev->maxchild is set to 0 to avoid cleanup of uninitialized ports. Signed-off-by: Krzysztof Mazur Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman Signed-off-by: Luis Henriques --- drivers/usb/core/hub.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index b5503b0..b79aa83 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -1562,6 +1562,7 @@ static int hub_configure(struct usb_hub *hub, return 0; fail: + hdev->maxchild = 0; dev_err (hub_dev, "config failed, %s (err %d)\n", message, ret); /* hub_disconnect() frees urb and descriptor */ -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 67/78] backlight: atmel-pwm-bl: fix gpio polarity in remove
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Johan Hovold commit ad5066d4c2b1d696749f8d7816357c23b648c4d3 upstream. Make sure to honour gpio polarity also at remove so that the backlight is actually disabled on boards with active-low enable pin. Signed-off-by: Johan Hovold Acked-by: Jingoo Han Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- drivers/video/backlight/atmel-pwm-bl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/video/backlight/atmel-pwm-bl.c b/drivers/video/backlight/atmel-pwm-bl.c index 4d2bbd8..dab3a0c 100644 --- a/drivers/video/backlight/atmel-pwm-bl.c +++ b/drivers/video/backlight/atmel-pwm-bl.c @@ -211,7 +211,8 @@ static int __exit atmel_pwm_bl_remove(struct platform_device *pdev) struct atmel_pwm_bl *pwmbl = platform_get_drvdata(pdev); if (pwmbl->gpio_on != -1) { - gpio_set_value(pwmbl->gpio_on, 0); + gpio_set_value(pwmbl->gpio_on, + 0 ^ pwmbl->pdata->on_active_low); gpio_free(pwmbl->gpio_on); } pwm_channel_disable(&pwmbl->pwmc); -- 1.8.3.2
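The XOR in the patch maps a logical on/off state to the electrical level of the pin. A one-function sketch, with gpio_level() as a hypothetical helper and both arguments assumed to be 0 or 1:

```c
/* Electrical level to drive for a logical state on a pin of given polarity. */
static int gpio_level(int logical_on, int active_low)
{
	return logical_on ^ active_low;
}
```

For "off" (logical 0) this yields 0 on an active-high pin and 1 on an active-low pin, so writing a constant 0 at remove time would leave an active-low backlight enabled.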
[PATCH 3.5 68/78] devpts: plug the memory leak in kill_sb
3.5.7.26 -stable review patch.  If anyone has any objections, please let me know.

--

From: Ilija Hadzic

commit 66da0e1f9034140ae2f571ef96e254a25083906c upstream.

When devpts is unmounted, there may be a no-longer-used IDR tree hanging
off the superblock we are about to kill.  This needs to be cleaned up
before destroying the SB.

The leak is usually not a big deal because unmounting devpts is typically
done when shutting down the whole machine.  However, shutting down an LXC
container instead of a physical machine exposes the problem (the garbage
is detectable with kmemleak).

Signed-off-by: Ilija Hadzic
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Luis Henriques
---
 fs/devpts/inode.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 979c1e3..1ed9d5e 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -483,6 +483,7 @@ static void devpts_kill_sb(struct super_block *sb)
 {
 	struct pts_fs_info *fsi = DEVPTS_SB(sb);
 
+	ida_destroy(&fsi->allocated_ptys);
 	kfree(fsi);
 	kill_litter_super(sb);
 }
--
1.8.3.2
[PATCH 3.5 64/78] mm: ensure get_unmapped_area() returns higher address than mmap_min_addr
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Akira Takeuchi commit 2afc745f3e3079ab16c826be4860da2529054dd2 upstream. This patch fixes the problem that get_unmapped_area() can return illegal address and result in failing mmap(2) etc. In case that the address higher than PAGE_SIZE is set to /proc/sys/vm/mmap_min_addr, the address lower than mmap_min_addr can be returned by get_unmapped_area(), even if you do not pass any virtual address hint (i.e. the second argument). This is because the current get_unmapped_area() code does not take into account mmap_min_addr. This leads to two actual problems as follows: 1. mmap(2) can fail with EPERM on the process without CAP_SYS_RAWIO, although any illegal parameter is not passed. 2. The bottom-up search path after the top-down search might not work in arch_get_unmapped_area_topdown(). Note: The first and third chunk of my patch, which changes "len" check, are for more precise check using mmap_min_addr, and not for solving the above problem. 
[How to reproduce]

	--- test.c ---
	#include <stdio.h>
	#include <unistd.h>
	#include <sys/mman.h>
	#include <errno.h>

	int main(int argc, char *argv[])
	{
		void *ret = NULL, *last_map;
		size_t pagesize = sysconf(_SC_PAGESIZE);

		do {
			last_map = ret;
			ret = mmap(0, pagesize, PROT_NONE,
				MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	//		printf("ret=%p\n", ret);
		} while (ret != MAP_FAILED);

		if (errno != ENOMEM) {
			printf("ERR: unexpected errno: %d (last map=%p)\n",
			errno, last_map);
		}

		return 0;
	}
	---

	$ gcc -m32 -o test test.c
	$ sudo sysctl -w vm.mmap_min_addr=65536
	vm.mmap_min_addr = 65536
	$ ./test  (run as a non-privileged user)
	ERR: unexpected errno: 1 (last map=0x1)

Signed-off-by: Akira Takeuchi
Signed-off-by: Kiyoshi Owada
Reviewed-by: Naoya Horiguchi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
[ luis: backported to 3.5:
  - dropped changes to struct vm_unmapped_area_info in
    arch_get_unmapped_area_topdown() as this structure does not exist
    in 3.5 kernel ]
Signed-off-by: Luis Henriques
---
 mm/mmap.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 7e24763..758ff55 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1443,7 +1443,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	struct vm_area_struct *vma;
 	unsigned long start_addr;
 
-	if (len > TASK_SIZE)
+	if (len > TASK_SIZE - mmap_min_addr)
 		return -ENOMEM;
 
 	if (flags & MAP_FIXED)
@@ -1452,7 +1452,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
+		if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
 		    (!vma || addr + len <= vma->vm_start))
 			return addr;
 	}
@@ -1517,7 +1517,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	unsigned long addr = addr0, start_addr;
 
 	/* requested length too big for entire address space */
-	if (len > TASK_SIZE)
+	if (len > TASK_SIZE - mmap_min_addr)
 		return -ENOMEM;
 
 	if (flags & MAP_FIXED)
@@ -1527,7 +1527,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
+		if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
 		    (!vma || addr + len <= vma->vm_start))
 			return addr;
 	}
--
1.8.3.2
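The two conditions this patch changes (the length check and the hint check) can be modelled outside the kernel. The following is only a sketch under stated assumptions: `toy_len_fits` and `toy_hint_ok` are hypothetical names, `TASK_SIZE` and `mmap_min_addr` are passed in as plain parameters, and the VMA overlap test (`!vma || addr + len <= vma->vm_start`) is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the patched length check:
 * the kernel returns -ENOMEM when len > TASK_SIZE - mmap_min_addr.
 */
static bool toy_len_fits(unsigned long len, unsigned long task_size,
			 unsigned long min_addr)
{
	assert(min_addr <= task_size);
	return len <= task_size - min_addr;
}

/*
 * Hypothetical model of the patched hint check: the hint is only taken
 * if the mapping ends below task_size and starts at or above min_addr.
 * The VMA overlap test from the real code is deliberately omitted.
 */
static bool toy_hint_ok(unsigned long addr, unsigned long len,
			unsigned long task_size, unsigned long min_addr)
{
	if (!toy_len_fits(len, task_size, min_addr))
		return false;
	return task_size - len >= addr && addr >= min_addr;
}
```

With `min_addr` set to 65536, a hint of 4096 is rejected by this model while a hint of 65536 is accepted, which is the behaviour the `addr >= mmap_min_addr` term adds.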
[PATCH 3.5 66/78] backlight: atmel-pwm-bl: fix reported brightness
3.5.7.26 -stable review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold

commit 185d91442550110db67a7dc794a32efcea455a36 upstream.

The driver supports 16-bit brightness values, but the value returned from
get_brightness was truncated to eight bits.

Signed-off-by: Johan Hovold
Cc: Jingoo Han
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Luis Henriques
---
 drivers/video/backlight/atmel-pwm-bl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/backlight/atmel-pwm-bl.c b/drivers/video/backlight/atmel-pwm-bl.c
index 0443a4f..4d2bbd8 100644
--- a/drivers/video/backlight/atmel-pwm-bl.c
+++ b/drivers/video/backlight/atmel-pwm-bl.c
@@ -70,7 +70,7 @@ static int atmel_pwm_bl_set_intensity(struct backlight_device *bd)
 static int atmel_pwm_bl_get_intensity(struct backlight_device *bd)
 {
 	struct atmel_pwm_bl *pwmbl = bl_get_data(bd);
-	u8 intensity;
+	u32 intensity;
 
 	if (pwmbl->pdata->pwm_active_low) {
 		intensity = pwm_channel_readl(&pwmbl->pwmc, PWM_CDTY) -
@@ -80,7 +80,7 @@ static int atmel_pwm_bl_get_intensity(struct backlight_device *bd)
 			pwm_channel_readl(&pwmbl->pwmc, PWM_CDTY);
 	}
 
-	return intensity;
+	return intensity & 0xffff;
 }
 
 static int atmel_pwm_bl_init_pwm(struct atmel_pwm_bl *pwmbl)
--
1.8.3.2
[PATCH 3.5 75/78] 8139cp: re-enable interrupts after tx timeout
3.5.7.26 -stable review patch.  If anyone has any objections, please let me know.

--

From: David Woodhouse

commit 01ffc0a7f1c1801a2354719dedbc32aff45b987d upstream.

Recovery doesn't work too well if we leave interrupts disabled...

Signed-off-by: David Woodhouse
Acked-by: Francois Romieu
Signed-off-by: David S. Miller
Cc: Nathan Williams
Signed-off-by: Luis Henriques
---
 drivers/net/ethernet/realtek/8139cp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index efd3e34..9ac8801 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -1252,6 +1252,7 @@ static void cp_tx_timeout(struct net_device *dev)
 	cp_clean_rings(cp);
 	rc = cp_init_rings(cp);
 	cp_start_hw(cp);
+	cp_enable_irq(cp);
 
 	netif_wake_queue(dev);
 
--
1.8.3.2
[PATCH 3.5 77/78] Fix a few incorrectly checked [io_]remap_pfn_range() calls
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Linus Torvalds commit 7314e613d5ff9f0934f7a0f74ed7973b903315d1 upstream. Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that really should use the vm_iomap_memory() helper. This trivially converts two of them to the helper, and comments about why the third one really needs to continue to use remap_pfn_range(), and adds the missing size check. Reported-by: Nico Golde Signed-off-by: Linus Torvalds Signed-off-by: Luis Henriques --- drivers/uio/uio.c| 16 +++- drivers/video/au1100fb.c | 26 +- drivers/video/au1200fb.c | 23 +-- 3 files changed, 17 insertions(+), 48 deletions(-) diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c index a783d53..7150752 100644 --- a/drivers/uio/uio.c +++ b/drivers/uio/uio.c @@ -650,16 +650,30 @@ static int uio_mmap_physical(struct vm_area_struct *vma) { struct uio_device *idev = vma->vm_private_data; int mi = uio_find_mem_index(vma); + struct uio_mem *mem; if (mi < 0) return -EINVAL; + mem = idev->info->mem + mi; + + if (vma->vm_end - vma->vm_start > mem->size) + return -EINVAL; vma->vm_flags |= VM_IO | VM_RESERVED; vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + /* +* We cannot use the vm_iomap_memory() helper here, +* because vma->vm_pgoff is the map index we looked +* up above in uio_find_mem_index(), rather than an +* actual page offset into the mmap. +* +* So we just do the physical mmap without a page +* offset. 
+*/ return remap_pfn_range(vma, vma->vm_start, - idev->info->mem[mi].addr >> PAGE_SHIFT, + mem->addr >> PAGE_SHIFT, vma->vm_end - vma->vm_start, vma->vm_page_prot); } diff --git a/drivers/video/au1100fb.c b/drivers/video/au1100fb.c index fe3b6ec..2169bc0 100644 --- a/drivers/video/au1100fb.c +++ b/drivers/video/au1100fb.c @@ -375,39 +375,15 @@ void au1100fb_fb_rotate(struct fb_info *fbi, int angle) int au1100fb_fb_mmap(struct fb_info *fbi, struct vm_area_struct *vma) { struct au1100fb_device *fbdev; - unsigned int len; - unsigned long start=0, off; fbdev = to_au1100fb_device(fbi); - if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) { - return -EINVAL; - } - - start = fbdev->fb_phys & PAGE_MASK; - len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len); - - off = vma->vm_pgoff << PAGE_SHIFT; - - if ((vma->vm_end - vma->vm_start + off) > len) { - return -EINVAL; - } - - off += start; - vma->vm_pgoff = off >> PAGE_SHIFT; - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); pgprot_val(vma->vm_page_prot) |= (6 << 9); //CCA=6 vma->vm_flags |= VM_IO; - if (io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT, - vma->vm_end - vma->vm_start, - vma->vm_page_prot)) { - return -EAGAIN; - } - - return 0; + return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len); } static struct fb_ops au1100fb_ops = diff --git a/drivers/video/au1200fb.c b/drivers/video/au1200fb.c index 7ca79f0..117be3d 100644 --- a/drivers/video/au1200fb.c +++ b/drivers/video/au1200fb.c @@ -1233,36 +1233,15 @@ static int au1200fb_fb_blank(int blank_mode, struct fb_info *fbi) * method mainly to allow the use of the TLB streaming flag (CCA=6) */ static int au1200fb_fb_mmap(struct fb_info *info, struct vm_area_struct *vma) - { - unsigned int len; - unsigned long start=0, off; struct au1200fb_device *fbdev = info->par; - if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) { - return -EINVAL; - } - - start = fbdev->fb_phys & PAGE_MASK; - len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len); - - off = vma->vm_pgoff << 
PAGE_SHIFT; - - if ((vma->vm_end - vma->vm_start + off) > len) { - return -EINVAL; - } - - off += start; - vma->vm_pgoff = off >> PAGE_SHIFT; - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); pgprot_val(vma->vm_page_prot) |= _CACHE_MASK; /* CCA=7 */ vma->vm_flags |= VM_IO; - return io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT, - vma->vm_end - vma->vm_start, - vma->vm_page_prot); + return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len); return 0; } -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
[PATCH 3.5 76/78] SUNRPC handle EKEYEXPIRED in call_refreshresult
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Andy Adamson commit eb96d5c97b0825d542e9c4ba5e0a22b519355166 upstream. Currently, when an RPCSEC_GSS context has expired or is non-existent and the users (Kerberos) credentials have also expired or are non-existent, the client receives the -EKEYEXPIRED error and tries to refresh the context forever. If an application is performing I/O, or other work against the share, the application hangs, and the user is not prompted to refresh/establish their credentials. This can result in a denial of service for other users. Users are expected to manage their Kerberos credential lifetimes to mitigate this issue. Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number of times to refresh the gss_context, and then return -EACCES to the application. Signed-off-by: Andy Adamson Signed-off-by: Trond Myklebust Cc: Ben Hutchings [ luis: backported to 3.5 based on bwh backport to 3.2: - adjusted context - dropped changes to nfs4_handle_reclaim_lease_error() ] Signed-off-by: Luis Henriques --- fs/nfs/nfs3proc.c | 6 +++--- fs/nfs/nfs4filelayout.c | 1 - fs/nfs/nfs4proc.c | 18 -- fs/nfs/nfs4state.c | 22 -- fs/nfs/proc.c | 43 --- net/sunrpc/clnt.c | 1 + 6 files changed, 4 insertions(+), 87 deletions(-) diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index fda63e9..c7eb313 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -24,14 +24,14 @@ #define NFSDBG_FACILITYNFSDBG_PROC -/* A wrapper to handle the EJUKEBOX and EKEYEXPIRED error messages */ +/* A wrapper to handle the EJUKEBOX error messages */ static int nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags) { int res; do { res = rpc_call_sync(clnt, msg, flags); - if (res != -EJUKEBOX && res != -EKEYEXPIRED) + if (res != -EJUKEBOX) break; freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME); res = -ERESTARTSYS; @@ -44,7 +44,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message 
*msg, int flags) static int nfs3_async_handle_jukebox(struct rpc_task *task, struct inode *inode) { - if (task->tk_status != -EJUKEBOX && task->tk_status != -EKEYEXPIRED) + if (task->tk_status != -EJUKEBOX) return 0; if (task->tk_status == -EJUKEBOX) nfs_inc_stats(inode, NFSIOS_DELAY); diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c index e134029..8445359 100644 --- a/fs/nfs/nfs4filelayout.c +++ b/fs/nfs/nfs4filelayout.c @@ -169,7 +169,6 @@ static int filelayout_async_handle_error(struct rpc_task *task, break; case -NFS4ERR_DELAY: case -NFS4ERR_GRACE: - case -EKEYEXPIRED: rpc_delay(task, FILELAYOUT_POLL_RETRY_MAX); break; case -NFS4ERR_RETRY_UNCACHED_REP: diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 594ec86..a89661e 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -341,7 +341,6 @@ static int nfs4_handle_exception(struct nfs_server *server, int errorcode, struc } case -NFS4ERR_GRACE: case -NFS4ERR_DELAY: - case -EKEYEXPIRED: ret = nfs4_delay(server->client, >timeout); if (ret != 0) break; @@ -1371,13 +1370,6 @@ int nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs4_state nfs_inode_find_state_and_recover(state->inode, stateid); nfs4_schedule_stateid_recovery(server, state); - case -EKEYEXPIRED: - /* -* User RPCSEC_GSS context has expired. -* We cannot recover this stateid now, so -* skip it and allow recovery thread to -* proceed. -*/ case -ENOMEM: err = 0; goto out; @@ -3949,7 +3941,6 @@ nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server, case -NFS4ERR_DELAY: nfs_inc_server_stats(server, NFSIOS_DELAY); case -NFS4ERR_GRACE: - case -EKEYEXPIRED: rpc_delay(task, NFS4_POLL_RETRY_MAX); task->tk_status = 0; return -EAGAIN; @@ -4906,15 +4897,6 @@ int nfs4_lock_delegation_recall(struct nfs4_state *state, struct file_lock *fl)
[PATCH 3.5 69/78] netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Patrick McHardy commit 3a7b21eaf4fb3c971bdb47a98f570550ddfe4471 upstream. Some Cisco phones create huge messages that are spread over multiple packets. After calculating the offset of the SIP body, it is validated to be within the packet and the packet is dropped otherwise. This breaks operation of these phones. Since connection tracking is supposed to be passive, just let those packets pass unmodified and untracked. Signed-off-by: Patrick McHardy Signed-off-by: Pablo Neira Ayuso Cc: William Roberts [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- net/netfilter/nf_conntrack_sip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c index 93faf6a..4a8c55b 100644 --- a/net/netfilter/nf_conntrack_sip.c +++ b/net/netfilter/nf_conntrack_sip.c @@ -1468,7 +1468,7 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int protoff, msglen = origlen = end - dptr; if (msglen > datalen) - return NF_DROP; + return NF_ACCEPT; ret = process_sip_msg(skb, ct, dataoff, , ); if (ret != NF_ACCEPT) -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 74/78] include/linux/fs.h: disable preempt when acquire i_size_seqcount write lock
3.5.7.26 -stable review patch.  If anyone has any objections, please let me know.

--

From: Fan Du

commit 74e3d1e17b2e11d175970b85acd44f5927000ba2 upstream.

Two rt tasks bind to one CPU core.

The higher priority rt task A preempts a lower priority rt task B which
has already taken the write seq lock, and then the higher priority rt
task A tries to acquire the read seq lock, so it is doomed to lock up.

rt task A with lower priority: call write
i_size_write                                     rt task B with higher priority: call sync, and preempt task A
  write_seqcount_begin(&inode->i_size_seqcount); i_size_read
  inode->i_size = i_size;                          read_seqcount_begin <-- lockup here...

So disable preempt when acquiring every i_size_seqcount *write* lock will
cure the problem.

Signed-off-by: Fan Du
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Cc: Zhao Hongjiang
Signed-off-by: Luis Henriques
---
 include/linux/fs.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 17fd887..65b8b69 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -925,9 +925,11 @@ static inline loff_t i_size_read(const struct inode *inode)
 static inline void i_size_write(struct inode *inode, loff_t i_size)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+	preempt_disable();
 	write_seqcount_begin(&inode->i_size_seqcount);
 	inode->i_size = i_size;
 	write_seqcount_end(&inode->i_size_seqcount);
+	preempt_enable();
 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
 	preempt_disable();
 	inode->i_size = i_size;
--
1.8.3.2
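The lockup described in this commit follows from how a sequence counter works: a reader has to retry for as long as the counter is odd, and the counter stays odd until the writer runs again. The following toy model is only a single-threaded sketch; the `toy_*` names are hypothetical, and the kernel's seqcount additionally uses memory barriers that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded model of a sequence counter (no barriers). */
struct toy_seqcount {
	unsigned int sequence;
};

static void toy_write_begin(struct toy_seqcount *s)
{
	s->sequence++;			/* odd: a write is in progress */
}

static void toy_write_end(struct toy_seqcount *s)
{
	assert(s->sequence & 1);	/* must pair with toy_write_begin() */
	s->sequence++;			/* even again: readers may proceed */
}

/* A reader has to keep retrying for as long as this returns true. */
static bool toy_reader_must_spin(const struct toy_seqcount *s)
{
	return (s->sequence & 1) != 0;
}
```

If the writer is preempted between `toy_write_begin()` and `toy_write_end()` by a higher-priority reader pinned to the same CPU, `toy_reader_must_spin()` never turns false, which is why the patch keeps preemption disabled across the write section.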
[PATCH 3.5 60/78] perf/ftrace: Fix paranoid level for enabling function tracer
3.5.7.26 -stable review patch.  If anyone has any objections, please let me know.

--

From: Steven Rostedt

commit 12ae030d54ef250706da5642fc7697cc60ad0df7 upstream.

The current default perf paranoid level is "1" which has
"perf_paranoid_kernel()" return false, and giving any operations that
use it, access to normal users. Unfortunately, this includes function
tracing and normal users should not be allowed to enable function
tracing by default.

The proper level is defined at "-1" (full perf access), which
"perf_paranoid_tracepoint_raw()" will only give access to. Use that
check instead for enabling function tracing.

Reported-by: Dave Jones
Reported-by: Vince Weaver
Tested-by: Vince Weaver
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Frederic Weisbecker
CVE: CVE-2013-2930
Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in perf")
Signed-off-by: Steven Rostedt
Signed-off-by: Luis Henriques
---
 kernel/trace/trace_event_perf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index fee3752..d01adb7 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -26,7 +26,7 @@ static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 {
 	/* The ftrace function trace is allowed only for root. */
 	if (ftrace_event_is_function(tp_event) &&
-	    perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
+	    perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
 	/* No tracing, just counting, so no obvious leak */
--
1.8.3.2
[PATCH 3.5 72/78] ARM: 7670/1: fix the memset fix
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Nicolas Pitre commit 418df63adac56841ef6b0f1fcf435bc64d4ed177 upstream. Commit 455bd4c430b0 ("ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations") attempted to fix a compliance issue with the memset return value. However the memset itself became broken by that patch for misaligned pointers. This fixes the above by branching over the entry code from the misaligned fixup code to avoid reloading the original pointer. Also, because the function entry alignment is wrong in the Thumb mode compilation, that fixup code is moved to the end. While at it, the entry instructions are slightly reworked to help dual issue pipelines. Signed-off-by: Nicolas Pitre Tested-by: Alexander Holler Signed-off-by: Russell King Cc: Eric Bénard Signed-off-by: Luis Henriques --- arch/arm/lib/memset.S | 33 + 1 file changed, 13 insertions(+), 20 deletions(-) diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S index d912e73..94b0650 100644 --- a/arch/arm/lib/memset.S +++ b/arch/arm/lib/memset.S @@ -14,31 +14,15 @@ .text .align 5 - .word 0 - -1: subsr2, r2, #4 @ 1 do we have enough - blt 5f @ 1 bytes to align with? - cmp r3, #2 @ 1 - strltb r1, [ip], #1@ 1 - strleb r1, [ip], #1@ 1 - strbr1, [ip], #1@ 1 - add r2, r2, r3 @ 1 (r2 = r2 - (4 - r3)) -/* - * The pointer is now aligned and the length is adjusted. Try doing the - * memset again. - */ ENTRY(memset) -/* - * Preserve the contents of r0 for the return value. - */ - mov ip, r0 - andsr3, ip, #3 @ 1 unaligned? - bne 1b @ 1 + andsr3, r0, #3 @ 1 unaligned? + mov ip, r0 @ preserve r0 as return value + bne 6f @ 1 /* * we know that the pointer in ip is aligned to a word boundary. 
*/ - orr r1, r1, r1, lsl #8 +1: orr r1, r1, r1, lsl #8 orr r1, r1, r1, lsl #16 mov r3, r1 cmp r2, #16 @@ -127,4 +111,13 @@ ENTRY(memset) tst r2, #1 strneb r1, [ip], #1 mov pc, lr + +6: subsr2, r2, #4 @ 1 do we have enough + blt 5b @ 1 bytes to align with? + cmp r3, #2 @ 1 + strltb r1, [ip], #1@ 1 + strleb r1, [ip], #1@ 1 + strbr1, [ip], #1@ 1 + add r2, r2, r3 @ 1 (r2 = r2 - (4 - r3)) + b 1b ENDPROC(memset) -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 78/78] crypto: ansi_cprng - Fix off by one error in non-block size request
3.5.7.26 -stable review patch.  If anyone has any objections, please let me know.

--

From: Neil Horman

commit 714b33d15130cbb5ab426456d4e3de842d6c5b8a upstream.

Stephan Mueller reported to me recently an error in random number
generation in the ansi cprng. If several small requests are made that are
less than the instance's block size, the remainder for loop code doesn't
increment rand_data_valid in the last iteration, meaning that the last
bytes in the rand_data buffer get reused on the subsequent
smaller-than-a-block request for random data.

The fix is pretty easy, just re-code the for loop to make sure that
rand_data_valid gets incremented appropriately.

Signed-off-by: Neil Horman
Reported-by: Stephan Mueller
CC: Stephan Mueller
CC: Petr Matousek
CC: Herbert Xu
CC: "David S. Miller"
Signed-off-by: Herbert Xu
Signed-off-by: Luis Henriques
---
 crypto/ansi_cprng.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/ansi_cprng.c b/crypto/ansi_cprng.c
index 6ddd99e..c21f761 100644
--- a/crypto/ansi_cprng.c
+++ b/crypto/ansi_cprng.c
@@ -230,11 +230,11 @@ remainder:
 	 */
 	if (byte_count < DEFAULT_BLK_SZ) {
 empty_rbuf:
-		for (; ctx->rand_data_valid < DEFAULT_BLK_SZ;
-			ctx->rand_data_valid++) {
+		while (ctx->rand_data_valid < DEFAULT_BLK_SZ) {
 			*ptr = ctx->rand_data[ctx->rand_data_valid];
 			ptr++;
 			byte_count--;
+			ctx->rand_data_valid++;
 			if (byte_count == 0)
 				goto done;
 		}
--
1.8.3.2
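The off-by-one comes from leaving the loop early: the increment in a `for` header only runs when the body completes normally, so the `goto done` skips it. Here is a stand-alone sketch of both loop shapes; the `toy_*` names and `TOY_BLK_SZ` are hypothetical stand-ins, and an early `return` plays the role of `goto done`.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLK_SZ 16

/* Pre-fix shape: the index is advanced in the for header. */
static size_t toy_drain_old(const unsigned char *rand_data, size_t *valid,
			    unsigned char *out, size_t byte_count)
{
	size_t copied = 0;

	assert(byte_count > 0);
	for (; *valid < TOY_BLK_SZ; (*valid)++) {
		out[copied++] = rand_data[*valid];
		byte_count--;
		if (byte_count == 0)
			return copied;	/* early exit skips (*valid)++ */
	}
	return copied;
}

/* Post-fix shape: the index is advanced before the early exit. */
static size_t toy_drain_new(const unsigned char *rand_data, size_t *valid,
			    unsigned char *out, size_t byte_count)
{
	size_t copied = 0;

	assert(byte_count > 0);
	while (*valid < TOY_BLK_SZ) {
		out[copied++] = rand_data[*valid];
		byte_count--;
		(*valid)++;
		if (byte_count == 0)
			return copied;
	}
	return copied;
}
```

After a four-byte request from index 0, the old shape leaves the index at 3, so the next request starts with the byte that was just handed out; the new shape leaves it at 4.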
[PATCH 3.5 65/78] vsprintf: check real user/group id for %pK
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Ryan Mallon commit 312b4e226951f707e120b95b118cbc14f3d162b2 upstream. Some setuid binaries will allow reading of files which have read permission by the real user id. This is problematic with files which use %pK because the file access permission is checked at open() time, but the kptr_restrict setting is checked at read() time. If a setuid binary opens a %pK file as an unprivileged user, and then elevates permissions before reading the file, then kernel pointer values may be leaked. This happens for example with the setuid pppd application on Ubuntu 12.04: $ head -1 /proc/kallsyms T startup_32 $ pppd file /proc/kallsyms pppd: In file /proc/kallsyms: unrecognized option 'c100' This will only leak the pointer value from the first line, but other setuid binaries may leak more information. Fix this by adding a check that in addition to the current process having CAP_SYSLOG, that effective user and group ids are equal to the real ids. If a setuid binary reads the contents of a file which uses %pK then the pointer values will be printed as NULL if the real user is unprivileged. Update the sysctl documentation to reflect the changes, and also correct the documentation to state the kptr_restrict=0 is the default. This is a only temporary solution to the issue. The correct solution is to do the permission check at open() time on files, and to replace %pK with a function which checks the open() time permission. %pK uses in printk should be removed since no sane permission check can be done, and instead protected by using dmesg_restrict. Signed-off-by: Ryan Mallon Cc: Kees Cook Cc: Alexander Viro Cc: Joe Perches Cc: "Eric W. 
Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- Documentation/sysctl/kernel.txt | 25 ++--- lib/vsprintf.c | 33 ++--- 2 files changed, 48 insertions(+), 10 deletions(-) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 6d78841..99d8ab9 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -284,13 +284,24 @@ Default value is "/sbin/hotplug". kptr_restrict: This toggle indicates whether restrictions are placed on -exposing kernel addresses via /proc and other interfaces. When -kptr_restrict is set to (0), there are no restrictions. When -kptr_restrict is set to (1), the default, kernel pointers -printed using the %pK format specifier will be replaced with 0's -unless the user has CAP_SYSLOG. When kptr_restrict is set to -(2), kernel pointers printed using %pK will be replaced with 0's -regardless of privileges. +exposing kernel addresses via /proc and other interfaces. + +When kptr_restrict is set to (0), the default, there are no restrictions. + +When kptr_restrict is set to (1), kernel pointers printed using the %pK +format specifier will be replaced with 0's unless the user has CAP_SYSLOG +and effective user and group ids are equal to the real ids. This is +because %pK checks are done at read() time rather than open() time, so +if permissions are elevated between the open() and the read() (e.g via +a setuid binary) then %pK will not leak kernel pointers to unprivileged +users. Note, this is a temporary solution only. The correct long-term +solution is to do the permission checks at open() time. Consider removing +world read permissions from files that use %pK, and using dmesg_restrict +to protect against uses of %pK in dmesg(8) if leaking kernel pointer +values to unprivileged users is a concern. 
+ +When kptr_restrict is set to (2), kernel pointers printed using +%pK will be replaced with 0's regardless of privileges. == diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 598a73e..b82f4ba 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include /* for PAGE_SIZE */ @@ -1036,11 +1037,37 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr, spec.field_width = default_width; return string(buf, end, "pK-error", spec); } - if (!((kptr_restrict == 0) || - (kptr_restrict == 1 && - has_capability_noaudit(current, CAP_SYSLOG + + switch (kptr_restrict) { + case 0: + /* Always print %pK values */ + break; + case 1: { + /* +* Only print the real pointer value if the current +* process has CAP_SYSLOG and is running with the +* same credentials it
[PATCH 3.5 46/78] alarmtimer: return EINVAL instead of ENOTSUPP if rtcdev doesn't exist
3.5.7.26 -stable review patch.  If anyone has any objections, please let me know.

--

From: KOSAKI Motohiro

commit 98d6f4dd84a134d942827584a3c5f67ffd8ec35f upstream.

The Fedora Ruby maintainer reported that the latest Ruby doesn't work on
Fedora Rawhide on ARM. (http://bugs.ruby-lang.org/issues/9008)

This is because commit 1c6b39ad3f (alarmtimers: Return -ENOTSUPP if no
RTC device is present) introduced returning ENOTSUPP when
clock_get{time,res} can't find an RTC device. However, this is incorrect.

First, ENOTSUPP isn't exported to userland (ENOTSUP or EOPNOTSUPP are the
closest userland equivalents).

Second, POSIX and Linux man pages agree that clock_gettime and
clock_getres should return EINVAL if the clk_id argument is invalid.
While the argument that the clockid is valid, but just not supported on
this hardware could be made, this is just a technicality that doesn't help
userspace applications, and only complicates error handling.

Thus, this patch changes the code to use EINVAL.

Cc: Thomas Gleixner
Cc: Frederic Weisbecker
Reported-by: Vit Ondruch
Signed-off-by: KOSAKI Motohiro
[jstultz: Tweaks to commit message to include full rationale]
Signed-off-by: John Stultz
Signed-off-by: Luis Henriques
---
 kernel/time/alarmtimer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index aa27d39..2cfe9a5 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -474,7 +474,7 @@ static int alarm_clock_getres(const clockid_t which_clock, struct timespec *tp)
 	clockid_t baseid = alarm_bases[clock2alarm(which_clock)].base_clockid;
 
 	if (!alarmtimer_get_rtcdev())
-		return -ENOTSUPP;
+		return -EINVAL;
 
 	return hrtimer_get_res(baseid, tp);
 }
@@ -491,7 +491,7 @@ static int alarm_clock_get(clockid_t which_clock, struct timespec *tp)
 	struct alarm_base *base = &alarm_bases[clock2alarm(which_clock)];
 
 	if (!alarmtimer_get_rtcdev())
-		return -ENOTSUPP;
+		return -EINVAL;
 
 	*tp = ktime_to_timespec(base->gettime());
 	return 0;
--
1.8.3.2
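From the userspace side, the practical effect is that a caller only needs to treat EINVAL as "this clock is not usable here". A minimal sketch of such a probe follows; `clock_probe` is a hypothetical helper name, and only the standard `clock_getres()` call is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: returns 1 if the clock is usable (and stores its
 * resolution in *res), 0 if the kernel rejects the clock id with EINVAL,
 * and -1 on any other error.
 */
static int clock_probe(clockid_t id, struct timespec *res)
{
	assert(res != NULL);

	if (clock_getres(id, res) == 0)
		return 1;
	return errno == EINVAL ? 0 : -1;
}
```

A caller could pass an alarm clock id such as `CLOCK_REALTIME_ALARM` where the C library exposes it, and fall back to `CLOCK_REALTIME` when the probe returns 0.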
[PATCH 3.5 61/78] ALSA: hda - Add support for CX20952
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Takashi Iwai commit 8f42d7698751a45cd9f7134a5da49bc5b6206179 upstream. It's a superset of the existing CX2075x codecs, so we can reuse the existing parser code. Signed-off-by: Takashi Iwai Signed-off-by: Luis Henriques --- sound/pci/hda/patch_conexant.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c index 5fb90c6..5a48081 100644 --- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -4594,6 +4594,8 @@ static const struct hda_codec_preset snd_hda_preset_conexant[] = { .patch = patch_conexant_auto }, { .id = 0x14f15115, .name = "CX20757", .patch = patch_conexant_auto }, + { .id = 0x14f151d7, .name = "CX20952", + .patch = patch_conexant_auto }, {} /* terminator */ }; @@ -4620,6 +4622,7 @@ MODULE_ALIAS("snd-hda-codec-id:14f15111"); MODULE_ALIAS("snd-hda-codec-id:14f15113"); MODULE_ALIAS("snd-hda-codec-id:14f15114"); MODULE_ALIAS("snd-hda-codec-id:14f15115"); +MODULE_ALIAS("snd-hda-codec-id:14f151d7"); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Conexant HD-audio codec"); -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.5 56/78] can: c_can: Fix RX message handling, handle lost message before EOB
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Markus Pargmann commit 5d0f801a2ccec3b1fdabc3392c8d99ed0413d216 upstream. If we handle end of block messages with higher priority than a lost message, we can run into an endless interrupt loop. This is reproducible with an am335x processor and "cansequence -r" at 1Mbit. As soon as we lose a packet we can't escape from an interrupt loop. This patch fixes the problem by handling lost packets before EOB packets. Signed-off-by: Markus Pargmann Signed-off-by: Marc Kleine-Budde [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- drivers/net/can/c_can/c_can.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c index 64647d4..91d1b5a 100644 --- a/drivers/net/can/c_can/c_can.c +++ b/drivers/net/can/c_can/c_can.c @@ -764,9 +764,6 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota) msg_ctrl_save = priv->read_reg(priv, &priv->regs->ifregs[0].msg_cntrl); - if (msg_ctrl_save & IF_MCONT_EOB) - return num_rx_pkts; - if (msg_ctrl_save & IF_MCONT_MSGLST) { c_can_handle_lost_msg_obj(dev, 0, msg_obj); num_rx_pkts++; @@ -774,6 +771,9 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota) continue; } + if (msg_ctrl_save & IF_MCONT_EOB) + return num_rx_pkts; + if (!(msg_ctrl_save & IF_MCONT_NEWDAT)) continue; -- 1.8.3.2
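The effect of the reordering can be sketched outside the driver. The following is a simplified model, not the c_can code: the DEMO_* flag values and demo_rx_poll() are made up for illustration, and the loop only mirrors the order of the three checks after the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; the real IF_MCONT_* bits live in the driver. */
#define DEMO_NEWDAT 0x1u
#define DEMO_MSGLST 0x2u
#define DEMO_EOB    0x4u

/*
 * Walk message-object control words and count handled packets.
 * The lost-message check comes before the end-of-block check, so an
 * object that has both bits set is still processed instead of being
 * left pending behind an early return.
 */
static int demo_rx_poll(const unsigned *ctrl, size_t n, int *lost)
{
    int handled = 0;

    assert(ctrl != NULL && lost != NULL);
    *lost = 0;
    for (size_t i = 0; i < n; i++) {
        if (ctrl[i] & DEMO_MSGLST) {
            (*lost)++;          /* stands in for c_can_handle_lost_msg_obj() */
            handled++;
            continue;
        }
        if (ctrl[i] & DEMO_EOB)
            return handled;
        if (!(ctrl[i] & DEMO_NEWDAT))
            continue;
        handled++;
    }
    return handled;
}
```

With the checks in the pre-patch order, the object carrying both MSGLST and EOB would hit the early return and never be cleaned up, which matches the endless interrupt loop described in the commit message.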
[PATCH 3.5 45/78] rt2800usb: slow down TX status polling
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Stanislaw Gruszka commit 36165fd5b00bf8163f89c21bb16a3e9834555b10 upstream. Polling TX statuses too frequently has two negative effects. The first is random peaks in CPU usage, which delay overall system functioning. The second is that the device is not able to fill TX statuses in the H/W register on some workloads, and we get a lot of timeouts like the ones below: ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2 ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2 ieee80211 phy4: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping This not only causes a flood of messages in dmesg, but also bad throughput, since the rate scaling algorithm cannot work optimally. In the future, we should probably make the polling interval adjust automatically, but for now just increase the values, which makes the mentioned problems go away. Resolve: https://bugzilla.kernel.org/show_bug.cgi?id=62781 Signed-off-by: Stanislaw Gruszka Signed-off-by: John W. Linville Signed-off-by: Luis Henriques --- drivers/net/wireless/rt2x00/rt2800usb.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/rt2x00/rt2800usb.c b/drivers/net/wireless/rt2x00/rt2800usb.c index fe42f76..c30797e 100644 --- a/drivers/net/wireless/rt2x00/rt2800usb.c +++ b/drivers/net/wireless/rt2x00/rt2800usb.c @@ -143,6 +143,8 @@ static bool rt2800usb_txstatus_timeout(struct rt2x00_dev *rt2x00dev) return false; } +#define TXSTATUS_READ_INTERVAL 1000000 + static bool rt2800usb_tx_sta_fifo_read_completed(struct rt2x00_dev *rt2x00dev, int urb_status, u32 tx_status) { @@ -170,8 +172,9 @@ static bool rt2800usb_tx_sta_fifo_read_completed(struct rt2x00_dev *rt2x00dev, queue_work(rt2x00dev->workqueue, &rt2x00dev->txdone_work); if (rt2800usb_txstatus_pending(rt2x00dev)) { - /* Read register after 250 us */ - hrtimer_start(&rt2x00dev->txstatus_timer, ktime_set(0, 250000), + /* Read register after 1 ms */ + hrtimer_start(&rt2x00dev->txstatus_timer, + ktime_set(0, TXSTATUS_READ_INTERVAL), HRTIMER_MODE_REL); return false; } @@ -196,8 +199,9 @@ static void rt2800usb_async_read_tx_status(struct rt2x00_dev *rt2x00dev) if (test_and_set_bit(TX_STATUS_READING, &rt2x00dev->flags)) return; - /* Read TX_STA_FIFO register after 500 us */ - hrtimer_start(&rt2x00dev->txstatus_timer, ktime_set(0, 500000), + /* Read TX_STA_FIFO register after 2 ms */ + hrtimer_start(&rt2x00dev->txstatus_timer, + ktime_set(0, 2*TXSTATUS_READ_INTERVAL), HRTIMER_MODE_REL); } -- 1.8.3.2
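The comments in the patch move the poll delays from 250 us and 500 us to 1 ms and 2 ms. Since ktime_set() takes seconds and nanoseconds, the arithmetic behind those comments can be sketched as follows. The DEMO_* constants and helper names are illustrative only and do not appear in the driver.

```c
#include <assert.h>

#define DEMO_NSEC_PER_USEC 1000L
#define DEMO_NSEC_PER_SEC  1000000000L

/* Convert a delay given in microseconds to the nanosecond argument
 * that a (seconds, nanoseconds) timer API expects. */
static long demo_usecs_to_nsecs(long usecs)
{
    assert(usecs >= 0 && usecs < 1000000L);
    return usecs * DEMO_NSEC_PER_USEC;
}

/* Upper bound on register polls per second for a given interval. */
static long demo_polls_per_sec(long interval_ns)
{
    assert(interval_ns > 0);
    return DEMO_NSEC_PER_SEC / interval_ns;
}
```

Under these assumptions, going from a 250 us to a 1 ms interval lowers the worst-case poll rate from 4000 to 1000 reads per second, which is consistent with the reduced CPU peaks the commit message describes.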
[PATCH 3.5 34/78] scripts/kallsyms: filter symbols not in kernel address space
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Ming Lei commit f6537f2f0eba4eba3354e48dbe3047db6d8b6254 upstream. This patch uses CONFIG_PAGE_OFFSET to filter out symbols that are not in the kernel address space. These symbols generally exist for code generation purposes and can't be run in kernel mode, so we needn't keep them in /proc/kallsyms. For example, on ARM some symbols may be linked into a relocatable code section, and then perf can no longer parse symbols from /proc/kallsyms. This patch fixes the problem (introduced by b9b32bf70f2fb710b07c94e13afbc729afe221da). Cc: Russell King Cc: linux-arm-ker...@lists.infradead.org Cc: Michal Marek Signed-off-by: Ming Lei Signed-off-by: Rusty Russell [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- scripts/kallsyms.c | 12 +++++++++++- scripts/link-vmlinux.sh | 2 ++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index 487ac6f..9a11f9f 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -55,6 +55,7 @@ static struct sym_entry *table; static unsigned int table_size, table_cnt; static int all_symbols = 0; static char symbol_prefix_char = '\0'; +static unsigned long long kernel_start_addr = 0; int token_profit[0x10000]; @@ -65,7 +66,10 @@ unsigned char best_table_len[256]; static void usage(void) { - fprintf(stderr, "Usage: kallsyms [--all-symbols] [--symbol-prefix=] < in.map > out.S\n"); + fprintf(stderr, "Usage: kallsyms [--all-symbols] " + "[--symbol-prefix=] " + "[--page-offset=] " + "< in.map > out.S\n"); exit(1); } @@ -194,6 +198,9 @@ static int symbol_valid(struct sym_entry *s) int i; int offset = 1; + if (s->addr < kernel_start_addr) + return 0; + /* skip prefix char */ if (symbol_prefix_char && *(s->sym + 1) == symbol_prefix_char) offset++; @@ -646,6 +653,9 @@ int main(int argc, char **argv) if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\'')) p++; symbol_prefix_char = *p; + } else if (strncmp(argv[i], "--page-offset=", 14) == 0) { + const char *p = &argv[i][14]; + kernel_start_addr = strtoull(p, NULL, 16); } else usage(); } diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh index cd9c6c6..7a9f7f9 100644 --- a/scripts/link-vmlinux.sh +++ b/scripts/link-vmlinux.sh @@ -78,6 +78,8 @@ kallsyms() kallsymopt=--all-symbols fi + kallsymopt="${kallsymopt} --page-offset=$CONFIG_PAGE_OFFSET" + local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \ ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}" -- 1.8.3.2
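The option parsing and the address filter from the patch can be tried in isolation. This is a standalone sketch, not the kallsyms source: the demo_* names are hypothetical, and only the strncmp()/strtoull() pattern and the "address below start is invalid" rule are taken from the diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Lowest address considered part of the kernel; 0 keeps every symbol. */
static unsigned long long demo_kernel_start;

/*
 * Parse "--page-offset=<hex>" the same way the patch does: match the
 * option prefix, then read the remainder as a base-16 number.
 * Returns 1 if the argument was recognized, 0 otherwise.
 */
static int demo_parse_page_offset(const char *arg)
{
    static const char opt[] = "--page-offset=";
    const size_t len = sizeof(opt) - 1;    /* 14, as hard-coded in the patch */

    assert(arg != NULL);
    if (strncmp(arg, opt, len) != 0)
        return 0;
    demo_kernel_start = strtoull(arg + len, NULL, 16);
    return 1;
}

/* A symbol below the kernel start address is filtered out. */
static int demo_symbol_valid(unsigned long long addr)
{
    return addr >= demo_kernel_start;
}
```

Because strtoull() with base 16 accepts an optional 0x prefix, both "c0000000" and "0xC0000000" would parse to the same start address here.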
[PATCH 3.5 42/78] usb: hub: Clear Port Reset Change during init/resume
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Julius Werner commit e92aee330837e4911553761490a8fb843f2053a6 upstream. This patch adds the Port Reset Change flag to the set of bits that are preemptively cleared on init/resume of a hub. In theory this bit should never be set unexpectedly... in practice it can still happen if BIOS, SMM or ACPI code plays around with USB devices without cleaning up correctly. This is especially dangerous for XHCI root hubs, which don't generate any more Port Status Change Events until all change bits are cleared, so this is a good precaution to have (similar to how it's already done for the Warm Port Reset Change flag). Signed-off-by: Julius Werner Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman [ luis: backported to 3.5: - adjusted context - replaced usb_clear_port_feature() by clear_port_feature() ] Signed-off-by: Luis Henriques --- drivers/usb/core/hub.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index 86c7421..b5503b0 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -1142,6 +1142,11 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type) clear_port_feature(hub->hdev, port1, USB_PORT_FEAT_C_ENABLE); } + if (portchange & USB_PORT_STAT_C_RESET) { + need_debounce_delay = true; + clear_port_feature(hub->hdev, port1, + USB_PORT_FEAT_C_RESET); + } if ((portchange & USB_PORT_STAT_C_BH_RESET) && hub_is_superspeed(hub->hdev)) { need_debounce_delay = true; -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
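As a rough model of what the patch adds to hub_activate(), the sketch below computes which change bits would be cleared for a given port. It is not the hub driver: the DEMO_* masks and demo_stale_changes() are illustrative assumptions, and only the three conditions are modelled on the code visible in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative change-bit masks; treat the exact values as assumptions. */
#define DEMO_C_ENABLE   0x0002u
#define DEMO_C_RESET    0x0010u
#define DEMO_C_BH_RESET 0x0020u

/*
 * Return the set of stale change bits that should be cleared on
 * init/resume, and report whether a debounce delay is needed.
 * The plain reset-change bit is cleared for every hub, while the
 * warm (BH) reset-change bit only applies to SuperSpeed hubs.
 */
static unsigned demo_stale_changes(unsigned portchange, int superspeed,
                                   int *need_debounce)
{
    unsigned to_clear = 0;

    assert(need_debounce != NULL);
    *need_debounce = 0;
    if (portchange & DEMO_C_ENABLE)
        to_clear |= DEMO_C_ENABLE;
    if (portchange & DEMO_C_RESET)
        to_clear |= DEMO_C_RESET;
    if ((portchange & DEMO_C_BH_RESET) && superspeed)
        to_clear |= DEMO_C_BH_RESET;
    if (to_clear)
        *need_debounce = 1;
    return to_clear;
}
```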
[PATCH 3.5 16/78] md: Fix skipping recovery for read-only arrays.
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Lukasz Dorau commit 61e4947c99c4494336254ec540c50186d186150b upstream. Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. For the raid1 and raid10 personalities, this causes not-in-sync devices to be marked in-sync without checking whether recovery has finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered), recovery will be skipped. This patch adds a check that recovery has finished before marking a device in-sync for the raid1 and raid10 personalities. For the raid5 personality, such a condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Signed-off-by: Pawel Baldysiak Signed-off-by: Lukasz Dorau Signed-off-by: NeilBrown Signed-off-by: Luis Henriques --- drivers/md/raid1.c | 1 + drivers/md/raid10.c | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index aa58c02..0d15abe 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1354,6 +1354,7 @@ static int raid1_spare_active(struct mddev *mddev) } } if (rdev + && rdev->recovery_offset == MaxSector && !test_bit(Faulty, &rdev->flags) && !test_and_set_bit(In_sync, &rdev->flags)) { count++; diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 5ad042c..2070e9c 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1630,6 +1630,7 @@ static int raid10_spare_active(struct mddev *mddev) } sysfs_notify_dirent_safe(tmp->replacement->sysfs_state); } else if (tmp->rdev + && tmp->rdev->recovery_offset == MaxSector && !test_bit(Faulty, &tmp->rdev->flags) && !test_and_set_bit(In_sync, &tmp->rdev->flags)) { count++; -- 1.8.3.2
[PATCH 3.5 02/78] jfs: fix error path in ialloc
3.5.7.26 -stable review patch. If anyone has any objections, please let me know. -- From: Dave Kleikamp commit 8660998608cfa1077e560034db81885af8e1e885 upstream. If insert_inode_locked() fails, we shouldn't be calling unlock_new_inode(). Signed-off-by: Dave Kleikamp Tested-by: Michael L. Semon Signed-off-by: Luis Henriques --- fs/jfs/jfs_inode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/jfs/jfs_inode.c b/fs/jfs/jfs_inode.c index c1a3e60..7f464c5 100644 --- a/fs/jfs/jfs_inode.c +++ b/fs/jfs/jfs_inode.c @@ -95,7 +95,7 @@ struct inode *ialloc(struct inode *parent, umode_t mode) if (insert_inode_locked(inode) < 0) { rc = -EINVAL; - goto fail_unlock; + goto fail_put; } inode_init_owner(inode, parent, mode); @@ -156,7 +156,6 @@ struct inode *ialloc(struct inode *parent, umode_t mode) fail_drop: dquot_drop(inode); inode->i_flags |= S_NOQUOTA; -fail_unlock: clear_nlink(inode); unlock_new_inode(inode); fail_put: -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
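The label change follows the usual goto-unwind pattern: each failure jumps to the label that undoes only what has already been done. A minimal sketch, with made-up names and flags that merely record which cleanup steps ran:

```c
#include <assert.h>

/* Flags recording which cleanup steps were executed (illustrative). */
enum { DEMO_DID_UNLOCK = 1, DEMO_DID_PUT = 2 };

/*
 * Model of the error paths: if inserting the inode fails, it was never
 * locked, so control goes straight to fail_put and the unlock step is
 * skipped. Later failures enter at fail_drop and fall through both steps.
 * Returns 0 on success, otherwise a mask of the cleanup steps taken.
 */
static unsigned demo_ialloc(int insert_fails, int setup_fails)
{
    unsigned cleanup = 0;

    if (insert_fails)
        goto fail_put;
    if (setup_fails)
        goto fail_drop;
    return 0;

fail_drop:
    cleanup |= DEMO_DID_UNLOCK;     /* stands in for unlock_new_inode() */
fail_put:
    cleanup |= DEMO_DID_PUT;        /* stands in for dropping the reference */
    assert(cleanup & DEMO_DID_PUT); /* every failure path drops the reference */
    return cleanup;
}
```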
Re: [PATCH 3/3] dt: binding documentation for bq2415x charger
On Sun 2013-11-24 17:49:31, Sebastian Reichel wrote: > Add devicetree binding documentation for bq2415x charger. > > Signed-off-by: Sebastian Reichel Thanks! > +- reg: integer, i2c address of the device > +- ti,current-limit: integer, current limit in mA Does this need to be "ti," specific? Most of fields are likely to be needed for other li-ion chargers "Specifies maximum current charger can pull from power supply" ? > +- ti,charge-current: integer, charging current in mA ...why/how is it different from current-limit? Is the current-limit on 5V, while the charge-current relative to battery voltage... so that charge-current can be > current-limit when battery voltage is low? "Maximum current that will be supplied to the battery, as determined by voltage on current sense resistor"? > +- ti,weak-battery-voltage: integer, weak battery voltage threshold in mV It would be good to explain what this threshold means. Voltage so low that system needs to be shut down? Hmm, it looks to me like it is "as long as battery is below this voltage, fast charge is not started. Instead, slow 'precharge' is performed." > +- ti,battery-regulation-voltage: integer, battery regulation > voltage in mV "Selects maximum voltage for charging." > +- ti,termination-current: integer, termination current in mA "When current in constant-voltage phase drops below this value, charge is terminated". -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc
On Mon, Nov 25, 2013 at 8:11 AM, Xiao Guangrong wrote: > > On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote: I'm not really following, but note that parent_pte predates EPT (and the use of rcu in kvm), so all the complexity that is the result of trying to pack as many list entries into a cache line can be dropped. Most setups now would have exactly one list entry, which is handled specially anyway. Alternatively, the trick of storing multiple entries in one list entry can be moved to generic code, it may be useful to others.
Re: [PATCH] powerpc/4xx: Fix warning in kilauea.dtb
On Mon, Nov 25, 2013 at 4:40 AM, Ian Campbell wrote: > Currently I see: > DTC arch/powerpc/boot/kilauea.dtb > Warning (reg_format): "reg" property in /plb/ppc4xx-msi@C1000 has invalid > length (12 bytes) (#address-cells == 1, #size-cells == 1) > > It appears that unlike the other platforms handled by 3fb7933850fa > "powerpc/4xx: Adding PCIe MSI support" this platform does not use > address-cells=2. > > Signed-off-by: Ian Campbell > Acked-by: Josh Boyer > Cc: Rupjyoti Sarmah > Cc: Tirumala R Marri > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: devicet...@vger.kernel.org (open list:OPEN FIRMWARE AND...) > Cc: linuxppc-...@lists.ozlabs.org > Cc: linux-kernel@vger.kernel.org > --- > Resending, this hasn't been picked up since June > http://patchwork.ozlabs.org/patch/248234/ Ben, please pick this up. josh > --- > arch/powerpc/boot/dts/kilauea.dts |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/boot/dts/kilauea.dts > b/arch/powerpc/boot/dts/kilauea.dts > index 1613d6e..5ba7f01 100644 > --- a/arch/powerpc/boot/dts/kilauea.dts > +++ b/arch/powerpc/boot/dts/kilauea.dts > @@ -406,7 +406,7 @@ > > MSI: ppc4xx-msi@C1000 { > compatible = "amcc,ppc4xx-msi", "ppc4xx-msi"; > - reg = < 0x0 0xEF62 0x100>; > + reg = <0xEF62 0x100>; > sdr-base = <0x4B0>; > msi-data = <0x>; > msi-mask = <0x>; > -- > 1.7.10.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
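The warning boils down to simple cell arithmetic: each cell is 4 bytes, and a "reg" property must hold a whole number of (address, size) entries. A sketch of that check, with a hypothetical helper name and no claim to match the dtc implementation in detail:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CELL_BYTES 4u   /* one devicetree cell is a 32-bit value */

/*
 * Return 1 if a "reg" property of len_bytes is a non-empty whole number
 * of entries, where one entry is addr_cells + size_cells cells long.
 */
static int demo_reg_len_ok(size_t len_bytes, unsigned addr_cells,
                           unsigned size_cells)
{
    size_t entry = (size_t)(addr_cells + size_cells) * DEMO_CELL_BYTES;

    assert(entry > 0);
    return len_bytes > 0 && len_bytes % entry == 0;
}
```

With #address-cells == 1 and #size-cells == 1 an entry is 8 bytes, so the original three-cell value (12 bytes) fails the check, while the two-cell value from the patch (8 bytes) passes. The same 12 bytes would be fine on a platform that uses two address cells.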
Re: [PATCH] clk: tegra: use pll_ref as the pll_e parent
On Fri, Nov 22, 2013 at 02:40:35PM +0100, Peter De Schrijver wrote: > On Wed, Oct 30, 2013 at 01:41:29AM +0100, Peter De Schrijver wrote: > > Use pll_ref instead of pll_re_vco as the pll_e parent on Tegra114 and > > Tegra124. Also add a pll_ref table entry for pll_e for Tegra114. > > > > Signed-off-by: Peter De Schrijver > > I will squash this into the next tegra-clk-next as the previous pull request > has never made it. > Looking again at this, I think the Tegra114 and generic PLL part of the patch better stays as a separate patch. The Tegra124 bit will be squashed into the Tegra124 support patch. Cheers, Peter.
Re: Fwd: [PATCH 7/8] watchdog: davinci: add "clocks" property
On 11/25/2013 02:15 PM, Mark Rutland wrote: On Mon, Nov 25, 2013 at 10:59:45AM +, ivan.khoronzhuk wrote: On 11/23/2013 07:57 PM, Arnd Bergmann wrote: On Wednesday 06 November 2013, ivan.khoronzhuk wrote: @@ -7,6 +7,10 @@ Required properties: - reg : Should contain WDT registers location and length +- clocks: phandle reference to the controller clock. + Required only for Keystone arch. + See clock-bindings.txt + Optional properties: - timeout-sec: Contains the watchdog timeout in seconds I think it should really be listed under "Optional properties" and the reference to Keystone removed. Note how the binding would need to change otherwise if another platform started to use the clock, which is a little silly. Arnd Ok, I'll move clocks property under "Optional properties" and describe it as following: Optional properties: - timeout-sec : Contains the watchdog timeout in seconds - clocks: phandle reference to the controller clock. Needed if platform uses clocks. See clock-bindings.txt Nit: clocks aren't just phandles, they have a clock-specifier too (which might be 0 cells). Otherwise this looks fine to me. Mark. ... I will replace it to: Optional properties: - timeout-sec : Contains the watchdog timeout in seconds - clocks: the clock feeding the watchdog timer. Needed if platform uses clocks. See clock-bindings.txt Is it OK? -- Regards, Ivan Khoronzhuk -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] bq2415x_charger: add DT support
On Sun 2013-11-24 17:49:30, Sebastian Reichel wrote: > This adds DT support to the bq2415x driver. > > Signed-off-by: Sebastian Reichel Reviewed-by: Pavel Machek -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH 1/2] mfd: cros ec: spi: Use consistent function names
Both patches applied, thanks. -- Lee Jones Linaro STMicroelectronics Landing Team Lead Linaro.org │ Open source software for ARM SoCs
Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()
On 11/22/2013 08:02 AM, Yuanhan Liu wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
> Author: Boaz Harrosh
> Date: Thu Jul 19 15:22:37 2012 +0300
>
> RFC: do_xor_speed Broken on UML do to jiffies
>

Hi Sir Yuanhan.

I understand that you are running the exofs_ioctl branch of linux-open-osd.git. Please tell me more about why you chose to run this branch. It is an experimental pNFS+Ganesha+exofs branch that we are working on around here, and it might have problems.

Yes, this patch has problems, I know. I have it in my tree because I need it if I want to use the XOR engine with a UML system. If you do need to run the *exofs_ioctl* branch on your system, then it is best that you revert this patch.

Thanks for the report. I think I'll just remove that patch and run with it locally.

Cheers
Boaz

> Remember that hang I reported a while back on UML. Well
> I'm at it again, and it still hangs and I found why.
>
> I have dprinted jiffies and it never advances during the
> loop at do_xor_speed. Therefore it is stuck in an endless
> loop. I have also dprinted current_kernel_time() and it
> returns the same constant value as well.
>
> Note that it does usually work on UML, only during
> the modprobe of xor.ko while that test is running. It looks
> like some locking is preventing the clock from ticking.
>
> However ktime_get_ts does work for me so I changed the code
> as below, so I can work. See how I put several safety
> guards, to never get hangs again.
> And I think my time based approach is more accurate than
> the previous system.
>
> UML guys please investigate the jiffies issue? what is
> xor.ko not doing right?
>
> Signed-off-by: Boaz Harrosh
>
> +------------------------------------------------------------------+----+
> |                                                                  |    |
> +------------------------------------------------------------------+----+
> | boot_successes                                                   | 0  |
> | boot_failures                                                    | 29 |
> | WARNING:CPU:PID:at_init/main.c:do_one_initcall()                 | 29 |
> | initcall_calibrate_xor_blocks_returned_with_preemption_imbalance | 29 |
> +------------------------------------------------------------------+----+
>
> [0.127025]generic_sse: 148.363 MB/sec
> [0.127478] xor: using function: prefetch64-sse (152.727 MB/sec)
> [0.128017] ------------[ cut here ]------------
> [0.128531] WARNING: CPU: 0 PID: 1 at init/main.c:711
> do_one_initcall+0x105/0x115()
> [0.129018] initcall calibrate_xor_blocks+0x0/0x144 returned with
> preemption imbalance
> [0.130013] Modules linked in:
> [0.130357] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 3.12.0-11285-gb242bff #91
> [0.131013] 88000d0dde00 8161acc5
> 88000d0dde48
> [0.132554] 88000d0dde38 81052de9 81000316
> 81a77cfd
> [0.133380]
> 88000d0dde98
> [0.134213] Call Trace:
> [0.134493] [] dump_stack+0x4e/0x7a
> [0.135017] [] warn_slowpath_common+0x75/0x8e
> [0.135654] [] ? do_one_initcall+0x105/0x115
> [0.136015] [] ? do_xor_speed+0xdd/0xdd
> [0.137016] [] warn_slowpath_fmt+0x47/0x49
> [0.137628] [] ? free_pages+0x51/0x53
> [0.138015] [] ? do_xor_speed+0xdd/0xdd
> [0.138623] [] do_one_initcall+0x105/0x115
> [0.139017] [] kernel_init_freeable+0x115/0x19b
> [0.140016] [] ? do_early_param+0x88/0x88
> [0.140630] [] ? rest_init+0xbd/0xbd
> [0.141016] [] kernel_init+0x9/0xfa
> [0.141567] [] ret_from_fork+0x7c/0xb0
> [0.142016] [] ? rest_init+0xbd/0xbd
> [0.143028] ---[ end trace 19b4eab334350767 ]---
> [0.143530] atomic64 test passed for x86-64 platform with CX8 and with SSE
>
> git bisect start b242bff548c34510fd9b7f0e29b885263dfb8903
> 5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52 --
> git bisect good 5cbb3d216e2041700231bcfc383ee5f8b7fc8b74 # 09:25 20+
> 0 Merge branch 'akpm' (patches from Andrew Morton)
> git bisect good 7e1a1e9378018aeea2c7e8a3dd2ceb1db1523b0b # 09:42 20+
> 0 Merge tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfs
> git bisect good 4937e2a6f939a41bf811378e80d71f68aa0950c6 # 10:08 20+
> 0 Merge branch 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
> git bisect good 210e812f036736aeda097d9a6ef84b1f2b334bae # 10:31 20+
> 0 perf header: Fix bogus group name
> git bisect good d5bdaf4f68f0590fc481bca54bcaffeb27b75fca # 10:54 20+
> 0 Merge branch 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
> git bisect good e630a6bcf18079b2ab6b03d55c9757e8ef6656b6 # 11:03 20+
> 0 staging: lustre: fix
Re: [PATCH 1/3] power_supply: add power_supply_get_by_phandle
On Sun 2013-11-24 17:49:29, Sebastian Reichel wrote: > Add method to get power supply by device tree phandle. > > Signed-off-by: Sebastian Reichel Reviewed-by: Pavel Machek -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH] sysfs: handle duplicate removal attempts in sysfs_remove_group()
On Monday, November 25, 2013 10:29:00 AM James Bottomley wrote: > On Fri, 2013-11-22 at 11:02 -0500, Tejun Heo wrote: > > Hello, > > > > On Fri, Nov 22, 2013 at 08:43:55AM -0700, Bjorn Helgaas wrote: > > > > So, we do have cases where the parent is removed before the child. I > > > > suppose the parent pci bridge is removed already? AFAICS this > > > > shouldn't break anything but people did seem to expect the removals to > > > > be ordered from child to parent. Bjorn, is this something you expect > > > > to happened? > > > > > > I do not expect a PCI bridge to be removed before the devices below > > > it. We should be removing all the children before removing the parent > > > bridge. > > > > > > But is this related to PCI? I don't see the connection yet. I tried > > > > I'm not sure. It was from thunderbolt and nobody is reporting it on > > other interconnects, so it could be. > > > > > to look into this a bit (my notes are at > > > https://bugzilla.kernel.org/show_bug.cgi?id=65281), but I haven't > > > figured out the big-picture problem yet. > > > > > > I don't have warm fuzzies that adding a "have we already removed this" > > > check is the best resolution, but maybe that's just because I don't > > > understand the problem. > > > > Yeah, the whole thing is sorta pointless. Just issuing removal and > > continuing on should do, IMHO. > > I'd go for that as well. We have huge problems with the _del calls > because visibility is strict hierarchy and it's not always easy to work > out who's underneath us. > > It's going to be really annoying when refcounting works perfectly for > objects, so you can just do puts in any order, but you have to have > _del() called to remove subordinate objects before their parent. Well, that would be fine and dandy, but device_del() calls bus_remove_device() which in turn runs device_release_driver() and the order in which *these* things are done actually matters in general. 
So yes, after we have released all of the drivers in question, we can do _del() right before the final _put() in any order just fine. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Make the mtdblock read/write skip the bad nand sector
David, On Mon, Nov 25, 2013 at 12:16:10PM +, David Woodhouse wrote: > On Mon, 2013-11-25 at 08:52 -0300, Ezequiel Garcia wrote: > > > > Your understanding is correct: NAND *must* be erased explicitly in > > userspace > > before writing. However, keep in mind the following additional > > constraints: > > > > * Writing should be always performed using 'nandwrite', > > not tools such as 'cat' or 'dd'. > > > > * An mtdblock shouldn't be used to access directly the NAND from > > userspace. AFAICS, the primary usage of mtdblock is to be able to > > mount JFFS2. > > No. You don't need mtdblock to mount JFFS2 at all. > > The mtdblock driver was used in the *very* early days of the MTD system, > on NOR flash with a "traditional" file system. Either in read-only mode > for something like cramfs, or in a very unsafe writeable mode. We > actually put ext2 on it for the Compaq iPaq for a while, before we had > JFFS. > > It was used as a shortcut for mounting JFFS2, and still is by a lot of > people, but it's certainly not necessary. You can turn off CONFIG_BLOCK > entirely and still use JFFS2. > > You should consider mtdblock to be the most basic, primitive, "flash > translation layer" that can possibly exist. And thus, should basically > never use it. I certainly don't approve of trying to extend it. > Thanks a lot for the insight. After reading this, I'm wondering what's preventing us from killing MTD block support altogether. Artem already suggested it a while back... -- Ezequiel García, Free Electrons Embedded Linux, Kernel and Android Engineering http://free-electrons.com
Re: Preventing IPI sending races in arch code
On Mon, Nov 25, 2013 at 05:00:18PM +0530, Vineet Gupta wrote:
> While we are at it, I wanted to confirm another potential race
> (ARC/blackfin..)
> The IPI handler clears the interrupt before atomically-read-n-clear the msg
> word.
>
> do_IPI
>    plat_smp_ops.ipi_clear(irq);
>    while ((pending = xchg(&ipi_data->bits, 0)) != 0)
>       find_next_bit()
>          switch(next-msg)
>
> Depending on arch this could lead to an immediate IPI interrupt, and again
> ipi_data->bits could get out of sync with IPI senders.

I'm obviously lacking in platform knowledge here, what does that
ipi_clear() actually do? Tell the platform the interrupt has arrived and
it can stop asserting the line?

So sure, then someone can again assert the interrupt, but given we just
established a protocol for raising the thing; namely something like
this:

	void arch_send_ipi(int cpu, int type)
	{
		u32 *pending_ptr = per_cpu_ptr(ipi_bits, cpu);
		u32 new, old;

		do {
			new = old = *pending_ptr;
			new |= 1U << type;
		} while (cmpxchg(pending_ptr, old, new) != old);

		if (!old) /* only raise the actual IPI if we set the first bit */
			raise_ipi(cpu);
	}

Who would re-assert it if we have !0 pending?

Also, the above can be thought of as a memory ordering issue:

	STORE pending
	MB /* implied by cmpxchg */
	STORE ipi /* raise the actual thing */

In that case the other end must be:

	LOAD ipi
	MB /* implied by xchg */
	LOAD pending

Which is what your code seems to do.

> IMO the while loop is
> completely useless specially if IPIs are not coalesced in h/w.

Agreed, the while loop seems superfluous.

> And we need to move
> the xchg ahead of ACK'ing the IPI
>
> do_IPI
>    pending = xchg(&ipi_data->bits, 0);
>    plat_smp_ops.ipi_clear(irq);
>    while (ffs)
>       switch(next-msg)
>       ...
>
> Does that look sane to you.

This I'm not at all certain of; continuing with the memory order analogy
this would allow for the case where we see 0 pending, set a bit, try and
raise the interrupt but then do not because it's already asserted.

And since you just removed the while() loop, we'll be left with a !0
pending vector and nobody processing it.
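For readers following along, the raise/consume protocol discussed in this message can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: it uses C11 atomics (atomic_fetch_or() in place of the cmpxchg() loop, atomic_exchange() in place of xchg()), it models a single CPU's pending word, and the names ipi_pending, ipis_raised, send_ipi() and handle_ipi() are hypothetical. raise_ipi() here is just a counter standing in for the platform hook.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: one pending word and a count of raised IPIs. */
static _Atomic uint32_t ipi_pending;
static unsigned int ipis_raised;

/* Stand-in for the platform hook that asserts the interrupt line. */
static void raise_ipi(void)
{
	ipis_raised++;
}

/*
 * Sender: set the message bit, and raise the interrupt only when the
 * pending word went from 0 to !0. Returns true if the IPI was raised.
 */
static bool send_ipi(unsigned int type)
{
	uint32_t old = atomic_fetch_or(&ipi_pending, UINT32_C(1) << type);

	if (old == 0) {
		raise_ipi();
		return true;
	}
	return false;
}

/*
 * Receiver: take every pending bit in one atomic exchange, then
 * dispatch each message. Returns the mask of messages consumed.
 */
static uint32_t handle_ipi(void (*handler)(unsigned int type))
{
	uint32_t pending = atomic_exchange(&ipi_pending, 0);

	for (unsigned int type = 0; type < 32; type++) {
		if ((pending & (UINT32_C(1) << type)) && handler != NULL)
			handler(type);
	}
	return pending;
}
```

In this model a second send_ipi() before the handler runs does not raise again, which is why the order of acknowledging the interrupt and reading the pending word matters on real hardware.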
[PATCH 2/2] mfd: cros ec: i2c: Use consistent function names
Rename cros_ec_{probe,remove}_i2c() to cros_ec_i2c_{probe,remove}() for
consistency.

Signed-off-by: Thierry Reding
---
 drivers/mfd/cros_ec_i2c.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mfd/cros_ec_i2c.c b/drivers/mfd/cros_ec_i2c.c
index 123044608b63..4f71be99a183 100644
--- a/drivers/mfd/cros_ec_i2c.c
+++ b/drivers/mfd/cros_ec_i2c.c
@@ -120,7 +120,7 @@ static int cros_ec_command_xfer(struct cros_ec_device *ec_dev,
 	return ret;
 }
 
-static int cros_ec_probe_i2c(struct i2c_client *client,
+static int cros_ec_i2c_probe(struct i2c_client *client,
 			     const struct i2c_device_id *dev_id)
 {
 	struct device *dev = &client->dev;
@@ -150,7 +150,7 @@ static int cros_ec_probe_i2c(struct i2c_client *client,
 	return 0;
 }
 
-static int cros_ec_remove_i2c(struct i2c_client *client)
+static int cros_ec_i2c_remove(struct i2c_client *client)
 {
 	struct cros_ec_device *ec_dev = i2c_get_clientdata(client);
 
@@ -190,8 +190,8 @@ static struct i2c_driver cros_ec_driver = {
 		.owner	= THIS_MODULE,
 		.pm	= &cros_ec_i2c_pm_ops,
 	},
-	.probe		= cros_ec_probe_i2c,
-	.remove		= cros_ec_remove_i2c,
+	.probe		= cros_ec_i2c_probe,
+	.remove		= cros_ec_i2c_remove,
 	.id_table	= cros_ec_i2c_id,
 };
 
--
1.8.4.2
[PATCH 1/2] mfd: cros ec: spi: Use consistent function names
Rename cros_ec_{probe,remove}_spi() to cros_ec_spi_{probe,remove}() for
consistency.

Signed-off-by: Thierry Reding
---
 drivers/mfd/cros_ec_spi.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 27be73523b9c..5658ec48838f 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -284,7 +284,7 @@ static int cros_ec_command_spi_xfer(struct cros_ec_device *ec_dev,
 	return 0;
 }
 
-static int cros_ec_probe_spi(struct spi_device *spi)
+static int cros_ec_spi_probe(struct spi_device *spi)
 {
 	struct device *dev = &spi->dev;
 	struct cros_ec_device *ec_dev;
@@ -326,7 +326,7 @@ static int cros_ec_probe_spi(struct spi_device *spi)
 	return 0;
 }
 
-static int cros_ec_remove_spi(struct spi_device *spi)
+static int cros_ec_spi_remove(struct spi_device *spi)
 {
 	struct cros_ec_device *ec_dev;
 
@@ -367,8 +367,8 @@ static struct spi_driver cros_ec_driver_spi = {
 		.owner	= THIS_MODULE,
 		.pm	= &cros_ec_spi_pm_ops,
 	},
-	.probe		= cros_ec_probe_spi,
-	.remove		= cros_ec_remove_spi,
+	.probe		= cros_ec_spi_probe,
+	.remove		= cros_ec_spi_remove,
 	.id_table	= cros_ec_spi_id,
 };
 
--
1.8.4.2
Re: [PATCH] Make the mtdblock read/write skip the bad nand sector
On Mon, 2013-11-25 at 08:52 -0300, Ezequiel Garcia wrote:
>
> Your understanding is correct: NAND *must* be erased explicitly in
> userspace before writing. However, keep in mind the following
> additional constraints:
>
> * Writing should always be performed using 'nandwrite',
>   not tools such as 'cat' or 'dd'.
>
> * An mtdblock shouldn't be used to access the NAND directly from
>   userspace. AFAICS, the primary usage of mtdblock is to be able to
>   mount JFFS2.

No. You don't need mtdblock to mount JFFS2 at all.

The mtdblock driver was used in the *very* early days of the MTD system,
on NOR flash with a "traditional" file system. Either in read-only mode
for something like cramfs, or in a very unsafe writeable mode. We
actually put ext2 on it for the Compaq iPaq for a while, before we had
JFFS.

It was used as a shortcut for mounting JFFS2, and still is by a lot of
people, but it's certainly not necessary. You can turn off CONFIG_BLOCK
entirely and still use JFFS2.

You should consider mtdblock to be the most basic, primitive, "flash
translation layer" that can possibly exist. And thus, you should
basically never use it. I certainly don't approve of trying to extend it.

--
dwmw2
Re: Fwd: [PATCH 7/8] watchdog: davinci: add "clocks" property
On Mon, Nov 25, 2013 at 10:59:45AM +, ivan.khoronzhuk wrote:
> On 11/23/2013 07:57 PM, Arnd Bergmann wrote:
> > On Wednesday 06 November 2013, ivan.khoronzhuk wrote:
> >> @@ -7,6 +7,10 @@ Required properties:
> >>
> >>  - reg : Should contain WDT registers location and length
> >>
> >> +- clocks: phandle reference to the controller clock.
> >> +  Required only for Keystone arch.
> >> +  See clock-bindings.txt
> >> +
> >>  Optional properties:
> >>
> >>  - timeout-sec: Contains the watchdog timeout in seconds
> >
> > I think it should really be listed under "Optional properties" and the
> > reference to Keystone removed. Note how the binding would need
> > to change otherwise if another platform started to use the clock, which
> > is a little silly.
> >
> > 	Arnd
> >
>
> Ok, I'll move clocks property under "Optional properties" and describe it
> as following:
>
> Optional properties:
> - timeout-sec : Contains the watchdog timeout in seconds
> - clocks: phandle reference to the controller clock.
>   Needed if platform uses clocks.
>   See clock-bindings.txt

Nit: clocks aren't just phandles, they have a clock-specifier too (which
might be 0 cells).

Otherwise this looks fine to me.

Mark.
Re: Reply: Re: [PATCH 1/1] workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY
On 11/25, zhang.y...@zte.com.cn wrote:
>
> hte...@gmail.com wrote on 2013/11/23 07:13:43:
>
> > Re: [PATCH 1/1] workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY
> >
> > On Thu, Nov 14, 2013 at 6:56 AM, Oleg Nesterov wrote:
> > > Move the setting of PF_NO_SETAFFINITY up before set_cpus_allowed()
> > > in create_worker(). Otherwise userland can change ->cpus_allowed
> > > in between.
> > >
> > > Signed-off-by: Oleg Nesterov
> >
> > Applied to wq/for-3.13-fixes.
> >
> > Thanks!
> >
> > --
> > tejun
>
> How about the first patch?

Let me quote myself:

	Looks like Zhang is right...

	But I'd suggest to change flush_old_exec() instead (see
	"current->flags &= ..."). Do you agree? If yes, could you make v2?
	flush_old_exec() already clears the unwanted PF_ bits, I do not
	think we should add another place.

Oleg.
[PATCH v11 00/15] kmemcg shrinkers
This patchset implements targeted shrinking for memcg when kmem limits
are present. So far, we've been accounting kernel objects but failing
allocations when short of memory. This is because our only option would
be to call the global shrinker, depleting objects from all caches and
breaking isolation.

The main idea is to associate per-memcg lists with each of the LRUs. The
main LRU still provides a single entry point and when adding or removing
an element from the LRU, we use the page information to figure out which
memcg it belongs to and relay it to the right list.

The bulk of the code is written by Glauber Costa. The only change I
introduced myself in this iteration is reworking per-memcg LRU lists.
Instead of extending the existing list_lru structure, which seems to be
neat as is, I introduced a new one, memcg_list_lru, which aggregates
list_lru objects for each kmem-active memcg and keeps them up to date as
memcgs are created/destroyed. I hope this simplified call paths and made
the code easier to read.

The patchset is based on top of Linux 3.13-rc1.

Any comments and proposals are appreciated.

== Known issues ==

 * If the kmem limit is less than the sum mem limit, reaching the memcg
   kmem limit results in an attempt to shrink all memcg slabs (see
   try_to_free_mem_cgroup_kmem()). Although this is better than simply
   failing the allocation, as happens now, it still needs to be improved.

 * Since FS shrinkers can't be executed on allocations without __GFP_FS,
   such allocations will fail if the memcg kmem limit is less than the
   sum mem limit and the memcg kmem usage is close to its limit. Glauber
   proposed to schedule a worker which would shrink kmem in the
   background on such allocations. However, this approach does not
   eliminate failures completely, it just makes them rarer. I'm thinking
   of implementing soft limits for memcg kmem so that hitting the soft
   limit will trigger the reclaimer, but won't fail the allocation. I
   would appreciate any other proposals on how this can be fixed.

 * Only dcache and icache are reclaimed on memcg pressure. Other FS
   objects are left for global pressure only. However, it should not be
   a serious problem to make them reclaimable too by passing the memcg
   on to the FS layer and letting each FS decide if its internal objects
   are shrinkable on memcg pressure.

== Changes from v10 ==

 * Rework per-memcg list_lru infrastructure.

Previous iteration (with full changelog) can be found here:

http://www.spinics.net/lists/linux-fsdevel/msg66632.html

Glauber Costa (12):
  memcg: make cache index determination more robust
  memcg: consolidate callers of memcg_cache_id
  vmscan: also shrink slab in memcg pressure
  memcg: move initialization to memcg creation
  memcg: move stop and resume accounting functions
  memcg: per-memcg kmem shrinking
  memcg: scan cache objects hierarchically
  vmscan: take at least one pass with shrinkers
  memcg: allow kmem limit to be resized down
  vmpressure: in-kernel notifications
  memcg: reap dead memcgs upon global memory pressure
  memcg: flush memcg items upon memcg destruction

Vladimir Davydov (3):
  memcg,list_lru: add per-memcg LRU list infrastructure
  memcg,list_lru: add function walking over all lists of a per-memcg LRU
  super: make icache, dcache shrinkers memcg-aware

 fs/dcache.c                |   25 +-
 fs/inode.c                 |   16 +-
 fs/internal.h              |    9 +-
 fs/super.c                 |   45 +--
 include/linux/fs.h         |    4 +-
 include/linux/list_lru.h   |   77 +
 include/linux/memcontrol.h |   23 ++
 include/linux/shrinker.h   |    6 +-
 include/linux/swap.h       |    2 +
 include/linux/vmpressure.h |    5 +
 mm/memcontrol.c            |  709 +++-
 mm/vmpressure.c            |   53 +++-
 mm/vmscan.c                |  178 +--
 13 files changed, 1018 insertions(+), 134 deletions(-)

--
1.7.10.4
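As a rough illustration of the "relay it to the right list" idea in the cover letter, the selection step might look like the sketch below. It is not the patchset's code: struct item_list, struct grouped_lru, MAX_GROUPS and both helpers are hypothetical names, locking and RCU are left out, and falling back to the global list when a group has no list of its own is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 4	/* hypothetical fixed upper bound on group ids */

/* Minimal stand-in for a list_lru: only counts its items. */
struct item_list {
	unsigned long nr_items;
};

/* One global list plus an optional list per group (NULL if inactive). */
struct grouped_lru {
	struct item_list global;
	struct item_list *per_group[MAX_GROUPS];
};

/*
 * Pick the list an object belongs on. Objects whose group id is out of
 * range, or whose group has no list, go to the global list (assumption).
 */
static struct item_list *grouped_lru_select(struct grouped_lru *lru,
					    int group_id)
{
	if (group_id < 0 || group_id >= MAX_GROUPS ||
	    lru->per_group[group_id] == NULL)
		return &lru->global;
	return lru->per_group[group_id];
}

/* Single entry point: callers add to the LRU, routing happens inside. */
static void grouped_lru_add(struct grouped_lru *lru, int group_id)
{
	grouped_lru_select(lru, group_id)->nr_items++;
}
```

Keeping one entry point means callers do not need to know whether per-group lists exist at all, which matches the stated goal of leaving the plain list_lru interface untouched.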
[PATCH v11 04/15] memcg: move initialization to memcg creation
From: Glauber Costa

Those structures are only used for memcgs that are effectively using
kmemcg. However, in a later patch I intend to scan that list
unconditionally (list empty meaning no kmem caches present), which
simplifies the code a lot.

So move the initialization to early kmem creation.

Signed-off-by: Glauber Costa
Cc: Dave Chinner
Cc: Mel Gorman
Cc: Rik van Riel
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Hugh Dickins
Cc: Kamezawa Hiroyuki
Cc: Andrew Morton
---
 mm/memcontrol.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8924ff1..9ba9975 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3136,8 +3136,6 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
 	}
 
 	memcg->kmemcg_id = num;
-	INIT_LIST_HEAD(&memcg->memcg_slab_caches);
-	mutex_init(&memcg->slab_caches_mutex);
 	return 0;
 }
 
@@ -5923,6 +5921,8 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
 	int ret;
 
+	INIT_LIST_HEAD(&memcg->memcg_slab_caches);
+	mutex_init(&memcg->slab_caches_mutex);
 	memcg->kmemcg_id = -1;
 	ret = memcg_propagate_kmem(memcg);
 	if (ret)
--
1.7.10.4
[PATCH v11 03/15] vmscan: also shrink slab in memcg pressure
From: Glauber Costa Without the surrounding infrastructure, this patch is a bit of a hammer: it will basically shrink objects from all memcgs under memcg pressure. At least, however, we will keep the scan limited to the shrinkers marked as per-memcg. Future patches will implement the in-shrinker logic to filter objects based on its memcg association. Signed-off-by: Glauber Costa Cc: Dave Chinner Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michal Hocko Cc: Hugh Dickins Cc: Kamezawa Hiroyuki Cc: Andrew Morton --- include/linux/memcontrol.h | 17 +++ include/linux/shrinker.h |6 +- mm/memcontrol.c| 16 +- mm/vmscan.c| 50 +++- 4 files changed, 82 insertions(+), 7 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index b3e7a66..d16ba51 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -231,6 +231,9 @@ void mem_cgroup_split_huge_fixup(struct page *head); bool mem_cgroup_bad_page_check(struct page *page); void mem_cgroup_print_bad_page(struct page *page); #endif + +unsigned long +memcg_zone_reclaimable_pages(struct mem_cgroup *memcg, struct zone *zone); #else /* CONFIG_MEMCG */ struct mem_cgroup; @@ -427,6 +430,12 @@ static inline void mem_cgroup_replace_page_cache(struct page *oldpage, struct page *newpage) { } + +static inline unsigned long +memcg_zone_reclaimable_pages(struct mem_cgroup *memcg, struct zone *zone) +{ + return 0; +} #endif /* CONFIG_MEMCG */ #if !defined(CONFIG_MEMCG) || !defined(CONFIG_DEBUG_VM) @@ -479,6 +488,8 @@ static inline bool memcg_kmem_enabled(void) return static_key_false(_kmem_enabled_key); } +bool memcg_kmem_is_active(struct mem_cgroup *memcg); + /* * In general, we'll do everything in our power to not incur in any overhead * for non-memcg users for the kmem functions. 
Not even a function call, if we @@ -612,6 +623,12 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp) return __memcg_kmem_get_cache(cachep, gfp); } #else + +static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg) +{ + return false; +} + #define for_each_memcg_cache_index(_idx) \ for (; NULL; ) diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 68c0970..7d462b1 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -22,6 +22,9 @@ struct shrink_control { nodemask_t nodes_to_scan; /* current node being shrunk (for NUMA aware shrinkers) */ int nid; + + /* reclaim from this memcg only (if not NULL) */ + struct mem_cgroup *target_mem_cgroup; }; #define SHRINK_STOP (~0UL) @@ -63,7 +66,8 @@ struct shrinker { #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */ /* Flags */ -#define SHRINKER_NUMA_AWARE (1 << 0) +#define SHRINKER_NUMA_AWARE(1 << 0) +#define SHRINKER_MEMCG_AWARE (1 << 1) extern int register_shrinker(struct shrinker *); extern void unregister_shrinker(struct shrinker *); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 144cb4c..8924ff1 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -358,7 +358,7 @@ static inline void memcg_kmem_set_active(struct mem_cgroup *memcg) set_bit(KMEM_ACCOUNTED_ACTIVE, >kmem_account_flags); } -static bool memcg_kmem_is_active(struct mem_cgroup *memcg) +bool memcg_kmem_is_active(struct mem_cgroup *memcg) { return test_bit(KMEM_ACCOUNTED_ACTIVE, >kmem_account_flags); } @@ -958,6 +958,20 @@ mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *memcg, int nid, int zid, return ret; } +unsigned long +memcg_zone_reclaimable_pages(struct mem_cgroup *memcg, struct zone *zone) +{ + int nid = zone_to_nid(zone); + int zid = zone_idx(zone); + unsigned long val; + + val = mem_cgroup_zone_nr_lru_pages(memcg, nid, zid, LRU_ALL_FILE); + if (do_swap_account) + val += mem_cgroup_zone_nr_lru_pages(memcg, nid, zid, + LRU_ALL_ANON); + return val; +} + static unsigned long 
mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg, int nid, unsigned int lru_mask) diff --git a/mm/vmscan.c b/mm/vmscan.c index eea668d..652dfa3 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -140,11 +140,41 @@ static bool global_reclaim(struct scan_control *sc) { return !sc->target_mem_cgroup; } + +/* + * kmem reclaim should usually not be triggered when we are doing targetted + * reclaim. It is only valid when global reclaim is triggered, or when the + * underlying memcg has kmem objects. + */ +static bool has_kmem_reclaim(struct scan_control *sc) +{ + return !sc->target_mem_cgroup || + memcg_kmem_is_active(sc->target_mem_cgroup); +} + +static unsigned long +zone_nr_reclaimable_pages(struct scan_control *sc, struct zone *zone) +{
[PATCH v11 07/15] memcg: scan cache objects hierarchically
From: Glauber Costa When reaching shrink_slab, we should descent in children memcg searching for objects that could be shrunk. This is true even if the memcg does not have kmem limits on, since the kmem res_counter will also be billed against the user res_counter of the parent. It is possible that we will free objects and not free any pages, that will just harm the child groups without helping the parent group at all. But at this point, we basically are prepared to pay the price. Signed-off-by: Glauber Costa Cc: Dave Chinner Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michal Hocko Cc: Hugh Dickins Cc: Kamezawa Hiroyuki Cc: Andrew Morton --- include/linux/memcontrol.h |6 mm/memcontrol.c| 13 + mm/vmscan.c| 65 3 files changed, 73 insertions(+), 11 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index d16ba51..a513fad 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -488,6 +488,7 @@ static inline bool memcg_kmem_enabled(void) return static_key_false(_kmem_enabled_key); } +bool memcg_kmem_should_reclaim(struct mem_cgroup *memcg); bool memcg_kmem_is_active(struct mem_cgroup *memcg); /* @@ -624,6 +625,11 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp) } #else +static inline bool memcg_kmem_should_reclaim(struct mem_cgroup *memcg) +{ + return false; +} + static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg) { return false; diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 9be1e8b..f5d7128 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2995,6 +2995,19 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg, } #ifdef CONFIG_MEMCG_KMEM +bool memcg_kmem_should_reclaim(struct mem_cgroup *memcg) +{ + struct mem_cgroup *iter; + + for_each_mem_cgroup_tree(iter, memcg) { + if (memcg_kmem_is_active(iter)) { + mem_cgroup_iter_break(memcg, iter); + return true; + } + } + return false; +} + static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg) { return 
!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg) && diff --git a/mm/vmscan.c b/mm/vmscan.c index cdfc364..36fc133 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -149,7 +149,7 @@ static bool global_reclaim(struct scan_control *sc) static bool has_kmem_reclaim(struct scan_control *sc) { return !sc->target_mem_cgroup || - memcg_kmem_is_active(sc->target_mem_cgroup); + memcg_kmem_should_reclaim(sc->target_mem_cgroup); } static unsigned long @@ -360,12 +360,35 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker, * * Returns the number of slab objects which we shrunk. */ +static unsigned long +shrink_slab_one(struct shrink_control *shrinkctl, struct shrinker *shrinker, + unsigned long nr_pages_scanned, unsigned long lru_pages) +{ + unsigned long freed = 0; + + for_each_node_mask(shrinkctl->nid, shrinkctl->nodes_to_scan) { + if (!node_online(shrinkctl->nid)) + continue; + + if (!(shrinker->flags & SHRINKER_NUMA_AWARE) && + (shrinkctl->nid != 0)) + break; + + freed += shrink_slab_node(shrinkctl, shrinker, +nr_pages_scanned, lru_pages); + + } + + return freed; +} + unsigned long shrink_slab(struct shrink_control *shrinkctl, unsigned long nr_pages_scanned, unsigned long lru_pages) { struct shrinker *shrinker; unsigned long freed = 0; + struct mem_cgroup *root = shrinkctl->target_mem_cgroup; if (nr_pages_scanned == 0) nr_pages_scanned = SWAP_CLUSTER_MAX; @@ -390,19 +413,39 @@ unsigned long shrink_slab(struct shrink_control *shrinkctl, if (shrinkctl->target_mem_cgroup && !(shrinker->flags & SHRINKER_MEMCG_AWARE)) continue; + /* +* In a hierarchical chain, it might be that not all memcgs are +* kmem active. kmemcg design mandates that when one memcg is +* active, its children will be active as well. But it is +* perfectly possible that its parent is not. +* +* We also need to make sure we scan at least once, for the +* global case. So if we don't have a target memcg (saved in +* root), we proceed normally and expect to break in the next +* round. 
+*/ + do { + struct mem_cgroup *memcg = shrinkctl->target_mem_cgroup; - for_each_node_mask(shrinkctl->nid, shrinkctl->nodes_to_scan) { -
[PATCH v11 08/15] vmscan: take at least one pass with shrinkers
From: Glauber Costa

In very low free kernel memory situations, it may be the case that we
have fewer objects to free than our initial batch size. If this is the
case, it is better to shrink those, and open space for the new workload,
than to keep them and fail the new allocations.

In particular, we are concerned with the direct reclaim case for memcg.
Although this same technique can be applied to other situations just as
well, we will start conservative and apply it for that case, which is
the one that matters the most.

Signed-off-by: Glauber Costa
CC: Dave Chinner
CC: Carlos Maiolino
CC: "Theodore Ts'o"
CC: Al Viro
---
 mm/vmscan.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 36fc133..bfedcdc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -311,20 +311,33 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 				nr_pages_scanned, lru_pages,
 				max_pass, delta, total_scan);
 
-	while (total_scan >= batch_size) {
+	do {
 		unsigned long ret;
+		unsigned long nr_to_scan = min(batch_size, total_scan);
+		struct mem_cgroup *memcg = shrinkctl->target_mem_cgroup;
+
+		/*
+		 * Differentiate between "few objects" and "no objects"
+		 * as returned by the count step.
+		 */
+		if (!total_scan)
+			break;
+
+		if ((total_scan < batch_size) &&
+		   !(memcg && memcg_kmem_is_active(memcg)))
+			break;
 
-		shrinkctl->nr_to_scan = batch_size;
+		shrinkctl->nr_to_scan = nr_to_scan;
 		ret = shrinker->scan_objects(shrinker, shrinkctl);
 		if (ret == SHRINK_STOP)
 			break;
 		freed += ret;
 
-		count_vm_events(SLABS_SCANNED, batch_size);
-		total_scan -= batch_size;
+		count_vm_events(SLABS_SCANNED, nr_to_scan);
+		total_scan -= nr_to_scan;
 
 		cond_resched();
-	}
+	} while (total_scan >= batch_size);
 
 	/*
 	 * move the unused scan count back into the shrinker in a
--
1.7.10.4
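The control flow of the reworked loop can be tried out in isolation with a small user-space model such as the one below. It is a sketch only: scan_batches() and its allow_partial parameter are hypothetical, with allow_partial standing in for the "memcg && memcg_kmem_is_active(memcg)" test, and the shrinker callback is replaced by simple counting.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the do/while scan loop: returns how many objects would be
 * handed to the scan step. batch_size is assumed to be non-zero.
 */
static unsigned long scan_batches(unsigned long total_scan,
				  unsigned long batch_size,
				  bool allow_partial)
{
	unsigned long scanned = 0;

	do {
		unsigned long nr_to_scan =
			total_scan < batch_size ? total_scan : batch_size;

		/* "no objects": nothing to do at all */
		if (total_scan == 0)
			break;

		/* "few objects": only scanned when a partial pass is allowed */
		if (total_scan < batch_size && !allow_partial)
			break;

		scanned += nr_to_scan;
		total_scan -= nr_to_scan;
	} while (total_scan >= batch_size);

	return scanned;
}
```

Note that in this model the partial pass can only happen on the first iteration; once at least one full batch has been scanned, any remainder smaller than batch_size is left over, exactly as with the original while loop.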
[PATCH v11 06/15] memcg: per-memcg kmem shrinking
From: Glauber Costa If the kernel limit is smaller than the user limit, we will have situations in which our allocations fail but freeing user pages will buy us nothing. In those, we would like to call a specialized memcg reclaimer that only frees kernel memory and leave the user memory alone. Those are also expected to fail when we account memcg->kmem, instead of when we account memcg->res. Based on that, this patch implements a memcg-specific reclaimer, that only shrinks kernel objects, withouth touching user pages. There might be situations in which there are plenty of objects to shrink, but we can't do it because the __GFP_FS flag is not set. Although they can happen with user pages, they are a lot more common with fs-metadata: this is the case with almost all inode allocation. For those cases, the best we can do is to spawn a worker and fail the current allocation. Signed-off-by: Glauber Costa Cc: Dave Chinner Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michal Hocko Cc: Hugh Dickins Cc: Kamezawa Hiroyuki Cc: Andrew Morton --- include/linux/swap.h |2 + mm/memcontrol.c | 118 +++--- mm/vmscan.c | 44 ++- 3 files changed, 157 insertions(+), 7 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index 46ba0c6..367a773 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -309,6 +309,8 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order, extern int __isolate_lru_page(struct page *page, isolate_mode_t mode); extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem, gfp_t gfp_mask, bool noswap); +extern unsigned long try_to_free_mem_cgroup_kmem(struct mem_cgroup *mem, +gfp_t gfp_mask); extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem, gfp_t gfp_mask, bool noswap, struct zone *zone, diff --git a/mm/memcontrol.c b/mm/memcontrol.c index e9bdcf3..9be1e8b 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -330,6 +330,8 @@ struct mem_cgroup { atomic_tnumainfo_events; 
atomic_tnumainfo_updating; #endif + /* when kmem shrinkers can sleep but can't proceed due to context */ + struct work_struct kmemcg_shrink_work; struct mem_cgroup_per_node *nodeinfo[0]; /* WARNING: nodeinfo must be the last member here */ @@ -341,11 +343,14 @@ static size_t memcg_size(void) nr_node_ids * sizeof(struct mem_cgroup_per_node); } +static DEFINE_MUTEX(set_limit_mutex); + /* internal only representation about the status of kmem accounting. */ enum { KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */ KMEM_ACCOUNTED_ACTIVATED, /* static key enabled. */ KMEM_ACCOUNTED_DEAD, /* dead memcg with pending kmem charges */ + KMEM_MAY_SHRINK, /* kmem limit < mem limit, shrink kmem only */ }; /* We account when limit is on, but only after call sites are patched */ @@ -389,6 +394,31 @@ static bool memcg_kmem_test_and_clear_dead(struct mem_cgroup *memcg) return test_and_clear_bit(KMEM_ACCOUNTED_DEAD, >kmem_account_flags); } + +/* + * If the kernel limit is smaller than the user limit, we will have situations + * in which our allocations fail but freeing user pages will buy us nothing. + * In those, we would like to call a specialized memcg reclaimer that only + * frees kernel memory and leaves the user memory alone. + * + * This test exists so we can differentiate between those. Every time one of the + * limits is updated, we need to run it. The set_limit_mutex must be held, so + * they don't change again. + */ +static void memcg_update_shrink_status(struct mem_cgroup *memcg) +{ + mutex_lock(_limit_mutex); + if (res_counter_read_u64(>kmem, RES_LIMIT) < + res_counter_read_u64(>res, RES_LIMIT)) + set_bit(KMEM_MAY_SHRINK, >kmem_account_flags); + else + clear_bit(KMEM_MAY_SHRINK, >kmem_account_flags); + mutex_unlock(_limit_mutex); +} +#else +static void memcg_update_shrink_status(struct mem_cgroup *memcg) +{ +} #endif /* Stuffs for move charges at task migration. 
*/ @@ -2964,8 +2994,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg, memcg_check_events(memcg, page); } -static DEFINE_MUTEX(set_limit_mutex); - #ifdef CONFIG_MEMCG_KMEM static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg) { @@ -3062,15 +3090,54 @@ static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css, } #endif +static int memcg_try_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size) +{ + int retries = MEM_CGROUP_RECLAIM_RETRIES; + struct res_counter *fail_res; +
[PATCH v11 10/15] memcg,list_lru: add function walking over all lists of a per-memcg LRU
Sometimes it can be necessary to iterate over all memcgs' lists of the same memcg-aware LRU. For example shrink_dcache_sb() should prune all dentries no matter what memory cgroup they belong to. Current interface to struct memcg_list_lru, however, only allows per-memcg LRU walks. This patch adds the special method memcg_list_lru_walk_all() which provides the required functionality. Note that this function does not guarantee that all the elements will be processed in the true least-recently-used order, in fact it simply enumerates all kmem-active memcgs and for each of them calls list_lru_walk(), but shrink_dcache_sb(), which is going to be the only user of this function, does not need it. Signed-off-by: Vladimir Davydov Cc: Glauber Costa Cc: Dave Chinner Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michal Hocko Cc: Hugh Dickins Cc: Kamezawa Hiroyuki Cc: Andrew Morton --- include/linux/list_lru.h | 21 ++ mm/memcontrol.c | 55 ++ 2 files changed, 76 insertions(+) diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h index b3b3b86..ce815cc 100644 --- a/include/linux/list_lru.h +++ b/include/linux/list_lru.h @@ -40,6 +40,16 @@ struct memcg_list_lru { struct list_lru **memcg_lrus; /* rcu-protected array of per-memcg lrus, indexed by memcg_cache_id() */ + /* +* When a memory cgroup is removed, all pointers to its list_lru +* objects stored in memcg_lrus arrays are first marked as dead by +* setting the lowest bit of the address while the actual data free +* happens only after an rcu grace period. If a memcg_lrus reader, +* which should be rcu-protected, faces a dead pointer, it won't +* dereference it. This ensures there will be no use-after-free. 
+*/ +#define MEMCG_LIST_LRU_DEAD1 + struct list_head list; /* list of all memcg-aware lrus */ /* @@ -160,6 +170,10 @@ struct list_lru * mem_cgroup_list_lru(struct memcg_list_lru *lru, struct mem_cgroup *memcg); struct list_lru * mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr); + +unsigned long +memcg_list_lru_walk_all(struct memcg_list_lru *lru, list_lru_walk_cb isolate, + void *cb_arg, unsigned long nr_to_walk); #else static inline int memcg_list_lru_init(struct memcg_list_lru *lru) { @@ -182,6 +196,13 @@ mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr) { return >global_lru; } + +static inline unsigned long +memcg_list_lru_walk_all(struct memcg_list_lru *lru, list_lru_walk_cb isolate, + void *cb_arg, unsigned long nr_to_walk) +{ + return list_lru_walk(>global_lru, isolate, cb_arg, nr_to_walk); +} #endif /* CONFIG_MEMCG_KMEM */ #endif /* _LRU_LIST_H */ diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 84f1ca3..7b4f420 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3915,16 +3915,30 @@ static int alloc_memcg_lru(struct memcg_list_lru *lru, int memcg_id) return err; } + smp_wmb(); VM_BUG_ON(lru->memcg_lrus[memcg_id]); lru->memcg_lrus[memcg_id] = memcg_lru; return 0; } +static void memcg_lru_mark_dead(struct memcg_list_lru *lru, int memcg_id) +{ + struct list_lru *memcg_lru; + + BUG_ON(!lru->memcg_lrus); + memcg_lru = lru->memcg_lrus[memcg_id]; + if (memcg_lru) + lru->memcg_lrus[memcg_id] = (void *)((unsigned long)memcg_lru | +MEMCG_LIST_LRU_DEAD); +} + static void free_memcg_lru(struct memcg_list_lru *lru, int memcg_id) { struct list_lru *memcg_lru = NULL; swap(lru->memcg_lrus[memcg_id], memcg_lru); + memcg_lru = (void *)((unsigned long)memcg_lru & +~MEMCG_LIST_LRU_DEAD); if (memcg_lru) { list_lru_destroy(memcg_lru); kfree(memcg_lru); @@ -3958,6 +3972,17 @@ static void __memcg_destroy_all_lrus(int memcg_id) { struct memcg_list_lru *lru; + /* +* Mark all lru lists of this memcg as dead and free them only after a +* grace 
period. This is to prevent functions iterating over memcg_lrus +* arrays (e.g. memcg_list_lru_walk_all()) from dereferencing pointers +* pointing to already freed data. +*/ + list_for_each_entry(lru, _lrus_list, list) + memcg_lru_mark_dead(lru, memcg_id); + + synchronize_rcu(); + list_for_each_entry(lru, _lrus_list, list) free_memcg_lru(lru, memcg_id); } @@ -4124,6 +4149,36 @@ mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr) } return mem_cgroup_list_lru(lru, memcg); } + +unsigned long +memcg_list_lru_walk_all(struct memcg_list_lru *lru, list_lru_walk_cb isolate, + void *cb_arg,
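The "mark dead by setting the lowest bit" scheme described in this patch can be illustrated outside the kernel roughly as follows. This is a sketch under stated assumptions, not the patch's implementation: struct lru and the lru_*() helpers are hypothetical, the RCU grace period is not modelled, and it relies on the pointed-to objects being at least 2-byte aligned so that bit 0 of a valid pointer is always clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LRU_DEAD ((uintptr_t)1)	/* tag kept in the lowest address bit */

/* Hypothetical stand-in for a per-memcg list_lru. */
struct lru {
	int nr_items;
};

/* Writer, step 1: tag the slot's pointer as dead without freeing it. */
static struct lru *lru_mark_dead(struct lru *p)
{
	return p ? (struct lru *)((uintptr_t)p | LRU_DEAD) : NULL;
}

static bool lru_is_dead(const struct lru *p)
{
	return ((uintptr_t)p & LRU_DEAD) != 0;
}

/* Writer, step 2 (after the grace period): recover the real address. */
static struct lru *lru_strip_dead(struct lru *p)
{
	return (struct lru *)((uintptr_t)p & ~LRU_DEAD);
}

/* Reader: never hand out an empty or dead slot for dereferencing. */
static struct lru *lru_slot_get(struct lru *const *slots, size_t idx)
{
	struct lru *p = slots[idx];

	if (p == NULL || lru_is_dead(p))
		return NULL;
	return p;
}
```

The point of the two-step teardown is that a reader which sees the tagged value skips the slot, while a reader which fetched the untagged pointer earlier is still covered by the grace period before the memory is freed.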
[PATCH v11 09/15] memcg,list_lru: add per-memcg LRU list infrastructure
FS-shrinkers, which shrink dcaches and icaches, keep dentries and inodes in list_lru structures in order to evict least recently used objects. With per-memcg kmem shrinking infrastructure introduced, we have to make those LRU lists per-memcg in order to allow shrinking FS caches that belong to different memory cgroups independently. This patch addresses the issue by introducing struct memcg_list_lru. This struct aggregates list_lru objects for each kmem-active memcg, and keeps it uptodate whenever a memcg is created or destroyed. Its interface is very simple: it only allows to get the pointer to the appropriate list_lru object from a memcg or a kmem ptr, which should be further operated with conventional list_lru methods. Signed-off-by: Vladimir Davydov Cc: Glauber Costa Cc: Dave Chinner Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michal Hocko Cc: Hugh Dickins Cc: Kamezawa Hiroyuki Cc: Andrew Morton --- include/linux/list_lru.h | 56 ++ mm/memcontrol.c | 256 -- 2 files changed, 306 insertions(+), 6 deletions(-) diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h index 3ce5417..b3b3b86 100644 --- a/include/linux/list_lru.h +++ b/include/linux/list_lru.h @@ -10,6 +10,8 @@ #include #include +struct mem_cgroup; + /* list_lru_walk_cb has to always return one of those */ enum lru_status { LRU_REMOVED,/* item removed from list */ @@ -31,6 +33,27 @@ struct list_lru { nodemask_t active_nodes; }; +struct memcg_list_lru { + struct list_lru global_lru; + +#ifdef CONFIG_MEMCG_KMEM + struct list_lru **memcg_lrus; /* rcu-protected array of per-memcg + lrus, indexed by memcg_cache_id() */ + + struct list_head list; /* list of all memcg-aware lrus */ + + /* +* The memcg_lrus array is rcu protected, so we can only free it after +* a call to synchronize_rcu(). 
To avoid multiple calls to +* synchronize_rcu() when many lrus get updated at the same time, which +* is a typical scenario, we will store the pointer to the previous +* version of the array in the old_lrus variable for each lru, and then +* free them all at once after a single call to synchronize_rcu(). +*/ + void *old_lrus; +#endif +}; + void list_lru_destroy(struct list_lru *lru); int list_lru_init(struct list_lru *lru); @@ -128,4 +151,37 @@ list_lru_walk(struct list_lru *lru, list_lru_walk_cb isolate, } return isolated; } + +#ifdef CONFIG_MEMCG_KMEM +int memcg_list_lru_init(struct memcg_list_lru *lru); +void memcg_list_lru_destroy(struct memcg_list_lru *lru); + +struct list_lru * +mem_cgroup_list_lru(struct memcg_list_lru *lru, struct mem_cgroup *memcg); +struct list_lru * +mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr); +#else +static inline int memcg_list_lru_init(struct memcg_list_lru *lru) +{ + return list_lru_init(>global_lru); +} + +static inline void memcg_list_lru_destroy(struct memcg_list_lru *lru) +{ + list_lru_destroy(>global_lru); +} + +static inline struct list_lru * +mem_cgroup_list_lru(struct memcg_list_lru *lru, struct mem_cgroup *memcg) +{ + return >global_lru; +} + +static inline struct list_lru * +mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr) +{ + return >global_lru; +} +#endif /* CONFIG_MEMCG_KMEM */ + #endif /* _LRU_LIST_H */ diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f5d7128..84f1ca3 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -55,6 +55,7 @@ #include #include #include +#include #include "internal.h" #include #include @@ -3249,6 +3250,8 @@ void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep) mutex_unlock(>slab_caches_mutex); } +static int memcg_update_all_lrus(int num_groups); + /* * This ends up being protected by the set_limit mutex, during normal * operation, because that is its main call site. 
@@ -3273,15 +3276,28 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg) */ memcg_kmem_set_activated(memcg); - ret = memcg_update_all_caches(num+1); - if (ret) { - ida_simple_remove(_limited_groups, num); - memcg_kmem_clear_activated(memcg); - return ret; - } + /* +* We need to update the memcg lru lists before we update the caches. +* Once the caches are updated, they will be able to start hosting +* objects. If a cache is created very quickly and an element is used +* and disposed to the lru quickly as well, we can end up with a NULL +* pointer dereference while trying to add a new element to a memcg +* lru. +*/ + ret = memcg_update_all_lrus(num + 1); + if (ret) + goto out; + + ret = memcg_update_all_caches(num + 1); + if
[PATCH v11 12/15] memcg: allow kmem limit to be resized down
From: Glauber Costa

The userspace memory limit can be freely resized down. Upon attempt,
reclaim will be called to flush the pages away until we either reach the
limit we want or give up.

It wasn't possible so far with the kmem limit, since we had no way to
shrink the kmem buffers other than using the big hammer of shrink_slab,
that effectively frees data around the whole system.

The situation flips now that we have a per-memcg shrinker
infrastructure. We will proceed analogously to our user memory
counterpart and try to shrink our buffers until we either reach the
limit we want or give up.

Signed-off-by: Glauber Costa
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Kamezawa Hiroyuki
---
 mm/memcontrol.c |   43 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b4f420..a605eb0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5581,10 +5581,39 @@ static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
 	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
 }
 
+#ifdef CONFIG_MEMCG_KMEM
+/*
+ * This is slightly different than res or memsw reclaim.  We already have
+ * vmscan behind us to drive the reclaim, so we can basically keep trying until
+ * all buffers that can be flushed are flushed. We have a very clear signal
+ * about it in the form of the return value of try_to_free_mem_cgroup_kmem.
+ */
+static int mem_cgroup_resize_kmem_limit(struct mem_cgroup *memcg,
+					unsigned long long val)
+{
+	int ret = -EBUSY;
+
+	for (;;) {
+		if (signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
+
+		ret = res_counter_set_limit(&memcg->kmem, val);
+		if (!ret)
+			break;
+
+		/* Can't free anything, pointless to continue */
+		if (!try_to_free_mem_cgroup_kmem(memcg, GFP_KERNEL))
+			break;
+	}
+
+	return ret;
+}
+
 static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
 {
 	int ret = -EINVAL;
-#ifdef CONFIG_MEMCG_KMEM
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	/*
 	 * For simplicity, we won't allow this to be disabled.  It also can't
@@ -5619,16 +5648,15 @@ static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
 		 * starts accounting before all call sites are patched
 		 */
 		memcg_kmem_set_active(memcg);
-	} else
-		ret = res_counter_set_limit(&memcg->kmem, val);
+	} else {
+		ret = mem_cgroup_resize_kmem_limit(memcg, val);
+	}
 out:
 	mutex_unlock(&set_limit_mutex);
 	mutex_unlock(&memcg_create_mutex);
-#endif
 	return ret;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
 static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 {
 	int ret = 0;
@@ -5665,6 +5693,11 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 out:
 	return ret;
 }
+#else
+static int memcg_update_kmem_limit(struct cgroup *cont, u64 val)
+{
+	return -EINVAL;
+}
 #endif /* CONFIG_MEMCG_KMEM */
 
 /*
-- 
1.7.10.4
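The retry loop in mem_cgroup_resize_kmem_limit() can be illustrated outside the kernel with a toy counter. This is only a sketch under stated assumptions: struct counter, counter_set_limit(), counter_reclaim() and counter_resize_limit() are hypothetical stand-ins for res_counter_set_limit() and try_to_free_mem_cgroup_kmem(), the reclaim step frees at most four units per call, and the signal_pending() check is left out.

```c
#include <errno.h>

/* Toy stand-in for a resource counter with some reclaimable usage. */
struct counter {
	unsigned long usage;		/* units currently charged */
	unsigned long limit;		/* current limit */
	unsigned long reclaimable;	/* part of usage that reclaim can free */
};

/* Lowering the limit below current usage fails, like res_counter_set_limit(). */
static int counter_set_limit(struct counter *c, unsigned long val)
{
	if (c->usage > val)
		return -EBUSY;
	c->limit = val;
	return 0;
}

/* Free up to 4 units; a return of 0 means nothing more can be freed. */
static unsigned long counter_reclaim(struct counter *c)
{
	unsigned long n = c->reclaimable < 4 ? c->reclaimable : 4;

	c->reclaimable -= n;
	c->usage -= n;
	return n;
}

/* Keep trying to set the limit until it succeeds or reclaim makes no progress. */
static int counter_resize_limit(struct counter *c, unsigned long val)
{
	int ret;

	for (;;) {
		ret = counter_set_limit(c, val);
		if (!ret)
			break;
		/* Can't free anything, pointless to continue */
		if (!counter_reclaim(c))
			break;
	}
	return ret;
}
```

The loop terminates because every pass either succeeds or strictly reduces the reclaimable amount, which is the "very clear signal" the patch comment refers to.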
[PATCH v11 11/15] super: make icache, dcache shrinkers memcg-aware
Using the per-memcg LRU infrastructure introduced by previous patches, this patch makes dcache and icache shrinkers memcg-aware. To achieve that, it converts s_dentry_lru and s_inode_lru from list_lru to memcg_list_lru and restricts the reclaim to per-memcg parts of the lists in case of memcg pressure. Other FS objects are currently ignored and only reclaimed on global pressure, because their shrinkers are heavily FS-specific and can't be converted to be memcg-aware so easily. However, we can pass on target memcg to the FS layer and let it decide if per-memcg objects should be reclaimed. Note that with this patch applied we lose global LRU order, but it does not appear to be a critical drawback, because global pressure should try to balance the amount reclaimed from all memcgs. On the other hand, preserving global LRU order would require an extra list_head added to each dentry and inode, which seems to be too costly. Signed-off-by: Vladimir Davydov Cc: Glauber Costa Cc: Dave Chinner Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michal Hocko Cc: Hugh Dickins Cc: Kamezawa Hiroyuki Cc: Andrew Morton --- fs/dcache.c| 25 +++-- fs/inode.c | 16 ++-- fs/internal.h |9 + fs/super.c | 45 - include/linux/fs.h |4 ++-- 5 files changed, 60 insertions(+), 39 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 4bdb300..e8499db 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -343,18 +343,24 @@ static void dentry_unlink_inode(struct dentry * dentry) #define D_FLAG_VERIFY(dentry,x) WARN_ON_ONCE(((dentry)->d_flags & (DCACHE_LRU_LIST | DCACHE_SHRINK_LIST)) != (x)) static void d_lru_add(struct dentry *dentry) { + struct list_lru *lru = + mem_cgroup_kmem_list_lru(>d_sb->s_dentry_lru, dentry); + D_FLAG_VERIFY(dentry, 0); dentry->d_flags |= DCACHE_LRU_LIST; this_cpu_inc(nr_dentry_unused); - WARN_ON_ONCE(!list_lru_add(>d_sb->s_dentry_lru, >d_lru)); + WARN_ON_ONCE(!list_lru_add(lru, >d_lru)); } static void d_lru_del(struct dentry *dentry) { + struct list_lru *lru = + 
mem_cgroup_kmem_list_lru(>d_sb->s_dentry_lru, dentry); + D_FLAG_VERIFY(dentry, DCACHE_LRU_LIST); dentry->d_flags &= ~DCACHE_LRU_LIST; this_cpu_dec(nr_dentry_unused); - WARN_ON_ONCE(!list_lru_del(>d_sb->s_dentry_lru, >d_lru)); + WARN_ON_ONCE(!list_lru_del(lru, >d_lru)); } static void d_shrink_del(struct dentry *dentry) @@ -970,9 +976,9 @@ dentry_lru_isolate(struct list_head *item, spinlock_t *lru_lock, void *arg) } /** - * prune_dcache_sb - shrink the dcache - * @sb: superblock - * @nr_to_scan : number of entries to try to free + * prune_dcache_lru - shrink the dcache + * @lru: dentry lru list + * @nr_to_scan: number of entries to try to free * @nid: which node to scan for freeable entities * * Attempt to shrink the superblock dcache LRU by @nr_to_scan entries. This is @@ -982,14 +988,13 @@ dentry_lru_isolate(struct list_head *item, spinlock_t *lru_lock, void *arg) * This function may fail to free any resources if all the dentries are in * use. */ -long prune_dcache_sb(struct super_block *sb, unsigned long nr_to_scan, -int nid) +long prune_dcache_lru(struct list_lru *lru, unsigned long nr_to_scan, int nid) { LIST_HEAD(dispose); long freed; - freed = list_lru_walk_node(>s_dentry_lru, nid, dentry_lru_isolate, - , _to_scan); + freed = list_lru_walk_node(lru, nid, dentry_lru_isolate, + , _to_scan); shrink_dentry_list(); return freed; } @@ -1029,7 +1034,7 @@ void shrink_dcache_sb(struct super_block *sb) do { LIST_HEAD(dispose); - freed = list_lru_walk(>s_dentry_lru, + freed = memcg_list_lru_walk_all(>s_dentry_lru, dentry_lru_isolate_shrink, , UINT_MAX); this_cpu_sub(nr_dentry_unused, freed); diff --git a/fs/inode.c b/fs/inode.c index 4bcdad3..f06a963 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -402,7 +402,10 @@ EXPORT_SYMBOL(ihold); static void inode_lru_list_add(struct inode *inode) { - if (list_lru_add(>i_sb->s_inode_lru, >i_lru)) + struct list_lru *lru = + mem_cgroup_kmem_list_lru(>i_sb->s_inode_lru, inode); + + if (list_lru_add(lru, >i_lru)) this_cpu_inc(nr_unused); 
} @@ -421,8 +424,10 @@ void inode_add_lru(struct inode *inode) static void inode_lru_list_del(struct inode *inode) { + struct list_lru *lru = + mem_cgroup_kmem_list_lru(>i_sb->s_inode_lru, inode); - if (list_lru_del(>i_sb->s_inode_lru, >i_lru)) + if (list_lru_del(lru, >i_lru)) this_cpu_dec(nr_unused); } @@ -748,14 +753,13 @@ inode_lru_isolate(struct list_head *item,
Re: [PATCH v2] uprobes: Add uprobe_task->dup_work/dup_addr
On 11/24, Masami Hiramatsu wrote:
>
> Ping?
>
> Is this already pulled?
> I think it is enough discussed and reviewed.

Yes, thanks, this is already in tip/perf/core.

Oleg.
[PATCH v11 15/15] memcg: flush memcg items upon memcg destruction
From: Glauber Costa

When a memcg is destroyed, it won't be immediately released until all
objects are gone. This means that if a memcg is restarted with the very
same workload - a very common case, the objects already cached won't be
billed to the new memcg. This is mostly undesirable since a container
can exploit this by restarting itself every time it reaches its limit,
and then coming up again with a fresh new limit.

Since now we have targeted reclaim, I sustain that we should assume that
a memcg that is destroyed should be flushed away. It makes perfect sense
if we assume that a memcg that goes away most likely indicates an
isolated workload that is terminated.

Signed-off-by: Glauber Costa
Cc: Mel Gorman
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Hugh Dickins
Cc: Kamezawa Hiroyuki
---
 mm/memcontrol.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3533d33..471b544 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6453,12 +6453,29 @@ static void memcg_destroy_kmem(struct mem_cgroup *memcg)
 
 static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
 {
+	int ret;
 	if (!memcg_kmem_is_active(memcg))
 		return;
 
 	cancel_work_sync(&memcg->kmemcg_shrink_work);
 
 	/*
+	 * When a memcg is destroyed, it won't be imediately released until all
+	 * objects are gone. This means that if a memcg is restarted with the
+	 * very same workload - a very common case, the objects already cached
+	 * won't be billed to the new memcg. This is mostly undesirable since a
+	 * container can exploit this by restarting itself every time it
+	 * reaches its limit, and then coming up again with a fresh new limit.
+	 *
+	 * Therefore a memcg that is destroyed should be flushed away. It makes
+	 * perfect sense if we assume that a memcg that goes away indicates an
+	 * isolated workload that is terminated.
+	 */
+	do {
+		ret = try_to_free_mem_cgroup_kmem(memcg, GFP_KERNEL);
+	} while (ret);
+
+	/*
 	 * kmem charges can outlive the cgroup. In the case of slab
 	 * pages, for instance, a page contain objects from various
 	 * processes. As we prevent from taking a reference for every
-- 
1.7.10.4
[PATCH v11 14/15] memcg: reap dead memcgs upon global memory pressure
From: Glauber Costa When we delete kmem-enabled memcgs, they can still be zombieing around for a while. The reason is that the objects may still be alive, and we won't be able to delete them at destruction time. The only entry point for that, though, are the shrinkers. The shrinker interface, however, is not exactly tailored to our needs. It could be a little bit better by using the API Dave Chinner proposed, but it is still not ideal since we aren't really a count-and-scan event, but more a one-off flush-all-you-can event that would have to abuse that somehow. Signed-off-by: Glauber Costa Cc: Anton Vorontsov Cc: John Stultz Cc: Andrew Morton Cc: Michal Hocko Cc: Kamezawa Hiroyuki Cc: Johannes Weiner --- mm/memcontrol.c | 80 --- 1 file changed, 77 insertions(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index a605eb0..3533d33 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -287,8 +287,16 @@ struct mem_cgroup { /* thresholds for mem+swap usage. RCU-protected */ struct mem_cgroup_thresholds memsw_thresholds; - /* For oom notifier event fd */ - struct list_head oom_notify; + union { + /* For oom notifier event fd */ + struct list_head oom_notify; + /* +* we can only trigger an oom event if the memcg is alive. +* so we will reuse this field to hook the memcg in the list +* of dead memcgs. 
+*/ + struct list_head dead; + }; /* * Should we move charges of a task when a task is moved into this @@ -338,6 +346,29 @@ struct mem_cgroup { /* WARNING: nodeinfo must be the last member here */ }; +#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MEMCG_SWAP) +static LIST_HEAD(dangling_memcgs); +static DEFINE_MUTEX(dangling_memcgs_mutex); + +static inline void memcg_dangling_del(struct mem_cgroup *memcg) +{ + mutex_lock(_memcgs_mutex); + list_del(>dead); + mutex_unlock(_memcgs_mutex); +} + +static inline void memcg_dangling_add(struct mem_cgroup *memcg) +{ + INIT_LIST_HEAD(>dead); + mutex_lock(_memcgs_mutex); + list_add(>dead, _memcgs); + mutex_unlock(_memcgs_mutex); +} +#else +static inline void memcg_dangling_free(struct mem_cgroup *memcg) {} +static inline void memcg_dangling_add(struct mem_cgroup *memcg) {} +#endif + static size_t memcg_size(void) { return sizeof(struct mem_cgroup) + @@ -6364,6 +6395,41 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css, } #ifdef CONFIG_MEMCG_KMEM +static void memcg_vmpressure_shrink_dead(void) +{ + struct memcg_cache_params *params, *tmp; + struct kmem_cache *cachep; + struct mem_cgroup *memcg; + + mutex_lock(_memcgs_mutex); + list_for_each_entry(memcg, _memcgs, dead) { + mutex_lock(>slab_caches_mutex); + /* The element may go away as an indirect result of shrink */ + list_for_each_entry_safe(params, tmp, +>memcg_slab_caches, list) { + cachep = memcg_params_to_cache(params); + /* +* the cpu_hotplug lock is taken in kmem_cache_create +* outside the slab_caches_mutex manipulation. It will +* be taken by kmem_cache_shrink to flush the cache. +* So we need to drop the lock. It is all right because +* the lock only protects elements moving in and out the +* list. 
+*/ + mutex_unlock(>slab_caches_mutex); + kmem_cache_shrink(cachep); + mutex_lock(>slab_caches_mutex); + } + mutex_unlock(>slab_caches_mutex); + } + mutex_unlock(_memcgs_mutex); +} + +static void memcg_register_kmem_events(struct cgroup_subsys_state *css) +{ + vmpressure_register_kernel_event(css, memcg_vmpressure_shrink_dead); +} + static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss) { int ret; @@ -6421,6 +6487,10 @@ static void kmem_cgroup_css_offline(struct mem_cgroup *memcg) css_put(>css); } #else +static inline void memcg_register_kmem_events(struct cgroup *cont) +{ +} + static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss) { return 0; @@ -6759,8 +6829,10 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css) if (css->cgroup->id > MEM_CGROUP_ID_MAX) return -ENOSPC; - if (!parent) + if (!parent) { + memcg_register_kmem_events(css); return 0; + } mutex_lock(_create_mutex); @@ -6822,6 +6894,7 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
[PATCH v11 13/15] vmpressure: in-kernel notifications
From: Glauber Costa During the past weeks, it became clear to us that the shrinker interface we have right now works very well for some particular types of users, but not that well for others. The later are usually people interested in one-shot notifications, that were forced to adapt themselves to the count+scan behavior of shrinkers. To do so, they had no choice than to greatly abuse the shrinker interface producing little monsters all over. During LSF/MM, one of the proposals that popped out during our session was to reuse Anton Voronstsov's vmpressure for this. They are designed for userspace consumption, but also provide a well-stablished, cgroup-aware entry point for notifications. This patch extends that to also support in-kernel users. Events that should be generated for in-kernel consumption will be marked as such, and for those, we will call a registered function instead of triggering an eventfd notification. Please note that due to my lack of understanding of each shrinker user, I will stay away from converting the actual users, you are all welcome to do so. Signed-off-by: Glauber Costa Acked-by: Anton Vorontsov Acked-by: Pekka Enberg Reviewed-by: Greg Thelen Cc: Dave Chinner Cc: John Stultz Cc: Andrew Morton Cc: Joonsoo Kim Cc: Michal Hocko Cc: Kamezawa Hiroyuki Cc: Johannes Weiner --- include/linux/vmpressure.h |5 + mm/vmpressure.c| 53 +--- 2 files changed, 55 insertions(+), 3 deletions(-) diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h index 3f3788d..9102e53 100644 --- a/include/linux/vmpressure.h +++ b/include/linux/vmpressure.h @@ -19,6 +19,9 @@ struct vmpressure { /* Have to grab the lock on events traversal or modifications. */ struct mutex events_lock; + /* False if only kernel users want to be notified, true otherwise. 
*/ + bool notify_userspace; + struct work_struct work; }; @@ -38,6 +41,8 @@ extern int vmpressure_register_event(struct cgroup_subsys_state *css, struct cftype *cft, struct eventfd_ctx *eventfd, const char *args); +extern int vmpressure_register_kernel_event(struct cgroup_subsys_state *css, + void (*fn)(void)); extern void vmpressure_unregister_event(struct cgroup_subsys_state *css, struct cftype *cft, struct eventfd_ctx *eventfd); diff --git a/mm/vmpressure.c b/mm/vmpressure.c index e0f6283..730e7c1 100644 --- a/mm/vmpressure.c +++ b/mm/vmpressure.c @@ -130,8 +130,12 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned, } struct vmpressure_event { - struct eventfd_ctx *efd; + union { + struct eventfd_ctx *efd; + void (*fn)(void); + }; enum vmpressure_levels level; + bool kernel_event; struct list_head node; }; @@ -147,12 +151,15 @@ static bool vmpressure_event(struct vmpressure *vmpr, mutex_lock(>events_lock); list_for_each_entry(ev, >events, node) { - if (level >= ev->level) { + if (ev->kernel_event) { + ev->fn(); + } else if (vmpr->notify_userspace && level >= ev->level) { eventfd_signal(ev->efd, 1); signalled = true; } } + vmpr->notify_userspace = false; mutex_unlock(>events_lock); return signalled; @@ -222,7 +229,7 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, * we account it too. */ if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS))) - return; + goto schedule; /* * If we got here with no pages scanned, then that is an indicator @@ -239,8 +246,15 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, vmpr->scanned += scanned; vmpr->reclaimed += reclaimed; scanned = vmpr->scanned; + /* +* If we didn't reach this point, only kernel events will be triggered. +* It is the job of the worker thread to clean this up once the +* notifications are all delivered. 
+*/ + vmpr->notify_userspace = true; spin_unlock(>sr_lock); +schedule: if (scanned < vmpressure_win) return; schedule_work(>work); @@ -324,6 +338,39 @@ int vmpressure_register_event(struct cgroup_subsys_state *css, } /** + * vmpressure_register_kernel_event() - Register kernel-side notification + * @css: css that is interested in vmpressure notifications + * @fn:function to be called when pressure happens + * + * This function register in-kernel users interested in receiving notifications + * about pressure conditions. Pressure
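The dispatch rule the vmpressure patch introduces (kernel events always fire, userspace events fire only when userspace notification is pending and the level is high enough) can be sketched in plain C. Everything here is a hypothetical model rather than the kernel code: struct event, dispatch(), the int counter standing in for an eventfd, and the count_kernel_call() callback are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum level { LEVEL_LOW, LEVEL_MEDIUM, LEVEL_CRITICAL };

/* One registered listener: either a userspace sink or an in-kernel callback. */
struct event {
	union {
		int *user_counter;	/* stand-in for the eventfd context */
		void (*fn)(void);	/* in-kernel callback */
	};
	enum level level;		/* threshold, used for userspace events only */
	bool kernel_event;		/* selects which union member is valid */
};

/* Sample callback so the sketch has something to call. */
static int kernel_calls;

static void count_kernel_call(void)
{
	kernel_calls++;
}

/* Walk all listeners; return true if any userspace listener was signalled. */
static bool dispatch(struct event *evs, size_t n, enum level level,
		     bool notify_userspace)
{
	bool signalled = false;

	for (size_t i = 0; i < n; i++) {
		struct event *ev = &evs[i];

		if (ev->kernel_event) {
			ev->fn();
		} else if (notify_userspace && level >= ev->level) {
			(*ev->user_counter)++;
			signalled = true;
		}
	}
	return signalled;
}
```

The kernel_event flag acts as the discriminator for the union, so a listener never has its callback pointer interpreted as an eventfd or the other way around.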
[PATCH v11 02/15] memcg: consolidate callers of memcg_cache_id
From: Glauber Costa

Each caller of memcg_cache_id ends up sanitizing its parameters in its
own way. Now that the memcg_cache_id itself is more robust, we can
consolidate this.

Also, as suggested by Michal, a special helper memcg_cache_idx is used
when the result is expected to be used directly as an array index to
make sure we never access a negative index.

Signed-off-by: Glauber Costa
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Kamezawa Hiroyuki
---
 mm/memcontrol.c |   49 +++++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 02b5176..144cb4c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2960,6 +2960,30 @@ static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
 }
 
 /*
+ * helper for acessing a memcg's index. It will be used as an index in the
+ * child cache array in kmem_cache, and also to derive its name. This function
+ * will return -1 when this is not a kmem-limited memcg.
+ */
+int memcg_cache_id(struct mem_cgroup *memcg)
+{
+	if (!memcg || !memcg_can_account_kmem(memcg))
+		return -1;
+	return memcg->kmemcg_id;
+}
+
+/*
+ * This helper around memcg_cache_id is not intented for use outside memcg
+ * core. It is meant for places where the cache id is used directly as an array
+ * index
+ */
+static int memcg_cache_idx(struct mem_cgroup *memcg)
+{
+	int ret = memcg_cache_id(memcg);
+	BUG_ON(ret < 0);
+	return ret;
+}
+
+/*
  * This is a bit cumbersome, but it is rarely used and avoids a backpointer
  * in the memcg_cache_params struct.
  */
@@ -2969,7 +2993,7 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
 
 	VM_BUG_ON(p->is_root_cache);
 	cachep = p->root_cache;
-	return cache_from_memcg_idx(cachep, memcg_cache_id(p->memcg));
+	return cache_from_memcg_idx(cachep, memcg_cache_idx(p->memcg));
 }
 
 #ifdef CONFIG_SLABINFO
@@ -3067,18 +3091,6 @@ void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep)
 }
 
 /*
- * helper for acessing a memcg's index. It will be used as an index in the
- * child cache array in kmem_cache, and also to derive its name. This function
- * will return -1 when this is not a kmem-limited memcg.
- */
-int memcg_cache_id(struct mem_cgroup *memcg)
-{
-	if (!memcg || !memcg_can_account_kmem(memcg))
-		return -1;
-	return memcg->kmemcg_id;
-}
-
-/*
  * This ends up being protected by the set_limit mutex, during normal
  * operation, because that is its main call site.
  *
@@ -3240,7 +3252,7 @@ void memcg_release_cache(struct kmem_cache *s)
 		goto out;
 
 	memcg = s->memcg_params->memcg;
-	id = memcg_cache_id(memcg);
+	id = memcg_cache_idx(memcg);
 
 	root = s->memcg_params->root_cache;
 	root->memcg_params->memcg_caches[id] = NULL;
@@ -3403,9 +3415,7 @@ static struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
 	struct kmem_cache *new_cachep;
 	int idx;
 
-	BUG_ON(!memcg_can_account_kmem(memcg));
-
-	idx = memcg_cache_id(memcg);
+	idx = memcg_cache_idx(memcg);
 
 	mutex_lock(&memcg_cache_mutex);
 	new_cachep = cache_from_memcg_idx(cachep, idx);
@@ -3578,10 +3588,9 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep,
 	rcu_read_lock();
 	memcg = mem_cgroup_from_task(rcu_dereference(current->mm->owner));
 
-	if (!memcg_can_account_kmem(memcg))
-		goto out;
-
 	idx = memcg_cache_id(memcg);
+	if (idx < 0)
+		goto out;
 
 	/*
 	 * barrier to mare sure we're always seeing the up to date value.  The
-- 
1.7.10.4
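The split between a tolerant lookup that may return -1 and a strict variant meant for direct array indexing can be sketched in userspace as follows. The names struct group, group_cache_id() and group_cache_idx() are hypothetical, and assert() stands in for the kernel's BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memcg: only the fields the two helpers look at. */
struct group {
	int kmem_active;	/* non-zero if kmem accounting is enabled */
	int cache_id;		/* index into per-group arrays when active */
};

/* Tolerant lookup: returns -1 for NULL or non-accounted groups. */
static int group_cache_id(const struct group *g)
{
	if (!g || !g->kmem_active)
		return -1;
	return g->cache_id;
}

/* Strict lookup for callers that index an array with the result. */
static size_t group_cache_idx(const struct group *g)
{
	int id = group_cache_id(g);

	assert(id >= 0);	/* a negative index here is a caller bug */
	return (size_t)id;
}
```

Callers that can legitimately see a non-accounted group check the sign of group_cache_id() themselves, as __memcg_kmem_get_cache() does in the patch, while callers that already know the group is accounted use the strict form.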
Re: [PATCH 3/4] printk: Defer printing to irq work when we printed too much
On Fri 22-11-13 15:27:11, Andrew Morton wrote:
> On Fri, 8 Nov 2013 11:21:13 +0100 Jan Kara wrote:
>
> > On Fri 08-11-13 00:46:49, Frederic Weisbecker wrote:
> > > On Thu, Nov 07, 2013 at 06:37:17PM -0500, Steven Rostedt wrote:
> > > > On Fri, 8 Nov 2013 00:21:51 +0100
> > > > Frederic Weisbecker wrote:
> > > >
> > > > > Offloading to a workqueue would be perhaps better, and writing to the serial
> > > > > console could then be done with interrupts enabled, preemptible context, etc...
> > > >
> > > > Oh God no ;-) Adding workqueue logic into printk just spells a
> > > > nightmare of much more complexity for a critical kernel infrastructure.
> > >
> > > But yeah that's scary, that means workqueues itself can't printk that safely.
> > > So, you're right after all.
> > Yeah, we've been there (that was actually my initial proposal). But
> > Andrew and Steven (rightfully) objected and suggested irq_work should be
> > used instead.
>
> I still hate the patchset and so does everyone else, including you ;)
> There must be something smarter we can do. Let's start by restating
> the problem:
>
> CPU A is in printk, emitting log_buf characters to a slow device.
> Meanwhile other CPUs come into printk(), see that the system is busy,
> dump their load into log_buf then scram, leaving CPU A to do even more
> work.
>
> Correct so far?

Yes, correct.

> If so, what is the role of local_irq_disabled() in this? Did CPU A
> call printk() with local interrupts disabled, or is printk (or the
> console driver) causing the irqs-off condition? Where and why is this
> IRQ disablement happening?

So there are a couple of places where we disable interrupts.

a) call_console_drivers(), which does the printing to console, is always
   called with interrupts disabled. Commonly it is called from
   console_unlock(), which takes care of disabling interrupts. I presume
   this is because we want to guard against interrupts doing something
   unexpected with the console while we are printing to it. But I don't
   really understand console drivers to be sure...

b) vprintk_emit() (which is the function usually calling console_unlock())
   also disables interrupts to make updates of log_buf interrupt safe. It
   calls console_unlock() with interrupts disabled, which seems to be
   unnecessary as that function takes care of disabling interrupts itself.
   It makes the situation somewhat worse because console_unlock() could
   otherwise enable interrupts from time to time. That being said, I've
   tried to fix this shortcoming in previous versions of the patch set but
   it didn't seem to make a difference - maybe
	local_irq_restore(flags);
	spin_lock_irqsave(&logbuf_lock, flags);
   which is what console_unlock() does, doesn't give APIC enough time to
   deliver blocked interrupts.

c) printk() itself is sometimes called with interrupts disabled. This
   happens a lot, for example, from sysrq handlers, which is sometimes
   unpleasant (sysrq-s simply kills large machines) but not a primary
   concern for me. It doesn't seem to happen too often after early boot is
   finished (in particular SCSI messages which make machines unbootable
   seem to be generated from kernel thread context). But there are some
   messages like this and if we are unlucky and we get caught in such a
   printk, the machine dies. So I believe we have to reliably handle a
   situation when printk() itself gets called with interrupts disabled.

> Could we fix this problem by not permitting CPUs B, C and D to DoS CPU
> A? When CPU B comes into printk() and sees that printk is busy, make
> CPU A hand over to CPU B and let CPU A get out of there?

We could. In fact I was proposing this in
https://lkml.org/lkml/2013/9/5/329

It has the advantage that we won't rely on irq work. If we changed
console_trylock() to console_lock() in console_trylock_for_printk() and
made console_unlock() only print the messages in log_buf on function
entry, it would even make things simpler, but it would basically undo your
change from ages ago and I'm not sure about the consequences. All
printk()s could suddenly block much more since printk() would essentially
become completely synchronous.

We could try some more fancy compromise between the current "completely
async printk" and the ancient "completely synchronous printk", but then it
gets more complex, and so far dependence on irq work seemed like the lesser
evil to me.

								Honza
-- 
Jan Kara
SUSE Labs, CR
[PATCH v11 05/15] memcg: move stop and resume accounting functions
From: Glauber Costa I need to move this up a bit, and I am doing it in a separate patch just to reduce churn in the patch that needs it. Signed-off-by: Glauber Costa Cc: Johannes Weiner Cc: Michal Hocko Cc: Hugh Dickins Cc: Kamezawa Hiroyuki Cc: Andrew Morton --- mm/memcontrol.c | 62 +++ 1 file changed, 31 insertions(+), 31 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 9ba9975..e9bdcf3 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3010,6 +3010,37 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p) return cache_from_memcg_idx(cachep, memcg_cache_idx(p->memcg)); } +/* + * During the creation a new cache, we need to disable our accounting mechanism + * altogether. This is true even if we are not creating, but rather just + * enqueing new caches to be created. + * + * This is because that process will trigger allocations; some visible, like + * explicit kmallocs to auxiliary data structures, name strings and internal + * cache structures; some well concealed, like INIT_WORK() that can allocate + * objects during debug. + * + * If any allocation happens during memcg_kmem_get_cache, we will recurse back + * to it. This may not be a bounded recursion: since the first cache creation + * failed to complete (waiting on the allocation), we'll just try to create the + * cache again, failing at the same point. + * + * memcg_kmem_get_cache is prepared to abort after seeing a positive count of + * memcg_kmem_skip_account. So we enclose anything that might allocate memory + * inside the following two functions. 
+ */ +static inline void memcg_stop_kmem_account(void) +{ + VM_BUG_ON(!current->mm); + current->memcg_kmem_skip_account++; +} + +static inline void memcg_resume_kmem_account(void) +{ + VM_BUG_ON(!current->mm); + current->memcg_kmem_skip_account--; +} + #ifdef CONFIG_SLABINFO static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css, struct cftype *cft, struct seq_file *m) @@ -3278,37 +3309,6 @@ out: kfree(s->memcg_params); } -/* - * During the creation a new cache, we need to disable our accounting mechanism - * altogether. This is true even if we are not creating, but rather just - * enqueing new caches to be created. - * - * This is because that process will trigger allocations; some visible, like - * explicit kmallocs to auxiliary data structures, name strings and internal - * cache structures; some well concealed, like INIT_WORK() that can allocate - * objects during debug. - * - * If any allocation happens during memcg_kmem_get_cache, we will recurse back - * to it. This may not be a bounded recursion: since the first cache creation - * failed to complete (waiting on the allocation), we'll just try to create the - * cache again, failing at the same point. - * - * memcg_kmem_get_cache is prepared to abort after seeing a positive count of - * memcg_kmem_skip_account. So we enclose anything that might allocate memory - * inside the following two functions. - */ -static inline void memcg_stop_kmem_account(void) -{ - VM_BUG_ON(!current->mm); - current->memcg_kmem_skip_account++; -} - -static inline void memcg_resume_kmem_account(void) -{ - VM_BUG_ON(!current->mm); - current->memcg_kmem_skip_account--; -} - static void kmem_cache_destroy_work_func(struct work_struct *w) { struct kmem_cache *cachep; -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
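The recursion that the moved comment describes can be illustrated with a small user-space model. This is only a sketch: `struct task_model`, `model_alloc()` and `get_cache()` are hypothetical stand-ins invented for illustration, not kernel code. It assumes that every allocation goes through the cache lookup, and that cache creation itself allocates.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-task state; not the kernel's task_struct. */
struct task_model {
	int kmem_skip_account;	/* > 0 means "do not account, do not recurse" */
	int cache_creations;	/* how many times cache creation was started */
};

static void stop_kmem_account(struct task_model *t)
{
	t->kmem_skip_account++;
}

static void resume_kmem_account(struct task_model *t)
{
	assert(t->kmem_skip_account > 0);	/* an unbalanced resume is a bug */
	t->kmem_skip_account--;
}

static void get_cache(struct task_model *t);

/* Models any allocation made on behalf of the task. */
static void model_alloc(struct task_model *t)
{
	get_cache(t);
}

/* Models the cache lookup: start creating a cache unless told to skip. */
static void get_cache(struct task_model *t)
{
	if (t->kmem_skip_account > 0)
		return;			/* abort early instead of recursing */

	stop_kmem_account(t);
	t->cache_creations++;
	model_alloc(t);			/* creation itself allocates */
	resume_kmem_account(t);
}
```

In this model a single `model_alloc()` call starts exactly one cache creation and leaves the counter balanced at zero; without the early return in `get_cache()` the two functions would call each other without bound.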
[PATCH v11 01/15] memcg: make cache index determination more robust
From: Glauber Costa

I caught myself doing something like the following outside memcg core:

	memcg_id = -1;
	if (memcg && memcg_kmem_is_active(memcg))
		memcg_id = memcg_cache_id(memcg);

to be able to handle all possible memcgs in a sane manner. In particular, the
root cache will have kmemcg_id = -1 (just because we don't call memcg_kmem_init
for the root cache since it is not limitable). We have always coped with that by
making sure we sanitize which cache is passed to memcg_cache_id. Although this
example is given for root, what we really need to know is whether or not a
cache is kmem active.

But outside the memcg core, testing for root, for instance, is not trivial since
we don't export mem_cgroup_is_root. I ended up realizing that these tests really
belong inside memcg_cache_id. This patch moves a similar but stronger test
inside memcg_cache_id and makes sure it always returns a meaningful value.

Signed-off-by: Glauber Costa
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Kamezawa Hiroyuki
---
 mm/memcontrol.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f1a0ae6..02b5176 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3073,7 +3073,9 @@ void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep)
  */
 int memcg_cache_id(struct mem_cgroup *memcg)
 {
-	return memcg ? memcg->kmemcg_id : -1;
+	if (!memcg || !memcg_can_account_kmem(memcg))
+		return -1;
+	return memcg->kmemcg_id;
 }
 
 /*
-- 
1.7.10.4
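The effect of moving the check into the accessor can be sketched in user space. The struct below is a hypothetical, reduced model: `kmem_active` merely stands in for whatever memcg_can_account_kmem() tests, and `cache_id()` is an illustrative name, not the kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of struct mem_cgroup. */
struct memcg_model {
	bool kmem_active;	/* stands in for memcg_can_account_kmem() */
	int kmemcg_id;		/* only meaningful when kmem_active is true */
};

/* Callers no longer need to pre-check the memcg: any input gets a sane id. */
static int cache_id(const struct memcg_model *memcg)
{
	if (!memcg || !memcg->kmem_active)
		return -1;
	return memcg->kmemcg_id;
}
```

With this shape, a NULL pointer and a memcg that is not kmem active both yield -1, even if the inactive one happens to carry a stale id, so the open-coded test from the commit message becomes unnecessary at the call sites.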
Re: [PATCH] firmware/dmi_scan: generalize for use by other archs
Hello all, Resending this patch to a slightly wider audience. The point of this patch is reworking the dmi_scan code slightly so it can be reused on ARM and arm64. There are no functional changes for x86 or IA-64, just one open question, i.e., whether the non-EFI fallback probe should be performed on IA-64 in the first place. If I could get acks for this patch please (if there are no objections), I will propose it to be merged through the ARM and/or arm64 trees as part of the complete series to enable SMBIOS. Regards, Ard. On 21 November 2013 12:40, Ard Biesheuvel wrote: > This patch makes a couple of changes to the SMBIOS/DMI scanning > code so it can be used on other archs (such as ARM and arm64): > (a) wrap the calls to ioremap()/iounmap(), this allows the use of a > flavor of ioremap() more suitable for random unaligned access; > (b) allow the non-EFI fallback probe into hardcoded physical address > 0xF to be disabled. > > Signed-off-by: Ard Biesheuvel > --- > > @Tony: does the fallback probe make any sense at all on IA-64? It was enabled > before, so I added the #define for IA-64 as well, but perhaps we could remove > it > altogether? 
> > arch/ia64/include/asm/dmi.h | 10 +++--- > arch/x86/include/asm/dmi.h | 8 ++-- > drivers/firmware/dmi_scan.c | 20 +++- > 3 files changed, 24 insertions(+), 14 deletions(-) > > diff --git a/arch/ia64/include/asm/dmi.h b/arch/ia64/include/asm/dmi.h > index 185d3d1..61e3b56 100644 > --- a/arch/ia64/include/asm/dmi.h > +++ b/arch/ia64/include/asm/dmi.h > @@ -5,8 +5,12 @@ > #include > > /* Use normal IO mappings for DMI */ > -#define dmi_ioremap ioremap > -#define dmi_iounmap(x,l) iounmap(x) > -#define dmi_alloc(l) kzalloc(l, GFP_ATOMIC) > +#define dmi_early_remapioremap > +#define dmi_early_unmap(x,l) iounmap(x) > +#define dmi_remap ioremap > +#define dmi_unmap iounmap > +#define dmi_alloc(l) kzalloc(l, GFP_ATOMIC) > + > +#define DMI_SCAN_MACHINE_NON_EFI_FALLBACK 1 > > #endif > diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h > index fd8f9e2..bb2b572 100644 > --- a/arch/x86/include/asm/dmi.h > +++ b/arch/x86/include/asm/dmi.h > @@ -13,7 +13,11 @@ static __always_inline __init void *dmi_alloc(unsigned len) > } > > /* Use early IO mappings for DMI because it's initialized early */ > -#define dmi_ioremap early_ioremap > -#define dmi_iounmap early_iounmap > +#define dmi_early_remapearly_ioremap > +#define dmi_early_unmapearly_iounmap > +#define dmi_remap ioremap > +#define dmi_unmap iounmap > + > +#define DMI_SCAN_MACHINE_NON_EFI_FALLBACK 1 > > #endif /* _ASM_X86_DMI_H */ > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c > index fa0affb..2c7c793 100644 > --- a/drivers/firmware/dmi_scan.c > +++ b/drivers/firmware/dmi_scan.c > @@ -108,7 +108,7 @@ static int __init dmi_walk_early(void (*decode)(const > struct dmi_header *, > { > u8 *buf; > > - buf = dmi_ioremap(dmi_base, dmi_len); > + buf = dmi_early_remap(dmi_base, dmi_len); > if (buf == NULL) > return -1; > > @@ -116,7 +116,7 @@ static int __init dmi_walk_early(void (*decode)(const > struct dmi_header *, > > add_device_randomness(buf, dmi_len); > > - dmi_iounmap(buf, dmi_len); > 
+ dmi_early_unmap(buf, dmi_len); > return 0; > } > > @@ -483,18 +483,19 @@ void __init dmi_scan_machine(void) > * needed during early boot. This also means we can > * iounmap the space when we're done with it. > */ > - p = dmi_ioremap(efi.smbios, 32); > + p = dmi_early_remap(efi.smbios, 32); > if (p == NULL) > goto error; > memcpy_fromio(buf, p, 32); > - dmi_iounmap(p, 32); > + dmi_early_unmap(p, 32); > > if (!dmi_present(buf)) { > dmi_available = 1; > goto out; > } > } else { > - p = dmi_ioremap(0xF, 0x1); > +#ifdef DMI_SCAN_MACHINE_NON_EFI_FALLBACK > + p = dmi_early_remap(0xF, 0x1); > if (p == NULL) > goto error; > > @@ -510,12 +511,13 @@ void __init dmi_scan_machine(void) > memcpy_fromio(buf + 16, q, 16); > if (!dmi_present(buf)) { > dmi_available = 1; > - dmi_iounmap(p, 0x1); > + dmi_early_unmap(p, 0x1); > goto out; > } > memcpy(buf, buf + 16, 16); > } > - dmi_iounmap(p, 0x1); > + dmi_early_unmap(p, 0x1); > +#endif > } > error: >
Re: [PATCH 1/6] watchdog: davinci: change driver to use WDT core
On 11/25/2013 01:56 PM, Sekhar Nori wrote:
> On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
>> @@ -211,29 +129,34 @@ static int davinci_wdt_probe(struct platform_device *pdev)
>>
>> 	clk_prepare_enable(wdt_clk);
>>
>> -	if (heartbeat < 1 || heartbeat > MAX_HEARTBEAT)
>> -		heartbeat = DEFAULT_HEARTBEAT;
>> +	wdd = _wdd;
>> +	wdd->info= _wdt_info;
>> +	wdd->ops = _wdt_ops;
>> +	wdd->min_timeout = 1;
>> +	wdd->max_timeout = MAX_HEARTBEAT;
>
> Some checkpatch warnings. Please fix.
>
> WARNING: please, no space before tabs
> #273: FILE: drivers/watchdog/davinci_wdt.c:135:
> +^Iwdd->min_timeout ^I= 1;$
>
> WARNING: please, no space before tabs
> #274: FILE: drivers/watchdog/davinci_wdt.c:136:
> +^Iwdd->max_timeout ^I= MAX_HEARTBEAT;$
>
> total: 0 errors, 2 warnings, 0 checks, 249 lines checked
>
> Thanks,
> sekhar

Thanks, I will

-- 
Regards,
Ivan Khoronzhuk
[PATCH] ATA: Fix port removal ordering
From: Rafael J. Wysocki After commit bcdde7e221a8 (sysfs: make __sysfs_remove_dir() recursive) Mika Westerberg sees traces analogous to the one below in Thunderbolt hot-remove testing: WARNING: CPU: 0 PID: 4 at fs/sysfs/group.c:214 sysfs_remove_group+0xc6/0xd0() sysfs group 81c6f1e0 not found for kobject 'host7' Modules linked in: CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.12.0+ #13 Hardware name: /D33217CK, BIOS GKPPT10H.86A.0042.2013.0422.1439 04/22/2013 Workqueue: kacpi_hotplug acpi_hotplug_work_fn 0009 8801002459b0 817daab1 8801002459f8 8801002459e8 810436b8 81c6f1e0 88006d440358 88006d440188 88006e8b4c28 880100245a48 Call Trace: [] dump_stack+0x45/0x56 [] warn_slowpath_common+0x78/0xa0 [] warn_slowpath_fmt+0x47/0x50 [] ? sysfs_get_dirent_ns+0x49/0x70 [] sysfs_remove_group+0xc6/0xd0 [] dpm_sysfs_remove+0x3e/0x50 [] device_del+0x40/0x1b0 [] device_unregister+0xd/0x20 [] scsi_remove_host+0xba/0x110 [] ata_host_detach+0xc6/0x100 [] ata_pci_remove_one+0x18/0x20 [] pci_device_remove+0x28/0x60 [] __device_release_driver+0x64/0xd0 [] device_release_driver+0x1e/0x30 [] bus_remove_device+0xf7/0x140 [] device_del+0x121/0x1b0 [] pci_stop_bus_device+0x94/0xa0 [] pci_stop_bus_device+0x3b/0xa0 [] pci_stop_bus_device+0x3b/0xa0 [] pci_stop_and_remove_bus_device+0xd/0x20 [] trim_stale_devices+0x73/0xe0 [] trim_stale_devices+0xbb/0xe0 [] trim_stale_devices+0xbb/0xe0 [] acpiphp_check_bridge+0x7e/0xd0 [] hotplug_event+0xcd/0x160 [] hotplug_event_work+0x25/0x60 [] acpi_hotplug_work_fn+0x17/0x22 [] process_one_work+0x17a/0x430 [] worker_thread+0x119/0x390 [] ? manage_workers.isra.25+0x2a0/0x2a0 [] kthread+0xcd/0xf0 [] ? kthread_create_on_node+0x180/0x180 [] ret_from_fork+0x7c/0xb0 [] ? kthread_create_on_node+0x180/0x180 The source of this problem is that SCSI hosts are removed from ATA ports after calling ata_tport_delete() which removes the port's sysfs directory, among other things. 
Now, after commit bcdde7e221a8, the sysfs directory is removed along with all
of its subdirectories, which at this point include the SCSI host's sysfs
directory and its subdirectories. Consequently, when device_del() is finally
called for any child device of the SCSI host and tries to remove its "power"
group (which is already gone by then), it triggers the above warning.

To make the warnings go away, change the removal ordering in ata_port_detach()
so that the SCSI host is removed from the port before ata_tport_delete() is
called.

References: https://bugzilla.kernel.org/show_bug.cgi?id=65281
Reported-and-tested-by: Mika Westerberg
Signed-off-by: Rafael J. Wysocki
---

Hi,

This along with https://patchwork.kernel.org/patch/3226081/ makes all of the
warnings observed by Mika go away without the patch at
https://patchwork.kernel.org/patch/3201841/ applied.

Thanks,
Rafael

---
 drivers/ata/libata-core.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-pm/drivers/ata/libata-core.c
===================================================================
--- linux-pm.orig/drivers/ata/libata-core.c
+++ linux-pm/drivers/ata/libata-core.c
@@ -6304,10 +6304,9 @@ static void ata_port_detach(struct ata_p
 		for (i = 0; i < SATA_PMP_MAX_PORTS; i++)
 			ata_tlink_delete(&ap->pmp_link[i]);
 	}
-	ata_tport_delete(ap);
-
 	/* remove the associated SCSI host */
 	scsi_remove_host(ap->scsi_host);
+	ata_tport_delete(ap);
 }
 
 /**
Re: [GIT PULL] ima: bug fixes for Linus
On Mon, 2013-11-25 at 13:14 +1100, James Morris wrote: > On Sun, 24 Nov 2013, Mimi Zohar wrote: > > > On Mon, 2013-11-25 at 09:44 +1100, James Morris wrote: > > > On Sun, 24 Nov 2013, Mimi Zohar wrote: > > > > > > > Hi James, > > > > > > > > Linus has already reverted the trusted keyring support for IMA patches. > > > > These patches are re-based on -rc1. > > > > > > > > The following changes since commit > > > > 4c1cc40a2d49500d84038ff751bc6cd183e729b5: > > > > > > > > Revert "KEYS: verify a certificate is signed by a 'trusted' key" > > > > (2013-11-23 16:38:17 -0800) > > > > > > > > are available in the git repository at: > > > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity > > > > for-linus > > > > > > > > for you to fetch changes up to 3eeb2d63ab623be55bb2ff584e123c0df45691e3: > > > > > > > > ima: make a copy of template_fmt in template_desc_init_fields() > > > > (2013-11-24 00:29:23 -0500) > > > > > > > > > > I don't understand -- are these all fixes for regressions in the new > > > kernel? > > > > Yes, mostly. There's one code cleanup, that could be deferred and a > > documentation update. > > Can we leave documentation and code cleanups to the next cycle and only > include essential fixes for regressions at this stage? Ok, all of the patches are needed and need to be upstreamed. I assume all of the ones that fix backwards compatibility issues would be termed "essential fixes for regressions". 47a20c2 ima: do not include field length in template digest calc for ima templat 4c8f4bb ima: do not send field length to userspace for digest of ima template 3eeb2d6 ima: make a copy of template_fmt in template_desc_init_fields() Could the remaining patches be marked for -stable? > Also, please identify which upstream commits specifically are fixed by > each patch. 
Ok thanks, Mimi
Re: [PATCH] Cpufreq: Change sysfs interface cpuinfo_cur_freq access privilege
On Monday, November 25, 2013 07:55:33 PM Lan Tianyu wrote: > On 11/25/2013 07:26 PM, Rafael J. Wysocki wrote: > > On Monday, November 25, 2013 01:33:39 PM Lan Tianyu wrote: > >> On 2013年11月25日 12:30, Viresh Kumar wrote: > >>> On 25 November 2013 08:23, Lan Tianyu wrote: > Currently, cpuinfo_cur_freq is only accessible for root user while > other cpufreq sysfs interfaces(E,G scaling_cur_freq) are available > to ordinary user. This seems make no sense. This patch is to change > it. > >>> > >>> There is nothing wrong with the code and so this is more of a design > >>> change.. > >>> > >>> Probably Rafael can help us here as cpufreq_cur_freq will read stuff > >>> directly from hardware instead of using cached value in software. > >> > >> I think so, too. I also tried to checking the reason of the privilege by > >> git log but the code was there before linux kernel being migrated to git > >> repository. > > > > And it has always behaved in the same way? Then I wouldn't change it. > > > > It has been there since 2.6.12-rc2 or more early. But the > cpuinfo_cur_freq is read-only and seems no harmful. If it reads things directly from hardware, it may not be totally neutral. Thanks! -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] rtc: add hym8563 rtc-driver
[...]

> +static int hym8563_probe(struct i2c_client *client,
> +			 const struct i2c_device_id *id)
> +{
> +	struct hym8563 *hym8563;
> +	int ret, gpio_int;
> +
> +	hym8563 = devm_kzalloc(&client->dev, sizeof(hym8563), GFP_KERNEL);
> +	if (!hym8563)
> +		return -ENOMEM;
> +
> +	hym8563->client = client;
> +	i2c_set_clientdata(client, hym8563);
> +
> +	device_set_wakeup_capable(&client->dev, true);
> +
> +	gpio_int = of_get_gpio(client->dev.of_node, 0);
> +	if (!gpio_is_valid(gpio_int)) {
> +		dev_err(&client->dev, "failed to get interrupt gpio\n");
> +		return -EINVAL;
> +	}
> +
> +	ret = devm_gpio_request_one(&client->dev, gpio_int,
> +				    GPIOF_DIR_IN, "hym8563_int");
> +	if (ret) {
> +		dev_err(&client->dev, "request of gpio %d failed, %d\n",
> +			gpio_int, ret);
> +		return ret;
> +	}

From here on the gpio is never used or even stashed away anywhere.
What's the point in requesting it and then leaking it?

Thanks,
Mark.
Re: [PATCH 4/5] futex: Avoid taking hb lock if nothing to wakeup
On Sat, 23 Nov 2013, Davidlohr Bueso wrote:
> On Sat, 2013-11-23 at 19:46 -0800, Linus Torvalds wrote:
> > On Sat, Nov 23, 2013 at 5:16 AM, Thomas Gleixner wrote:
> > >
> > > Now the question is why we queue the waiter _AFTER_ reading the user
> > > space value. The comment in the code is pretty nonsensical:
> > >
> > >    * On the other hand, we insert q and release the hash-bucket only
> > >    * after testing *uaddr. This guarantees that futex_wait() will NOT
> > >    * absorb a wakeup if *uaddr does not match the desired values
> > >    * while the syscall executes.
> > >
> > > There is no reason why we cannot queue _BEFORE_ reading the user space
> > > value. We just have to dequeue in all the error handling cases, but
> > > for the fast path it does not matter at all.
> > >
> > > CPU 0                           CPU 1
> > >
> > >     val = *futex;
> > >     futex_wait(futex, val);
> > >
> > >     spin_lock(&hb->lock);
> > >
> > >     plist_add(hb, self);
> > >     smp_wmb();
> > >
> > >     uval = *futex;
> > >                                     *futex = newval;
> > >                                     futex_wake();
> > >
> > >                                     smp_rmb();
> > >                                     if (plist_empty(hb))
> > >                                        return;
> > >     ...
> >
> > This would seem to be a nicer approach indeed, without needing the
> > extra atomics.
>
> Yep, I think we can all agree that doing this optimization without atomic
> ops is a big plus.
>
> >
> > Davidlohr, mind trying Thomas' approach?
>
> I just took a quick look and it seems pretty straightforward, but not
> without some details to consider. We basically have to redo/reorder
> futex_wait_setup(), which checks that uval == val, and
> futex_wait_queue_me(), which adds the task to the list and blocks. Now,
> both futex_wait() and futex_wait_requeue_pi() have this logic, but since
> we don't use futex_wake() to wake up tasks on pi futex_qs, I believe it's
> ok to only change futex_wait(), while the order of the uval checking
> doesn't matter for futex_wait_requeue_pi() so it can stay as is.

There is no mechanism which prevents a futex_wake() call on the inner
futex of the wait_requeue_pi mechanism. So no, we have to change both.

Futexes are no place for belief. Either you understand them completely
or you just leave them alone.

Thanks,

	tglx
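The "queue before reading the value" ordering discussed above can be modelled in user space with C11 atomics. This is a rough sketch under stated assumptions, not the kernel implementation: `struct bucket_model`, `wait_prepare()` and `wake_needed()` are hypothetical names, a plain counter stands in for the plist, and both sides use full sequentially consistent fences. That fence choice is a deliberately conservative substitution for the smp_wmb()/smp_rmb() pair shown in the diagram, not a claim about which barriers the kernel needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one hash bucket; not the kernel's futex_hash_bucket. */
struct bucket_model {
	atomic_int waiters;	/* stands in for "the plist is non-empty" */
};

/*
 * Waiter side: publish ourselves first, then read the futex word.
 * Returns true if the caller should go to sleep.
 */
static bool wait_prepare(struct bucket_model *hb, atomic_int *futex, int expected)
{
	atomic_fetch_add(&hb->waiters, 1);		/* models plist_add() */
	atomic_thread_fence(memory_order_seq_cst);	/* enqueue before the read */
	if (atomic_load(futex) != expected) {
		atomic_fetch_sub(&hb->waiters, 1);	/* dequeue on the error path */
		return false;
	}
	return true;
}

/*
 * Waker side: store the new value first, then look for waiters.
 * Returns true if somebody is queued, i.e. the bucket lock must be taken.
 */
static bool wake_needed(struct bucket_model *hb, atomic_int *futex, int newval)
{
	atomic_store(futex, newval);
	atomic_thread_fence(memory_order_seq_cst);	/* store before the check */
	return atomic_load(&hb->waiters) > 0;
}
```

The point of the ordering is that either the waiter observes the new value and backs out, or the waker observes the queued waiter and takes the slow path; the waker can skip the lock only when nobody is queued.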
Re: [RFC PATCH 0/4] sched: remove cpu_load decay.
On 11/25/2013 04:36 PM, Daniel Lezcano wrote: > On 11/25/2013 01:58 AM, Alex Shi wrote: >> On 11/22/2013 08:13 PM, Daniel Lezcano wrote: >>> >>> Hi Alex, >>> >>> I tried on my Xeon server (2 x 4 cores) your patchset and got the >>> following result: >>> >>> kernel a5d6e63323fe7799eb0e6 / + patchset >>> >>> hackbench -T -s 4096 -l 1000 -g 10 -f 40 >>>27.604 38.556 >> >> Hi Daniel, would you like give the detailed server info? 2 socket * 4 >> cores, sounds it isn't a modern machine. > > Well it has several years old now, that's true but still competing with > some recent processors :) > > Bi-Xeon E5345 2.33GHz / 8Mb L2 cache / 7BG FB-DIMM Memory 667 MHz / > 300GB SSD 3Gb/s > > It is a core2 CPU, quite old. Fengguang, do you include similar box in your system? -- Thanks Alex -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/6] watchdog: davinci: change driver to use WDT core
On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
> @@ -211,29 +129,34 @@ static int davinci_wdt_probe(struct platform_device *pdev)
>
> 	clk_prepare_enable(wdt_clk);
>
> -	if (heartbeat < 1 || heartbeat > MAX_HEARTBEAT)
> -		heartbeat = DEFAULT_HEARTBEAT;
> +	wdd = _wdd;
> +	wdd->info = _wdt_info;
> +	wdd->ops= _wdt_ops;
> +	wdd->min_timeout= 1;
> +	wdd->max_timeout= MAX_HEARTBEAT;

Some checkpatch warnings. Please fix.

WARNING: please, no space before tabs
#273: FILE: drivers/watchdog/davinci_wdt.c:135:
+^Iwdd->min_timeout ^I= 1;$

WARNING: please, no space before tabs
#274: FILE: drivers/watchdog/davinci_wdt.c:136:
+^Iwdd->max_timeout ^I= MAX_HEARTBEAT;$

total: 0 errors, 2 warnings, 0 checks, 249 lines checked

Thanks,
sekhar
Re: [PATCH] Cpufreq: Change sysfs interface cpuinfo_cur_freq access privilege
On 11/25/2013 07:26 PM, Rafael J. Wysocki wrote: On Monday, November 25, 2013 01:33:39 PM Lan Tianyu wrote: On 2013年11月25日 12:30, Viresh Kumar wrote: On 25 November 2013 08:23, Lan Tianyu wrote: Currently, cpuinfo_cur_freq is only accessible for root user while other cpufreq sysfs interfaces(E,G scaling_cur_freq) are available to ordinary user. This seems make no sense. This patch is to change it. There is nothing wrong with the code and so this is more of a design change.. Probably Rafael can help us here as cpufreq_cur_freq will read stuff directly from hardware instead of using cached value in software. I think so, too. I also tried to checking the reason of the privilege by git log but the code was there before linux kernel being migrated to git repository. And it has always behaved in the same way? Then I wouldn't change it. It has been there since 2.6.12-rc2 or more early. But the cpuinfo_cur_freq is read-only and seems no harmful. Request from bug 65611. https://bugzilla.kernel.org/show_bug.cgi?id=65611. Thanks! -- Best Regards Tianyu Lan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Make the mtdblock read/write skip the bad nand sector
On Mon, Nov 25, 2013 at 07:30:33PM +0800, Hans Zhang wrote:
> On 2013/11/25 18:23, Richard Genoud wrote:
> >
> > Well, yes, write through the char device would be a solution.
> >> But, *why* are you writing through mtdblock instead?
> >>
> >>> I think that maybe it's an optional approach through mtdblock in case
> >>> we do not have the mtd-tools in our environments, we do provide a
> >>> simpler way to write the NAND through mtdblock.
> >>>
> >> Uh? simpler? Writing through mtdchar is as simple as it gets:
> >>
> >> $ cat some_file.img > /dev/mtd0
> >>
> >> Sorry, but I'm still confused at what are you trying to accomplish.
> > I think that what Hans wants to do is:
> > $ cat some_file.img > /dev/mtd0
> > And that doesn't fail on a bad block but jumps over it.
> > ... Which is a bad idea.
> > But, like you, I didn't figure out why mtdblock instead of mtdchar.
> >
>
> I'm sorry it's my mistake, I thought the NAND need to be erased explicitly
> in userspace before written when through the mtdchar device. That's why I
> use the mtdblock instead of mtdchar.
>

Your understanding is correct: NAND *must* be erased explicitly in userspace
before writing. However, keep in mind the following additional constraints:

* Writing should always be performed using 'nandwrite', not tools such as
  'cat' or 'dd'.

* An mtdblock shouldn't be used to access the NAND directly from userspace.
  AFAICS, the primary usage of mtdblock is to be able to mount JFFS2.

Out of curiosity, what's your NAND layout? What FS are you using? Unless you
have some special requirement, you should be using UBI to access the device
(and not MTD). Just a suggestion...

-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com
Re: [PATCH 1/2] dt-bindings: add hym8563 binding
On Fri, Nov 22, 2013 at 09:55:03PM +, Heiko Stübner wrote:
> Add binding documentation for the hym8563 rtc chip.
>
> Signed-off-by: Heiko Stuebner
> ---
>  .../devicetree/bindings/rtc/haoyu,hym8563.txt | 29
>  1 file changed, 29 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt
>
> diff --git a/Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt b/Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt
> new file mode 100644
> index 000..2743416
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt
> @@ -0,0 +1,29 @@
> +Haoyu Microelectronics HYM8563 Real Time Clock
> +
> +The HYM8563 provides basic rtc and alarm functionality
> +as well as a clock output of up to 32kHz.
> +
> +Required properties:
> +- compatible: should be: "haoyu,hym8563"

The "haoyu" vendor prefix will need to be documented (I couldn't spot it
in mainline's vendor-prefixes.txt).

> +- reg: i2c address
> +- gpios: interrupt gpio

What's this used for exactly?

Thanks,
Mark.
Re: [PATCH net] macvtap: fix tx_dropped counting error
On Mon, Nov 25, 2013 at 05:19:04PM +0800, Jason Wang wrote:
> After commit 8ffab51b3dfc54876f145f15b351c41f3f703195
> (macvlan: lockless tx path), the tx stat counters were converted to a percpu
> stat structure. So we need to use this for tx_dropped in macvtap as well.
> Otherwise, management won't notice the dropped packets in the macvtap tx path.
>
> Cc: Michael S. Tsirkin
> Cc: Vlad Yasevich
> Cc: Eric Dumazet
> Signed-off-by: Jason Wang

Acked-by: Michael S. Tsirkin

> ---
>  drivers/net/macvtap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index dc76670..0605da8 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -744,7 +744,7 @@ err:
>  	rcu_read_lock();
>  	vlan = rcu_dereference(q->vlan);
>  	if (vlan)
> -		vlan->dev->stats.tx_dropped++;
> +		this_cpu_inc(vlan->pcpu_stats->tx_dropped);
>  	rcu_read_unlock();
>
>  	return err;
> --
> 1.8.3.2
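Why a drop counted in the shared device counter goes unnoticed can be shown with a small user-space model. The sketch below is illustrative only: the struct names, `MODEL_NR_CPUS`, `drop_on_cpu()` and `read_tx_dropped()` are hypothetical, and it assumes, as the commit message implies, that the statistics reader only folds the per-CPU slots.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical fixed CPU count for the model */

struct pcpu_stats_model {
	unsigned long tx_dropped;
};

struct vlan_model {
	unsigned long dev_tx_dropped;				/* shared device counter */
	struct pcpu_stats_model pcpu_stats[MODEL_NR_CPUS];	/* one slot per CPU */
};

/* Models this_cpu_inc(): bump only the calling CPU's slot. */
static void drop_on_cpu(struct vlan_model *vlan, size_t cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	vlan->pcpu_stats[cpu].tx_dropped++;
}

/* Models a stats reader that sums the per-CPU slots and nothing else. */
static unsigned long read_tx_dropped(const struct vlan_model *vlan)
{
	unsigned long sum = 0;

	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += vlan->pcpu_stats[cpu].tx_dropped;
	return sum;
}
```

In this model an increment of `dev_tx_dropped` never shows up in `read_tx_dropped()`, whereas increments made through `drop_on_cpu()` on any CPU are all included in the sum.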