[PATCH 0/2] Tegra124 clock fixes

2013-11-25 Thread Peter De Schrijver
A few fixes for Tegra124 clocks. These patches will be squashed into the pull
request I will send out later today, but I wanted to post them here as well
so the changes are known.

Peter De Schrijver (2):
  clk: tegra: fix vi clk for Tegra124
  clk: tegra: fix pllcx pdiv for Tegra124

 drivers/clk/tegra/clk-id.h           |    1 +
 drivers/clk/tegra/clk-tegra-periph.c |    8 ++++++++
 drivers/clk/tegra/clk-tegra124.c     |    5 ++++-
 3 files changed, 13 insertions(+), 1 deletions(-)

-- 
1.7.7.rc0.72.g4b5ea.dirty

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3.5 31/78] Staging: bcm: info leak in ioctl

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Dan Carpenter 

commit 8d1e72250c847fa96498ec029891de4dc638a5ba upstream.

The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel
information to user space.

Reported-by: Nico Golde 
Reported-by: Fabian Yamaguchi 
Signed-off-by: Dan Carpenter 
Signed-off-by: Linus Torvalds 
Signed-off-by: Luis Henriques 
---
 drivers/staging/bcm/Bcmchar.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/bcm/Bcmchar.c b/drivers/staging/bcm/Bcmchar.c
index cf30592..c0d612f 100644
--- a/drivers/staging/bcm/Bcmchar.c
+++ b/drivers/staging/bcm/Bcmchar.c
@@ -1957,6 +1957,7 @@ cntrlEnd:
 
BCM_DEBUG_PRINT(Adapter, DBG_TYPE_OTHERS, OSAL_DBG, 
DBG_LVL_ALL, "Called IOCTL_BCM_GET_DEVICE_DRIVER_INFO\n");
 
+   memset(&DevInfo, 0, sizeof(DevInfo));
DevInfo.MaxRDMBufferSize = BUFFER_4K;
DevInfo.u32DSDStartOffset = EEPROM_CALPARAM_START;
DevInfo.u32RxAlignmentCorrection = 0;
-- 
1.8.3.2
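
For readers following along, the leak can be illustrated outside the kernel. The following is a minimal user-space sketch, not the driver code: struct dev_info, its field names, the constant values, and both helper functions are hypothetical stand-ins for the driver's DevInfo handling. It only shows that clearing the whole struct first leaves no stale bytes in a reserved array that is never assigned field by field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's device-info struct. */
struct dev_info {
    uint32_t max_rdm_buffer_size;
    uint32_t dsd_start_offset;
    uint32_t rx_alignment_correction;
    uint32_t reserved[15];      /* never assigned field by field */
};

/*
 * Fill *out the way the fixed ioctl path does: clear everything
 * first, then assign the fields that carry real values.
 */
void fill_dev_info(struct dev_info *out)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    out->max_rdm_buffer_size = 4096;    /* assumed value */
    out->dsd_start_offset = 0x200;      /* assumed value */
    out->rx_alignment_correction = 0;
}

/* Return 1 if every reserved word is zero, 0 otherwise. */
int reserved_is_clean(const struct dev_info *info)
{
    size_t i;

    for (i = 0; i < sizeof(info->reserved) / sizeof(info->reserved[0]); i++)
        if (info->reserved[i] != 0)
            return 0;
    return 1;
}
```

Without the memset() call, whatever was previously in that memory would be copied out along with the assigned fields.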



[PATCH 3.5 41/78] usb: Disable USB 2.0 Link PM before device reset.

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Sarah Sharp 

commit dcc01c0864823f91c3bf3ffca6613e2351702b87 upstream.

Before the USB core resets a device, we need to disable the L1 timeout
for the roothub, if USB 2.0 Link PM is enabled.  Otherwise the port may
transition into L1 in between descriptor fetches, before we know if the
USB device descriptors changed.  LPM will be re-enabled after the
full device descriptors are fetched, and we can confirm the device still
supports USB 2.0 LPM after the reset.

We don't need to wait for the USB device to exit L1 before resetting the
device, since the xHCI roothub port diagrams show a transition to the
Reset state from any of the Ux states (see Figure 34 in the 2012-08-14
xHCI specification update).

This patch should be backported to kernels as old as 3.2, that contain
the commit 65580b4321eb36f16ae8b5987bfa1bb948fc5112 "xHCI: set USB2
hardware LPM".  That was the first commit to enable USB 2.0
hardware-driven Link Power Management.

Signed-off-by: Sarah Sharp 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 drivers/usb/core/hub.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 7be4e11..86c7421 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -4824,6 +4824,12 @@ static int usb_reset_and_verify_device(struct usb_device 
*udev)
}
parent_hub = hdev_to_hub(parent_hdev);
 
+   /* Disable USB2 hardware LPM.
+* It will be re-enabled by the enumeration process.
+*/
+   if (udev->usb2_hw_lpm_enabled == 1)
+   usb_set_usb2_hardware_lpm(udev, 0);
+
/* Disable LPM while we reset the device and reinstall the alt settings.
 * Device-initiated LPM settings, and system exit latency settings are
 * cleared when the device is reset, so we have to set them up again.
-- 
1.8.3.2



[PATCH 2/2] clk: tegra: fix pllcx pdiv for Tegra124

2013-11-25 Thread Peter De Schrijver
The post divider field for pllcx on Tegra124 has some more allowed values than
the one on Tegra114. Fix the code to reflect this.

Signed-off-by: Peter De Schrijver 
---
 drivers/clk/tegra/clk-tegra124.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/clk/tegra/clk-tegra124.c b/drivers/clk/tegra/clk-tegra124.c
index 54af043..863c38b 100644
--- a/drivers/clk/tegra/clk-tegra124.c
+++ b/drivers/clk/tegra/clk-tegra124.c
@@ -263,8 +263,11 @@ static struct div_nmp pllcx_nmp = {
 static struct pdiv_map pllc_p[] = {
{ .pdiv = 1, .hw_val = 0 },
{ .pdiv = 2, .hw_val = 1 },
+   { .pdiv = 3, .hw_val = 2 },
{ .pdiv = 4, .hw_val = 3 },
+   { .pdiv = 6, .hw_val = 4 },
{ .pdiv = 8, .hw_val = 5 },
+   { .pdiv = 12, .hw_val = 6 },
{ .pdiv = 16, .hw_val = 7 },
{ .pdiv = 0, .hw_val = 0 },
 };
-- 
1.7.7.rc0.72.g4b5ea.dirty
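
As an illustration of how such a table is typically consumed, here is a small stand-alone sketch. The table contents are copied from the patch above; the struct definition and the lookup helper pdiv_to_hw() are assumptions made for this example and are not the Tegra clock driver's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout; an entry with .pdiv == 0 terminates the table. */
struct pdiv_map {
    unsigned int pdiv;
    unsigned int hw_val;
};

/* Post divider table as it looks after the patch. */
const struct pdiv_map pllc_p[] = {
    { .pdiv = 1,  .hw_val = 0 },
    { .pdiv = 2,  .hw_val = 1 },
    { .pdiv = 3,  .hw_val = 2 },
    { .pdiv = 4,  .hw_val = 3 },
    { .pdiv = 6,  .hw_val = 4 },
    { .pdiv = 8,  .hw_val = 5 },
    { .pdiv = 12, .hw_val = 6 },
    { .pdiv = 16, .hw_val = 7 },
    { .pdiv = 0,  .hw_val = 0 },
};

/*
 * Hypothetical helper: return the register value for a divider,
 * or -1 if the divider is not supported.
 */
int pdiv_to_hw(const struct pdiv_map *map, unsigned int pdiv)
{
    assert(map != NULL);
    for (; map->pdiv != 0; map++)
        if (map->pdiv == pdiv)
            return (int)map->hw_val;
    return -1;
}
```

With the three new entries, dividers 3, 6 and 12 resolve to a register value instead of being rejected.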



[PATCH 3.5 40/78] USB: mos7840: fix tiocmget error handling

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Johan Hovold 

commit a91ccd26e75235d86248d018fe3779732bcafd8d upstream.

Make sure to return errors from tiocmget rather than rely on
uninitialised stack data.

Signed-off-by: Johan Hovold 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Luis Henriques 
---
 drivers/usb/serial/mos7840.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/serial/mos7840.c b/drivers/usb/serial/mos7840.c
index d9368be..08aad01 100644
--- a/drivers/usb/serial/mos7840.c
+++ b/drivers/usb/serial/mos7840.c
@@ -1707,7 +1707,11 @@ static int mos7840_tiocmget(struct tty_struct *tty)
return -ENODEV;
 
status = mos7840_get_uart_reg(port, MODEM_STATUS_REGISTER, &msr);
+   if (status != 1)
+   return -EIO;
status = mos7840_get_uart_reg(port, MODEM_CONTROL_REGISTER, &mcr);
+   if (status != 1)
+   return -EIO;
result = ((mcr & MCR_DTR) ? TIOCM_DTR : 0)
| ((mcr & MCR_RTS) ? TIOCM_RTS : 0)
| ((mcr & MCR_LOOPBACK) ? TIOCM_LOOP : 0)
-- 
1.8.3.2



Re: [PATCH v2 0/6] Update Davinci watchdog driver

2013-11-25 Thread ivan.khoronzhuk

On 11/25/2013 03:06 PM, Sekhar Nori wrote:
> On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
>> These patches are intended to update Davinci watchdog to use WDT core
>> and reuse driver for keystone arch, because Keystone uses the similar
>> IP like Davinci.
>
> This series causes a regression on all DaVinci platforms because after
> the series is applied the watchdog device does not get registered at
> all. Since you changed the device name please include a patch to fix the
> platform references too.
>
> Thanks,
> Sekhar

Ok, I will replace

--
Regards,
Ivan Khoronzhuk


[PATCH 3.5 43/78] rt2400pci: fix RSSI read

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Stanislaw Gruszka 

commit 2bf127a5cc372b9319afcbae10b090663b621c8b upstream.

RSSI value is provided on word3 not on word2.

Signed-off-by: Stanislaw Gruszka 
Signed-off-by: John W. Linville 
Signed-off-by: Luis Henriques 
---
 drivers/net/wireless/rt2x00/rt2400pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/rt2x00/rt2400pci.c 
b/drivers/net/wireless/rt2x00/rt2400pci.c
index d8594a2..dd2160c 100644
--- a/drivers/net/wireless/rt2x00/rt2400pci.c
+++ b/drivers/net/wireless/rt2x00/rt2400pci.c
@@ -1253,7 +1253,7 @@ static void rt2400pci_fill_rxdone(struct queue_entry 
*entry,
 */
rxdesc->timestamp = ((u64)rx_high << 32) | rx_low;
rxdesc->signal = rt2x00_get_field32(word2, RXD_W2_SIGNAL) & ~0x08;
-   rxdesc->rssi = rt2x00_get_field32(word2, RXD_W3_RSSI) -
+   rxdesc->rssi = rt2x00_get_field32(word3, RXD_W3_RSSI) -
entry->queue->rt2x00dev->rssi_offset;
rxdesc->size = rt2x00_get_field32(word0, RXD_W0_DATABYTE_COUNT);
 
-- 
1.8.3.2
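
To make the nature of the bug concrete, here is a stand-alone sketch of mask-based field extraction. The helper below is written for this example and is not the rt2x00 implementation; the mask value W3_RSSI_MASK is a hypothetical layout. It shows why applying a word3 mask to word2 silently yields an unrelated value rather than an error.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: RSSI in the low byte of descriptor word 3. */
#define W3_RSSI_MASK 0x000000ffu

/* Extract the field selected by a contiguous, non-zero bit mask. */
uint32_t get_field32(uint32_t word, uint32_t mask)
{
    unsigned int shift = 0;

    assert(mask != 0);
    while (((mask >> shift) & 1u) == 0)
        shift++;
    return (word & mask) >> shift;
}
```

Because the mask carries no information about which descriptor word it belongs to, nothing in the extraction itself can catch the mix-up; only the caller can pass the right word.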



[PATCH 3.5 22/78] perf: Fix perf ring buffer memory ordering

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Peter Zijlstra 

commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky 
Tested-by: Victor Kaplansky 
Signed-off-by: Peter Zijlstra 
Cc: Mathieu Desnoyers 
Cc: mich...@ellerman.id.au
Cc: Paul McKenney 
Cc: Michael Neuling 
Cc: Frederic Weisbecker 
Cc: an...@samba.org
Cc: b...@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.gg19...@laptop.lan
Signed-off-by: Ingo Molnar 
[ luis: backported to 3.5:
  - file rename: include/uapi/linux/perf_event.h -> include/linux/perf_event.h ]
Signed-off-by: Luis Henriques 
---
 include/linux/perf_event.h  | 12 +++-
 kernel/events/ring_buffer.c | 31 +++
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3faf0d4..7e72637 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -393,13 +393,15 @@ struct perf_event_mmap_page {
/*
 * Control data for the mmap() data buffer.
 *
-* User-space reading the @data_head value should issue an rmb(), on
-* SMP capable platforms, after reading this value -- see
-* perf_event_wakeup().
+* User-space reading the @data_head value should issue an smp_rmb(),
+* after reading this value.
 *
 * When the mapping is PROT_WRITE the @data_tail value should be
-* written by userspace to reflect the last read data. In this case
-* the kernel will not over-write unread data.
+* written by userspace to reflect the last read data, after issueing
+* an smp_mb() to separate the data read from the ->data_tail store.
+* In this case the kernel will not over-write unread data.
+*
+* See perf_output_put_handle() for the data ordering.
 */
__u64   data_head;  /* head in the data section */
__u64   data_tail;  /* user-space written tail */
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 6ddaba4..4636ecc 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -75,10 +75,31 @@ again:
goto out;
 
/*
-* Publish the known good head. Rely on the full barrier implied
-* by atomic_dec_and_test() order the rb->head read and this
-* write.
+* Since the mmap() consumer (userspace) can run on a different CPU:
+*
+*   kernel                  user
+*
+*   READ ->data_tail        READ ->data_head
+*   smp_mb()    (A)         smp_rmb()   (C)
+*   WRITE $data             READ $data
+*   smp_wmb()   (B)         smp_mb()    (D)
+*   STORE ->data_head       WRITE ->data_tail
+*
+* Where A pairs with D, and B pairs with C.
+*
+* I don't think A needs to be a full barrier because we won't in fact
+* write data until we see the store from userspace. So we simply don't
+* issue the data WRITE until we observe it. Be conservative for now.
+*
+* OTOH, D needs to be a full barrier since it separates the data READ
+* from the tail WRITE.
+*
+* For B a WMB is sufficient since it separates two WRITEs, and for C
+* an RMB is sufficient since it separates two READs.
+*
+* See perf_output_begin().
 */
+   smp_wmb();
rb->user_page->data_head = head;
 
/*
@@ -142,9 +163,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 * Userspace could choose to issue a mb() before updating the
 * tail pointer. So that all reads will be completed before the
 * write is issued.
+*
+* See perf_output_put_handle().
 */
tail = ACCESS_ONCE(rb->user_page->data_tail);
-   smp_rmb();
+   smp_mb();
offset = head = local_read(&rb->head);
head += size;
if (unlikely(!perf_output_space(rb, tail, offset, head)))
-- 
1.8.3.2
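
The pairing described in the new comment can be sketched in portable C11. This is an approximation under stated assumptions, not the perf code and not what any perf tool necessarily does: struct ring, RING_SIZE and the three functions are hypothetical, and C11 acquire/release operations are used here in place of the kernel's smp_mb()/smp_rmb()/smp_wmb(). The comments mark where barriers A to D from the patch would sit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u   /* assumed capacity, must be a power of two */

struct ring {
    _Atomic uint64_t data_head;     /* advanced by the producer */
    _Atomic uint64_t data_tail;     /* advanced by the consumer */
    unsigned char data[RING_SIZE];
};

void ring_init(struct ring *rb)
{
    assert(rb != NULL);
    atomic_init(&rb->data_head, 0);
    atomic_init(&rb->data_tail, 0);
}

/* Producer (the kernel's role): returns the number of bytes stored. */
size_t ring_produce(struct ring *rb, const unsigned char *src, size_t len)
{
    /* READ ->data_tail, then barrier (A). */
    uint64_t tail = atomic_load_explicit(&rb->data_tail, memory_order_acquire);
    uint64_t head = atomic_load_explicit(&rb->data_head, memory_order_relaxed);
    size_t n = 0;

    while (n < len && head - tail < RING_SIZE)
        rb->data[head++ % RING_SIZE] = src[n++];    /* WRITE $data */

    /* Barrier (B), then STORE ->data_head. */
    atomic_store_explicit(&rb->data_head, head, memory_order_release);
    return n;
}

/* Consumer (user space's role): returns the number of bytes copied. */
size_t ring_consume(struct ring *rb, unsigned char *dst, size_t len)
{
    /* READ ->data_head, then barrier (C). */
    uint64_t head = atomic_load_explicit(&rb->data_head, memory_order_acquire);
    uint64_t tail = atomic_load_explicit(&rb->data_tail, memory_order_relaxed);
    size_t n = 0;

    while (n < len && tail != head)
        dst[n++] = rb->data[tail++ % RING_SIZE];    /* READ $data */

    /* Barrier (D), then WRITE ->data_tail. */
    atomic_store_explicit(&rb->data_tail, tail, memory_order_release);
    return n;
}
```

The producer never overwrites bytes the consumer has not yet released through data_tail, which mirrors the "kernel will not over-write unread data" rule in the header comment.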



[PATCH 3.5 44/78] rt2x00: check if device is still available on rt2x00mac_flush()

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Stanislaw Gruszka 

commit 5671ab05cf2a579218985ef56595387932d78ee4 upstream.

Fix a random kernel panic, with the messages below, when removing the dongle.

[ 2212.355447] BUG: unable to handle kernel NULL pointer dereference at 
0250
[ 2212.355527] IP: [] rt2x00usb_kick_tx_entry+0x12/0x160 
[rt2x00usb]
[ 2212.355599] PGD 0
[ 2212.355626] Oops:  [#1] SMP
[ 2212.355664] Modules linked in: rt2800usb rt2x00usb rt2800lib crc_ccitt 
rt2x00lib mac80211 cfg80211 tun arc4 fuse rfcomm bnep snd_hda_codec_realtek 
snd_hda_intel snd_hda_codec btusb uvcvideo bluetooth snd_hwdep 
x86_pkg_temp_thermal snd_seq coretemp aesni_intel aes_x86_64 snd_seq_device 
glue_helper snd_pcm ablk_helper videobuf2_vmalloc sdhci_pci videobuf2_memops 
videobuf2_core sdhci videodev mmc_core serio_raw snd_page_alloc microcode 
i2c_i801 snd_timer hid_multitouch thinkpad_acpi lpc_ich mfd_core snd tpm_tis 
wmi tpm tpm_bios soundcore acpi_cpufreq i915 i2c_algo_bit drm_kms_helper drm 
i2c_core video [last unloaded: cfg80211]
[ 2212.356224] CPU: 0 PID: 34 Comm: khubd Not tainted 3.12.0-rc3-wl+ #3
[ 2212.356268] Hardware name: LENOVO 3444CUU/3444CUU, BIOS G6ET93WW (2.53 ) 
02/04/2013
[ 2212.356319] task: 880212f687c0 ti: 880212f66000 task.ti: 
880212f66000
[ 2212.356392] RIP: 0010:[]  [] 
rt2x00usb_kick_tx_entry+0x12/0x160 [rt2x00usb]
[ 2212.356481] RSP: 0018:880212f67750  EFLAGS: 00010202
[ 2212.356519] RAX: 000c RBX: 000c RCX: 0293
[ 2212.356568] RDX: 8801f4dc219a RSI:  RDI: 0240
[ 2212.356617] RBP: 880212f67778 R08: a02667e0 R09: 0002
[ 2212.356665] R10: 0001f95254ab4b40 R11: 880212f675be R12: 8801f4dc2150
[ 2212.356712] R13:  R14: a02667e0 R15: 000d
[ 2212.356761] FS:  () GS:88021e20() 
knlGS:
[ 2212.356813] CS:  0010 DS:  ES:  CR0: 80050033
[ 2212.356852] CR2: 0250 CR3: 01a0c000 CR4: 001407f0
[ 2212.356899] Stack:
[ 2212.356917]  000c 8801f4dc2150  
a02667e0
[ 2212.356980]  000d 880212f677b8 a03a31ad 
8801f4dc219a
[ 2212.357038]  8801f4dc2150  8800b93217a0 
8801f49bc800
[ 2212.357099] Call Trace:
[ 2212.357122]  [] ? rt2x00usb_interrupt_txdone+0x90/0x90 
[rt2x00usb]
[ 2212.357174]  [] rt2x00queue_for_each_entry+0xed/0x170 
[rt2x00lib]
[ 2212.357244]  [] rt2x00usb_kick_queue+0x5c/0x60 [rt2x00usb]
[ 2212.357314]  [] rt2x00queue_flush_queue+0x62/0xa0 
[rt2x00lib]
[ 2212.357386]  [] rt2x00mac_flush+0x30/0x70 [rt2x00lib]
[ 2212.357470]  [] ieee80211_flush_queues+0xbd/0x140 
[mac80211]
[ 2212.357555]  [] ieee80211_set_disassoc+0x2d2/0x3d0 
[mac80211]
[ 2212.357645]  [] ieee80211_mgd_deauth+0x1d3/0x240 [mac80211]
[ 2212.357718]  [] ? try_to_wake_up+0xec/0x290
[ 2212.357788]  [] ieee80211_deauth+0x18/0x20 [mac80211]
[ 2212.357872]  [] cfg80211_mlme_deauth+0x9c/0x140 [cfg80211]
[ 2212.357913]  [] cfg80211_mlme_down+0x5c/0x60 [cfg80211]
[ 2212.357962]  [] cfg80211_disconnect+0x188/0x1a0 [cfg80211]
[ 2212.358014]  [] ? __cfg80211_stop_sched_scan+0x1c/0x130 
[cfg80211]
[ 2212.358067]  [] cfg80211_leave+0xc4/0xe0 [cfg80211]
[ 2212.358124]  [] cfg80211_netdev_notifier_call+0x3ab/0x5e0 
[cfg80211]
[ 2212.358177]  [] ? inetdev_event+0x38/0x510
[ 2212.358217]  [] ? __wake_up+0x44/0x50
[ 2212.358254]  [] notifier_call_chain+0x4c/0x70
[ 2212.358293]  [] raw_notifier_call_chain+0x16/0x20
[ 2212.358361]  [] call_netdevice_notifiers_info+0x35/0x60
[ 2212.358429]  [] __dev_close_many+0x49/0xd0
[ 2212.358487]  [] dev_close_many+0x88/0x100
[ 2212.358546]  [] rollback_registered_many+0xb0/0x220
[ 2212.358612]  [] unregister_netdevice_many+0x19/0x60
[ 2212.358694]  [] ieee80211_remove_interfaces+0x112/0x190 
[mac80211]
[ 2212.358791]  [] ieee80211_unregister_hw+0x4f/0x100 
[mac80211]
[ 2212.361994]  [] rt2x00lib_remove_dev+0x161/0x1a0 
[rt2x00lib]
[ 2212.365240]  [] rt2x00usb_disconnect+0x2e/0x70 [rt2x00usb]
[ 2212.368470]  [] usb_unbind_interface+0x64/0x1c0
[ 2212.371734]  [] __device_release_driver+0x7f/0xf0
[ 2212.374999]  [] device_release_driver+0x23/0x30
[ 2212.378131]  [] bus_remove_device+0x108/0x180
[ 2212.381358]  [] device_del+0x135/0x1d0
[ 2212.384454]  [] usb_disable_device+0xb0/0x270
[ 2212.387451]  [] usb_disconnect+0xad/0x1d0
[ 2212.390294]  [] hub_thread+0x63d/0x1660
[ 2212.393034]  [] ? wake_up_atomic_t+0x30/0x30
[ 2212.395728]  [] ? hub_port_debounce+0x130/0x130
[ 2212.398412]  [] kthread+0xc0/0xd0
[ 2212.401058]  [] ? insert_kthread_work+0x40/0x40
[ 2212.403639]  [] ret_from_fork+0x7c/0xb0
[ 2212.406193]  [] ? insert_kthread_work+0x40/0x40
[ 2212.408732] Code: 24 58 08 00 00 bf 80 00 00 00 e8 3a c3 e0 e0 5b 41 5c 5d 
c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 <48> 8b 
47 10 48 

Re: [PATCH v2 0/6] Update Davinci watchdog driver

2013-11-25 Thread Sekhar Nori
On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
> These patches are intended to update Davinci watchdog to use WDT core
> and reuse driver for keystone arch, because Keystone uses the similar
> IP like Davinci.

This series causes a regression on all DaVinci platforms because after
the series is applied the watchdog device does not get registered at
all. Since you changed the device name please include a patch to fix the
platform references too.

Thanks,
Sekhar



[PATCH 3.5 48/78] USB:add new zte 3g-dongle's pid to option.c

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Rui li 

commit 0636fc507a976cdc40f21bdbcce6f0b98ff1dfe9 upstream.

Signed-off-by: Rui li 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Luis Henriques 
---
 drivers/usb/serial/option.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index c4b313f..dbc6919 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1391,6 +1391,23 @@ static const struct usb_device_id option_ids[] = {
.driver_info = (kernel_ulong_t)_intf2_blacklist },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1426, 0xff, 0xff, 
0xff),  /* ZTE MF91 */
.driver_info = (kernel_ulong_t)_intf2_blacklist },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1533, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1534, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1535, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1545, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1546, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1547, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1565, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1566, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1567, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1589, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1590, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1591, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1592, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1594, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1596, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1598, 0xff, 0xff, 
0xff) },
+   { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1600, 0xff, 0xff, 
0xff) },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x2002, 0xff,
  0xff, 0xff), .driver_info = (kernel_ulong_t)_k3765_z_blacklist },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x2003, 0xff, 0xff, 
0xff) },
-- 
1.8.3.2



[PATCH 3.5 52/78] ALSA: 6fire: Fix probe of multiple cards

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Takashi Iwai 

commit 9b389a8a022110b4bc055a19b888283544d9eba6 upstream.

The probe code of snd-usb-6fire driver overrides the devices[] pointer
wrongly without checking whether it's already occupied or not.  This
would screw up the device disconnection later.

Spotted by coverity CID 141423.

Signed-off-by: Takashi Iwai 
Signed-off-by: Luis Henriques 
---
 sound/usb/6fire/chip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/usb/6fire/chip.c b/sound/usb/6fire/chip.c
index fc8cc82..f803348 100644
--- a/sound/usb/6fire/chip.c
+++ b/sound/usb/6fire/chip.c
@@ -101,7 +101,7 @@ static int __devinit usb6fire_chip_probe(struct 
usb_interface *intf,
usb_set_intfdata(intf, chips[i]);
mutex_unlock(_mutex);
return 0;
-   } else if (regidx < 0)
+   } else if (!devices[i] && regidx < 0)
regidx = i;
}
if (regidx < 0) {
-- 
1.8.3.2
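
The slot selection logic can be tried out in isolation. The sketch below is a simplified model and not the snd-usb-6fire code: MAX_CARDS and claim_slot() are hypothetical, and device identity is reduced to a pointer comparison. It keeps the one-line condition from the patch so the effect of the added !devices[i] test is visible.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CARDS 4     /* assumed table size */

/*
 * Return the slot for dev: its existing slot if already registered,
 * otherwise the first free slot (which is then claimed), or -1 if
 * the table is full.
 */
int claim_slot(const void *devices[MAX_CARDS], const void *dev)
{
    int i, regidx = -1;

    assert(devices != NULL && dev != NULL);
    for (i = 0; i < MAX_CARDS; i++) {
        if (devices[i] == dev)
            return i;               /* already registered */
        else if (!devices[i] && regidx < 0)
            regidx = i;             /* remember first free slot */
    }
    if (regidx >= 0)
        devices[regidx] = dev;
    return regidx;
}
```

If the !devices[i] test is dropped, a second card would pick slot 0 even though the first card still occupies it, which is the overwrite the commit message describes.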



[PATCH 3.5 33/78] lib/scatterlist.c: don't flush_kernel_dcache_page on slab page

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ming Lei 

commit 3d77b50c5874b7e923be946ba793644f82336b75 upstream.

Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper
functions") introduces two sg buffer copy helpers, and calls
flush_kernel_dcache_page() on pages in SG list after these pages are
written to.

Unfortunately, the commit may introduce a potential bug:

 - Before sending some SCSI commands, kmalloc() buffer may be passed to
   block layer, so flush_kernel_dcache_page() can see a slab page
   finally

 - According to cachetlb.txt, flush_kernel_dcache_page() is only called
   on "a user page", which surely can't be a slab page.

 - ARCH's implementation of flush_kernel_dcache_page() may use page
   mapping information to do optimization so page_mapping() will see the
   slab page, then VM_BUG_ON() is triggered.

Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
before calling flush_kernel_dcache_page().

Signed-off-by: Ming Lei 
Reported-by: Aaro Koskinen 
Tested-by: Simon Baatz 
Cc: Russell King - ARM Linux 
Cc: Will Deacon 
Cc: Aaro Koskinen 
Acked-by: Catalin Marinas 
Cc: FUJITA Tomonori 
Cc: Tejun Heo 
Cc: "James E.J. Bottomley" 
Cc: Jens Axboe 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Luis Henriques 
---
 lib/scatterlist.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 6096e89..8c2f278 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -419,7 +419,8 @@ void sg_miter_stop(struct sg_mapping_iter *miter)
if (miter->addr) {
miter->__offset += miter->consumed;
 
-   if (miter->__flags & SG_MITER_TO_SG)
+   if ((miter->__flags & SG_MITER_TO_SG) &&
+   !PageSlab(miter->page))
flush_kernel_dcache_page(miter->page);
 
if (miter->__flags & SG_MITER_ATOMIC) {
-- 
1.8.3.2



Re: [i915] BUG: Bad page state in process Xorg

2013-11-25 Thread thomas
Hi,

It turns out that this seems to be a bug in udl DRM driver.

I bisected the problem to this patch:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/udl?id=5dc9e1e87229cb786a5bb58ddd0d60fee6eb4641

With kind regards
Thomas

On 22.11.2013 17:18, Daniel Vetter wrote:
>
> On Fri, Nov 22, 2013 at 4:54 PM, Thomas Meyer wrote:
> >> On 22.11.2013 at 11:55, Daniel Vetter wrote:
> >>
> >> On Fri, Nov 22, 2013 at 11:36 AM, Dave Airlie wrote:
> >>>> Hi,
> >>> 
> >>> cc'ing mailing list, 
> >>> 
> >>> Daniel any ideas? 
> >> 
> >> Nope, not really :( And no ideas how to triage this further - if it 
> >> takes 9 days to hit it eventually we'll have a real hard time. Or does 
> >> this happen even after just a short X run? 
> > 
> > Seems to happen every time while stopping the x server. Also after a short 
> > run time. 
> > 
> > The current fedora 3.11 kernel doesn't show this bug. I'm using fedora 19, 
> > with a self compiled kernel. 
> > 
> > I did turn on config-debug-pagealloc but this didn't show any wrongness. 
>
> In that case I think the bisect is the fastest way to insight - atm 
> I'm really at loss what could be wrong here. 
> -Daniel 
> -- 
> Daniel Vetter 
> Software Engineer, Intel Corporation 
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch 

[PATCH 3.5 50/78] ahci: disabled FBS prior to issuing software reset

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: xiangliang yu 

commit 89dafa20f3daab5b3e0c13d0068a28e8e64e2102 upstream.

Tested with Marvell 88se9125, attached with one port multiplier (5 ports)
and one disk, we will get following boot log messages if using current
code:

  ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
  ata8.15: Port Multiplier 1.2, 0x1b4b:0x9715 r160, 5 ports, feat 0x1/0x1f
  ahci :03:00.0: FBS is enabled
  ata8.00: hard resetting link
  ata8.00: SATA link down (SStatus 0 SControl 330)
  ata8.01: hard resetting link
  ata8.01: SATA link down (SStatus 0 SControl 330)
  ata8.02: hard resetting link
  ata8.02: SATA link down (SStatus 0 SControl 330)
  ata8.03: hard resetting link
  ata8.03: SATA link up 6.0 Gbps (SStatus 133 SControl 133)
  ata8.04: hard resetting link
  ata8.04: failed to resume link (SControl 133)
  ata8.04: failed to read SCR 0 (Emask=0x40)
  ata8.04: failed to read SCR 0 (Emask=0x40)
  ata8.04: failed to read SCR 1 (Emask=0x40)
  ata8.04: failed to read SCR 0 (Emask=0x40)
  ata8.03: native sectors (2) is smaller than sectors (976773168)
  ata8.03: ATA-8: ST3500413AS, JC4B, max UDMA/133
  ata8.03: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
  ata8.03: configured for UDMA/133
  ata8.04: failed to IDENTIFY (I/O error, err_mask=0x100)
  ata8.15: hard resetting link
  ata8.15: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
  ata8.15: Port Multiplier vendor mismatch '0x1b4b' != '0x133'
  ata8.15: PMP revalidation failed (errno=-19)
  ata8.15: hard resetting link
  ata8.15: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
  ata8.15: Port Multiplier vendor mismatch '0x1b4b' != '0x133'
  ata8.15: PMP revalidation failed (errno=-19)
  ata8.15: limiting SATA link speed to 3.0 Gbps
  ata8.15: hard resetting link
  ata8.15: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
  ata8.15: Port Multiplier vendor mismatch '0x1b4b' != '0x133'
  ata8.15: PMP revalidation failed (errno=-19)
  ata8.15: failed to recover PMP after 5 tries, giving up
  ata8.15: Port Multiplier detaching
  ata8.03: disabled
  ata8.00: disabled
  ata8: EH complete

The reason is that current detection code doesn't follow AHCI spec:

First, the port multiplier detection process looks like this:

ahci_hardreset(link, class, deadline)
if (class == ATA_DEV_PMP) {
        sata_pmp_attach(dev)    /* will enable FBS */
        sata_pmp_init_links(ap, nr_ports);
        ata_for_each_link(link, ap, EDGE) {
                sata_std_hardreset(link, class, deadline);
                if (link_is_online)     /* do soft reset */
                        ahci_softreset(link, class, deadline);
        }
}

But, according to chapter 9.3.9 in AHCI spec: Prior to issuing software
reset, software shall clear PxCMD.ST to '0' and then clear PxFBS.EN to
'0'.

The patch tests OK with kernel 3.11.1.

tj: Patch white space contaminated, applied manually with trivial
updates.

Signed-off-by: Xiangliang Yu 
Signed-off-by: Tejun Heo 
Signed-off-by: Luis Henriques 
---
 drivers/ata/libahci.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index 47a1fb8..60f41cd 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -1249,9 +1249,11 @@ int ahci_do_softreset(struct ata_link *link, unsigned 
int *class,
 {
struct ata_port *ap = link->ap;
struct ahci_host_priv *hpriv = ap->host->private_data;
+   struct ahci_port_priv *pp = ap->private_data;
const char *reason = NULL;
unsigned long now, msecs;
struct ata_taskfile tf;
+   bool fbs_disabled = false;
int rc;
 
DPRINTK("ENTER\n");
@@ -1261,6 +1263,16 @@ int ahci_do_softreset(struct ata_link *link, unsigned 
int *class,
if (rc && rc != -EOPNOTSUPP)
ata_link_warn(link, "failed to reset engine (errno=%d)\n", rc);
 
+   /*
+* According to AHCI-1.2 9.3.9: if FBS is enable, software shall
+* clear PxFBS.EN to '0' prior to issuing software reset to devices
+* that is attached to port multiplier.
+*/
+   if (!ata_is_host_link(link) && pp->fbs_enabled) {
+   ahci_disable_fbs(ap);
+   fbs_disabled = true;
+   }
+
ata_tf_init(link->device, &tf);
 
/* issue the first D2H Register FIS */
@@ -1301,6 +1313,10 @@ int ahci_do_softreset(struct ata_link *link, unsigned 
int *class,
} else
*class = ahci_dev_classify(ap);
 
+   /* re-enable FBS if disabled before */
+   if (fbs_disabled)
+   ahci_enable_fbs(ap);
+
DPRINTK("EXIT, class=%u\n", *class);
return 0;
 
-- 
1.8.3.2
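
The disable, reset, re-enable sequence on the success path can be modeled in a few lines. This is a toy model under stated assumptions, not libahci: struct port_model, issue_reset() and softreset_model() are hypothetical names, and the "reset" merely records whether FBS was still on when it ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct port_model {
    bool fbs_enabled;       /* models pp->fbs_enabled */
    bool fbs_during_reset;  /* recorded by the fake reset */
};

/* Fake reset: only notes the FBS state it observed. */
static void issue_reset(struct port_model *p)
{
    p->fbs_during_reset = p->fbs_enabled;
}

/*
 * Model of the patched flow: for links behind a port multiplier,
 * switch FBS off around the software reset and restore it after.
 */
int softreset_model(struct port_model *p, bool behind_pmp)
{
    bool fbs_disabled = false;

    assert(p != NULL);
    if (behind_pmp && p->fbs_enabled) {
        p->fbs_enabled = false;
        fbs_disabled = true;
    }

    issue_reset(p);

    /* re-enable FBS if it was disabled above */
    if (fbs_disabled)
        p->fbs_enabled = true;
    return 0;
}
```

The local fbs_disabled flag matters: it ensures FBS is only switched back on when this function was the one that switched it off.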


[PATCH 3.5 49/78] libata: Fix display of sata speed

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Gwendal Grignou 

commit 3e85c3ecbc520751324a191d23bb94873ed01b10 upstream.

6.0 Gbps link speed was not decoded properly:
speed was reported at 3.0 Gbps only.

Tested: On a machine where libata reports 6.0 Gbps in
/var/log/messages:
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Before:
cat /sys/class/ata_link/link1/sata_spd
3.0 Gbps
After:
cat /sys/class/ata_link/link1/sata_spd
6.0 Gbps

Signed-off-by: Gwendal Grignou 
Signed-off-by: Tejun Heo 
Signed-off-by: Luis Henriques 
---
 drivers/ata/libata-transport.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c
index c341904..9215677 100644
--- a/drivers/ata/libata-transport.c
+++ b/drivers/ata/libata-transport.c
@@ -319,25 +319,25 @@ int ata_tport_add(struct device *parent,
 /*
  * ATA link attributes
  */
+static int noop(int x) { return x; }
 
-
-#define ata_link_show_linkspeed(field) \
+#define ata_link_show_linkspeed(field, format) \
 static ssize_t \
 show_ata_link_##field(struct device *dev,  \
  struct device_attribute *attr, char *buf) \
 {  \
struct ata_link *link = transport_class_to_link(dev);   \
\
-   return sprintf(buf,"%s\n", sata_spd_string(fls(link->field)));  \
+   return sprintf(buf, "%s\n", sata_spd_string(format(link->field))); \
 }
 
-#define ata_link_linkspeed_attr(field) \
-   ata_link_show_linkspeed(field)  \
+#define ata_link_linkspeed_attr(field, format) \
+   ata_link_show_linkspeed(field, format)  \
 static DEVICE_ATTR(field, S_IRUGO, show_ata_link_##field, NULL)
 
-ata_link_linkspeed_attr(hw_sata_spd_limit);
-ata_link_linkspeed_attr(sata_spd_limit);
-ata_link_linkspeed_attr(sata_spd);
+ata_link_linkspeed_attr(hw_sata_spd_limit, fls);
+ata_link_linkspeed_attr(sata_spd_limit, fls);
+ata_link_linkspeed_attr(sata_spd, noop);
 
 
 static DECLARE_TRANSPORT_CLASS(ata_link_class,
-- 
1.8.3.2
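
For readers following the sata_spd fix above, here is a minimal user-space
sketch of why the format argument matters. It assumes, based on the patch,
that the two limit fields are bitmasks (so fls() picks the highest allowed
speed) while sata_spd already holds the speed index. demo_fls(), demo_noop()
and demo_spd_string() are hypothetical stand-ins, not the kernel functions.

```c
#include <stddef.h>

/* Stand-in for fls(): 1-based position of the highest set bit, 0 for 0. */
int demo_fls(unsigned int x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Identity conversion, used when the field is already a speed index. */
int demo_noop(int x)
{
    return x;
}

/* Stand-in for sata_spd_string(): maps a 1-based speed index to a name. */
const char *demo_spd_string(unsigned int spd)
{
    static const char *const names[] = { "1.5 Gbps", "3.0 Gbps", "6.0 Gbps" };

    if (spd == 0 || spd - 1 >= sizeof(names) / sizeof(names[0]))
        return "<unknown>";
    return names[spd - 1];
}
```

With a speed index of 3, applying demo_fls() first yields index 2 and the
"3.0 Gbps" string, while demo_noop() keeps index 3 and yields "6.0 Gbps",
which matches the before/after output quoted in the commit message.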



[PATCH 3.5 51/78] drivers/libata: Set max sector to 65535 for Slimtype DVD A DS8A9SH drive

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Shan Hai 

commit 0523f037f65dba10191b0fa9c51266f90ba64630 upstream.

The "Slimtype DVD A  DS8A9SH" drive locks up with the following backtrace
when the max sector count is smaller than 65535. Fix it by adding a quirk
that sets the max sector count to 65535.

INFO: task flush-11:0:663 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-11:0D 5ceb 0   663  2 0x
 88026d3b1710 0046 0001 
 88026f2530c0 88026d365860 88026d3b16e0 812ffd52
 88026d4fd3d0 00010001 88026d3b16f0 88026d3b1fd8
Call Trace:
 [] ? cfq_may_queue+0x52/0xf0
 [] schedule+0x18/0x30
 [] io_schedule+0x42/0x60
 [] get_request_wait+0xeb/0x1f0
 [] ? autoremove_wake_function+0x0/0x40
 [] ? elv_merge+0x42/0x210
 [] __make_request+0x8e/0x4e0
 [] generic_make_request+0x21e/0x5e0
 [] submit_bio+0x5d/0xd0
 [] submit_bh+0xf2/0x130
 [] __block_write_full_page+0x1dc/0x3a0
 [] ? end_buffer_async_write+0x0/0x120
 [] ? blkdev_get_block+0x0/0x70
 [] ? blkdev_get_block+0x0/0x70
 [] ? end_buffer_async_write+0x0/0x120
 [] block_write_full_page_endio+0xde/0x100
 [] block_write_full_page+0x10/0x20
 [] blkdev_writepage+0x13/0x20
 [] __writepage+0x15/0x40
 [] write_cache_pages+0x1cf/0x3e0
 [] ? __writepage+0x0/0x40
 [] generic_writepages+0x22/0x30
 [] do_writepages+0x1f/0x40
 [] writeback_single_inode+0xe7/0x3b0
 [] writeback_sb_inodes+0x184/0x280
 [] writeback_inodes_wb+0x6b/0x1a0
 [] wb_writeback+0x23b/0x2a0
 [] wb_do_writeback+0x17d/0x190
 [] bdi_writeback_task+0x4b/0xe0
 [] ? bdi_start_fn+0x0/0x100
 [] bdi_start_fn+0x81/0x100
 [] ? bdi_start_fn+0x0/0x100
 [] kthread+0x8e/0xa0
 [] ? finish_task_switch+0x54/0xc0
 [] kernel_thread_helper+0x4/0x10
 [] ? kthread+0x0/0xa0
 [] ? kernel_thread_helper+0x0/0x10

 The above trace was triggered by
   "dd if=/dev/zero of=/dev/sr0 bs=2048 count=32768"

Signed-off-by: Shan Hai 
Signed-off-by: Tejun Heo 
Signed-off-by: Luis Henriques 
---
 drivers/ata/libata-core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 9e47300..705658d 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4075,6 +4075,7 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
{ "TORiSAN DVD-ROM DRD-N216", NULL, ATA_HORKAGE_MAX_SEC_128 },
{ "QUANTUM DATDAT72-000", NULL, ATA_HORKAGE_ATAPI_MOD16_DMA },
{ "Slimtype DVD A  DS8A8SH", NULL,  ATA_HORKAGE_MAX_SEC_LBA48 },
+   { "Slimtype DVD A  DS8A9SH", NULL,  ATA_HORKAGE_MAX_SEC_LBA48 },
 
/* Devices we expect to fail diagnostics */
 
-- 
1.8.3.2



[PATCH 3.5 55/78] powerpc/vio: use strcpy in modalias_show

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Prarit Bhargava 

commit 411cabf79e684171669ad29a0628c400b4431e95 upstream.

Commit e82b89a6f19bae73fb064d1b3dd91fcefbb478f4 used strcat instead of
strcpy which can result in an overflow of newlines on the buffer.

Signed-off-by: Prarit Bhargava
Cc: b...@kernel.crashing.org
Cc: b...@decadent.org.uk
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Luis Henriques 
---
 arch/powerpc/kernel/vio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c
index b161bae..4869c4e 100644
--- a/arch/powerpc/kernel/vio.c
+++ b/arch/powerpc/kernel/vio.c
@@ -1521,12 +1521,12 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
 
dn = dev->of_node;
if (!dn) {
-   strcat(buf, "\n");
+   strcpy(buf, "\n");
return strlen(buf);
}
cp = of_get_property(dn, "compatible", NULL);
if (!cp) {
-   strcat(buf, "\n");
+   strcpy(buf, "\n");
return strlen(buf);
}
 
-- 
1.8.3.2
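
The difference between the two calls in the modalias_show patch above can be
sketched in plain C. This is an illustration only: it assumes the output
buffer may already hold earlier contents when the show routine runs, and the
two function names are hypothetical.

```c
#include <string.h>

/* Appends "\n" after whatever the buffer already contains. */
size_t demo_show_with_strcat(char *buf)
{
    strcat(buf, "\n");
    return strlen(buf);
}

/* Overwrites the buffer from the start, so the result is exactly "\n". */
size_t demo_show_with_strcpy(char *buf)
{
    strcpy(buf, "\n");
    return strlen(buf);
}
```

Starting from a buffer that holds "stale", the strcat variant reports a
length of 6 and keeps the old text, while the strcpy variant reports 1.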



[PATCH 3.5 59/78] powerpc/powernv: Add PE to its own PELTV

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Gavin Shan 

commit 631ad691b5818291d89af9be607d2fe40be0886e upstream.

We need to add the PE to its own PELTV. Otherwise, errors originating
from the PE might propagate to other PEs. As a result, we can't
clear the error successfully even though we check and clear
errors during access to PCI config space.

Reported-by: kalsh...@in.ibm.com
Signed-off-by: Gavin Shan 
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Luis Henriques 
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index fbdd74d..5da8e8d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -613,13 +613,23 @@ static int __devinit pnv_ioda_configure_pe(struct pnv_phb *phb,
rid_end = pe->rid + 1;
}
 
-   /* Associate PE in PELT */
+   /*
+* Associate PE in PELT. We need add the PE into the
+* corresponding PELT-V as well. Otherwise, the error
+* originated from the PE might contribute to other
+* PEs.
+*/
rc = opal_pci_set_pe(phb->opal_id, pe->pe_number, pe->rid,
 bcomp, dcomp, fcomp, OPAL_MAP_PE);
if (rc) {
pe_err(pe, "OPAL error %ld trying to setup PELT table\n", rc);
return -ENXIO;
}
+
+   rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number,
+   pe->pe_number, OPAL_ADD_PE_TO_DOMAIN);
+   if (rc)
+   pe_warn(pe, "OPAL error %d adding self to PELTV\n", rc);
opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number,
  OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);
 
-- 
1.8.3.2



[PATCH 3.5 47/78] ARM: OMAP2+: irq, AM33XX add missing register check

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Markus Pargmann 

commit 0bebda684857f76548ea48c8886785198701d8d3 upstream.

am33xx has an INTC_PENDING_IRQ3 register that is not checked for pending
interrupts. This patch adds AM33XX to the ifdef of SoCs that have to
check this register.

Signed-off-by: Markus Pargmann 
Signed-off-by: Tony Lindgren 
Signed-off-by: Luis Henriques 
---
 arch/arm/mach-omap2/irq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-omap2/irq.c b/arch/arm/mach-omap2/irq.c
index 6038a8c..4137499 100644
--- a/arch/arm/mach-omap2/irq.c
+++ b/arch/arm/mach-omap2/irq.c
@@ -232,7 +232,7 @@ static inline void omap_intc_handle_irq(void __iomem *base_addr, struct pt_regs
goto out;
 
irqnr = readl_relaxed(base_addr + 0xd8);
-#ifdef CONFIG_SOC_TI81XX
+#if IS_ENABLED(CONFIG_SOC_TI81XX) || IS_ENABLED(CONFIG_SOC_AM33XX)
if (irqnr)
goto out;
irqnr = readl_relaxed(base_addr + 0xf8);
-- 
1.8.3.2



Add memory barrier when waiting on futex

2013-11-25 Thread Ma, Xindong
We encountered the following panic several times:
[   74.671982] BUG: unable to handle kernel NULL pointer dereference at 0008
[   74.672101] IP: [] wake_futex+0x47/0x80
[   74.672185] *pdpt = 10108001 *pde =  
[   74.672278] Oops: 0002 [#1] PREEMPT SMP 
[   74.672403] Modules linked in: atomisp_css2400b0_v2 atomisp_css2400_v2 dfrgx 
bcm_bt_lpm videobuf_vmalloc videobuf_core hdmi_audio tngdisp bcm4335 
kct_daemon(O) cfg80211
[   74.672815] CPU: 0 PID: 1477 Comm: zygote Tainted: GW  O 
3.10.1-259934-g0bfb86e #1
[   74.672855] Hardware name: Intel Corporation Merrifield/SALT BAY, BIOS 404 
2013.10.09:15.29.48
[   74.672894] task: d4c97220 ti: cfaa8000 task.ti: cfaa8000
[   74.672933] EIP: 0060:[] EFLAGS: 00210246 CPU: 0
[   74.672975] EIP is at wake_futex+0x47/0x80
[   74.673012] EAX:  EBX:  ECX:  EDX: 
[   74.673049] ESI: def4de5c EDI:  EBP: cfaa9eb4 ESP: cfaa9ea0
[   74.673086]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[   74.673123] CR0: 8005003b CR2: 0008 CR3: 10109000 CR4: 001007f0
[   74.673160] DR0:  DR1:  DR2:  DR3: 
[   74.673196] DR6: 0ff0 DR7: 0400
[   74.673229] Stack:
[   74.673260]   0001  def4de5c c225eb50 cfaa9ee4 c129bc29 

[   74.673536]   7fff c225eb30 b4f38000 ec1a4b40 0f90 7fff 
0001
[   74.673814]  b4f38f90 cfaa9f58 c129da0b   cfaa9f10 c195d835 
0001
[   74.674092] Call Trace:
[   74.674144]  [] futex_wake+0xc9/0x110
[   74.674195]  [] do_futex+0xeb/0x950
[   74.674246]  [] ? sub_preempt_count+0x55/0xe0
[   74.674293]  [] ? wake_up_new_task+0xee/0x190
[   74.674341]  [] ? _raw_spin_unlock_irqrestore+0x3b/0x70
[   74.674388]  [] ? wake_up_new_task+0xee/0x190
[   74.674436]  [] ? do_fork+0xec/0x350
[   74.674484]  [] SyS_futex+0x9b/0x140
[   74.674533]  [] ? SyS_mprotect+0x188/0x1e0
[   74.674582]  [] syscall_call+0x7/0xb

On SMP systems, the store of current to q->task in queue_me() may
not be immediately visible to another CPU, and this sometimes
causes a panic in wake_futex(). Add a memory barrier to avoid this.

Signed-off-by: Leon Ma 
Signed-off-by: xiaobing tu 
---
 kernel/futex.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 80ba086..792cd41 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1529,6 +1529,7 @@ static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
plist_node_init(&q->list, prio);
plist_add(&q->list, &hb->chain);
q->task = current;
+   smp_mb();
spin_unlock(&hb->lock);
 }
 
-- 
1.7.4.1




[PATCH 3.5 57/78] ASoC: ak4642: prevent un-necessary changes to SG_SL1

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Phil Edworthy 

commit 7b5bfb82882b9b1c8423ce0ed6852ca3762d967a upstream.

If you record sound during playback,
the playback becomes silent.
Change the codec driver so that it does not clear the
SG_SL1::DACL bit, which is controlled by a widget.

Signed-off-by: Phil Edworthy 
Signed-off-by: Kuninori Morimoto 
Signed-off-by: Mark Brown 
Signed-off-by: Luis Henriques 
---
 sound/soc/codecs/ak4642.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/codecs/ak4642.c b/sound/soc/codecs/ak4642.c
index b3e24f2..7e4245f 100644
--- a/sound/soc/codecs/ak4642.c
+++ b/sound/soc/codecs/ak4642.c
@@ -262,7 +262,7 @@ static int ak4642_dai_startup(struct snd_pcm_substream *substream,
 * This operation came from example code of
 * "ASAHI KASEI AK4642" (japanese) manual p94.
 */
-   snd_soc_write(codec, SG_SL1, PMMP | MGAIN0);
+   snd_soc_update_bits(codec, SG_SL1, PMMP | MGAIN0, PMMP | MGAIN0);
snd_soc_write(codec, TIMER, ZTM(0x3) | WTM(0x3));
snd_soc_write(codec, ALC_CTL1, ALC | LMTH0);
snd_soc_update_bits(codec, PW_MGMT1, PMADL, PMADL);
-- 
1.8.3.2
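
The effect of switching from a plain write to a masked update in the ak4642
patch above can be sketched with a toy register model. This is not the ASoC
API: demo_reg_write() and demo_reg_update_bits() are hypothetical helpers,
and the bit positions in the enum are placeholders chosen for illustration,
not the real SG_SL1 layout.

```c
#include <stdint.h>

/* Placeholder bit positions, not taken from the AK4642 datasheet. */
enum {
    DEMO_MGAIN0 = 0x01,
    DEMO_PMMP   = 0x04,
    DEMO_DACL   = 0x10,
};

/* Plain write: the whole register is replaced, other bits are lost. */
uint8_t demo_reg_write(uint8_t old, uint8_t value)
{
    (void)old;
    return value;
}

/* Masked update: only the bits selected by mask change. */
uint8_t demo_reg_update_bits(uint8_t old, uint8_t mask, uint8_t value)
{
    return (uint8_t)((old & ~mask) | (value & mask));
}
```

If DEMO_DACL is already set when recording starts, the plain write clears
it, while the masked update sets DEMO_PMMP and DEMO_MGAIN0 and leaves
DEMO_DACL alone.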



[PATCH 3.5 53/78] usb: wusbcore: set the RPIPE wMaxPacketSize value correctly

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Thomas Pugliese 

commit 7b6bc07ab554e929c85d51b3d5b26cf7f12c6a3b upstream.

For isochronous endpoints, set the RPIPE wMaxPacketSize value using
wOverTheAirPacketSize from the endpoint companion descriptor instead of
wMaxPacketSize from the normal endpoint descriptor.

Signed-off-by: Thomas Pugliese 
Signed-off-by: Greg Kroah-Hartman 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 drivers/usb/wusbcore/wa-rpipe.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/wusbcore/wa-rpipe.c b/drivers/usb/wusbcore/wa-rpipe.c
index f0d546c..ca1031b 100644
--- a/drivers/usb/wusbcore/wa-rpipe.c
+++ b/drivers/usb/wusbcore/wa-rpipe.c
@@ -332,7 +332,10 @@ static int rpipe_aim(struct wa_rpipe *rpipe, struct wahc *wa,
/* FIXME: compute so seg_size > ep->maxpktsize */
rpipe->descr.wBlocks = cpu_to_le16(16); /* given */
/* ep0 maxpktsize is 0x200 (WUSB1.0[4.8.1]) */
-   rpipe->descr.wMaxPacketSize = cpu_to_le16(ep->desc.wMaxPacketSize);
+   if (usb_endpoint_xfer_isoc(&ep->desc))
+   rpipe->descr.wMaxPacketSize = epcd->wOverTheAirPacketSize;
+   else
+   rpipe->descr.wMaxPacketSize = ep->desc.wMaxPacketSize;
rpipe->descr.bHSHubAddress = 0; /* reserved: zero */
rpipe->descr.bHSHubPort = wusb_port_no_to_idx(urb->dev->portnum);
/* FIXME: use maximum speed as supported or recommended by device */
-- 
1.8.3.2



[PATCH 3.5 54/78] usb: wusbcore: change WA_SEGS_MAX to a legal value

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Thomas Pugliese 

commit f74b75e7f920c700636a669c7d16d12e9202 upstream.

Change WA_SEGS_MAX to a number that is legal according to the WUSB
spec.

Signed-off-by: Thomas Pugliese 
Signed-off-by: Greg Kroah-Hartman 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 drivers/usb/wusbcore/wa-xfer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/wusbcore/wa-xfer.c b/drivers/usb/wusbcore/wa-xfer.c
index 1ebc17e..8cf9003 100644
--- a/drivers/usb/wusbcore/wa-xfer.c
+++ b/drivers/usb/wusbcore/wa-xfer.c
@@ -90,7 +90,8 @@
 #include "wusbhc.h"
 
 enum {
-   WA_SEGS_MAX = 255,
+   /* [WUSB] section 8.3.3 allocates 7 bits for the segment index. */
+   WA_SEGS_MAX = 128,
 };
 
 enum wa_seg_status {
@@ -444,7 +445,7 @@ static ssize_t __wa_xfer_setup_sizes(struct wa_xfer *xfer,
xfer->seg_size = (xfer->seg_size / maxpktsize) * maxpktsize;
xfer->segs = (urb->transfer_buffer_length + xfer->seg_size - 1)
/ xfer->seg_size;
-   if (xfer->segs >= WA_SEGS_MAX) {
+   if (xfer->segs > WA_SEGS_MAX) {
dev_err(dev, "BUG? ops, number of segments %d bigger than %d\n",
(int)(urb->transfer_buffer_length / xfer->seg_size),
WA_SEGS_MAX);
-- 
1.8.3.2
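
The arithmetic behind the new WA_SEGS_MAX limit above can be sketched as
follows. The sketch assumes only what the patch comment states, namely that
the segment index has 7 bits; the names are hypothetical and the segment
size in the usage note is an arbitrary example.

```c
#include <stddef.h>

/* A 7-bit index can address 2^7 = 128 distinct segments (0..127). */
enum { DEMO_SEGS_MAX = 1 << 7 };

/* Number of segments needed for len bytes, rounding up. */
size_t demo_seg_count(size_t len, size_t seg_size)
{
    return (len + seg_size - 1) / seg_size;
}

/* A transfer is acceptable when it needs at most DEMO_SEGS_MAX segments. */
int demo_segs_ok(size_t segs)
{
    return segs <= (size_t)DEMO_SEGS_MAX;
}
```

With 512-byte segments, a 65536-byte transfer needs exactly 128 segments and
is accepted, while one more byte pushes the count to 129 and is rejected.
This is why the comparison in the patch becomes "greater than" once the
constant denotes the number of segments that is still legal.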



[PATCH 3.5 62/78] qeth: avoid buffer overflow in snmp ioctl

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ursula Braun 

commit 6fb392b1a63ae36c31f62bc3fc8630b49d602b62 upstream.

Check user-defined length in snmp ioctl request and allow request
only if it fits into a qeth command buffer.

Signed-off-by: Ursula Braun 
Signed-off-by: Frank Blaschka 
Reviewed-by: Heiko Carstens 
Reported-by: Nico Golde 
Reported-by: Fabian Yamaguchi 
Signed-off-by: David S. Miller 
Signed-off-by: Luis Henriques 
---
 drivers/s390/net/qeth_core_main.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index e118e1e..c3121f7 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -4355,7 +4355,7 @@ int qeth_snmp_command(struct qeth_card *card, char __user *udata)
struct qeth_cmd_buffer *iob;
struct qeth_ipa_cmd *cmd;
struct qeth_snmp_ureq *ureq;
-   int req_len;
+   unsigned int req_len;
struct qeth_arp_query_info qinfo = {0, };
int rc = 0;
 
@@ -4371,6 +4371,10 @@ int qeth_snmp_command(struct qeth_card *card, char __user *udata)
/* skip 4 bytes (data_len struct member) to get req_len */
if (copy_from_user(&req_len, udata + sizeof(int), sizeof(int)))
return -EFAULT;
+   if (req_len > (QETH_BUFSIZE - IPA_PDU_HEADER_SIZE -
+  sizeof(struct qeth_ipacmd_hdr) -
+  sizeof(struct qeth_ipacmd_setadpparms_hdr)))
+   return -EINVAL;
ureq = memdup_user(udata, req_len + sizeof(struct qeth_snmp_ureq_hdr));
if (IS_ERR(ureq)) {
QETH_CARD_TEXT(card, 2, "snmpnome");
-- 
1.8.3.2
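
One way to see why the qeth patch above changes req_len to an unsigned type
together with adding the bound check is the sketch below. It is an
illustration under assumptions: DEMO_MAX_LEN is a made-up limit standing in
for the buffer-size expression in the patch, and both helpers are
hypothetical.

```c
/* Made-up upper bound, standing in for the real buffer-size expression. */
enum { DEMO_MAX_LEN = 4000 };

/* Signed length: a negative value is not "greater than" the bound. */
int demo_len_ok_signed(int req_len)
{
    return !(req_len > DEMO_MAX_LEN);
}

/* Unsigned length: the same bit pattern is a huge value and is rejected. */
int demo_len_ok_unsigned(unsigned int req_len)
{
    return !(req_len > (unsigned int)DEMO_MAX_LEN);
}
```

A user-supplied length of -1 passes the signed check but fails the unsigned
one, so the upper-bound test alone is enough once the type is unsigned.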



[PATCH 3.5 63/78] cris: media platform drivers: fix build

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Mauro Carvalho Chehab 

commit 72a0c5571351f5184195754d23db3e14495b2080 upstream.

On cris arch, the functions below aren't defined:

  drivers/media/platform/sh_veu.c: In function 'sh_veu_reg_read':

  drivers/media/platform/sh_veu.c:228:2: error: implicit declaration of 
function 'ioread32' [-Werror=implicit-function-declaration]
  drivers/media/platform/sh_veu.c: In function 'sh_veu_reg_write':

  drivers/media/platform/sh_veu.c:234:2: error: implicit declaration of 
function 'iowrite32' [-Werror=implicit-function-declaration]
  drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_read':
  drivers/media/platform/vsp1/vsp1.h:66:2: error: implicit declaration of 
function 'ioread32' [-Werror=implicit-function-declaration]
  drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_write':
  drivers/media/platform/vsp1/vsp1.h:71:2: error: implicit declaration of 
function 'iowrite32' [-Werror=implicit-function-declaration]
  drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_read':
  drivers/media/platform/vsp1/vsp1.h:66:2: error: implicit declaration of 
function 'ioread32' [-Werror=implicit-function-declaration]
  drivers/media/platform/vsp1/vsp1.h: In function 'vsp1_write':
  drivers/media/platform/vsp1/vsp1.h:71:2: error: implicit declaration of 
function 'iowrite32' [-Werror=implicit-function-declaration]
  drivers/media/platform/soc_camera/rcar_vin.c: In function 'rcar_vin_setup':
  drivers/media/platform/soc_camera/rcar_vin.c:284:3: error: implicit 
declaration of function 'iowrite32' [-Werror=implicit-function-declaration]

  drivers/media/platform/soc_camera/rcar_vin.c: In function 
'rcar_vin_request_capture_stop':
  drivers/media/platform/soc_camera/rcar_vin.c:353:2: error: implicit 
declaration of function 'ioread32' [-Werror=implicit-function-declaration]

Yet, they're available, as CONFIG_GENERIC_IOMAP is defined.  What happens
is that asm/io.h was not including asm-generic/iomap.h.

Suggested-by: Ben Hutchings 
Signed-off-by: Mauro Carvalho Chehab 
Cc: Mikael Starvik 
Cc: Jesper Nilsson 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Luis Henriques 
---
 arch/cris/include/asm/io.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/cris/include/asm/io.h b/arch/cris/include/asm/io.h
index ac12ae2..db9a16c 100644
--- a/arch/cris/include/asm/io.h
+++ b/arch/cris/include/asm/io.h
@@ -3,6 +3,7 @@
 
 #include/* for __va, __pa */
 #include 
+#include <asm-generic/iomap.h>
 #include 
 
 struct cris_io_operations
-- 
1.8.3.2



[PATCH 3.5 70/78] tracing: Fix potential out-of-bounds in trace_get_user()

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Steven Rostedt 

commit 057db8488b53d5e4faa0cedb2f39d4ae75dfbdbb upstream.

Andrey reported the following report:

ERROR: AddressSanitizer: heap-buffer-overflow on address 8800359c99f3
8800359c99f3 is located 0 bytes to the right of 243-byte region 
[8800359c9900, 8800359c99f3)
Accessed by thread T13003:
  #0 810dd2da (asan_report_error+0x32a/0x440)
  #1 810dc6b0 (asan_check_region+0x30/0x40)
  #2 810dd4d3 (__tsan_write1+0x13/0x20)
  #3 811cd19e (ftrace_regex_release+0x1be/0x260)
  #4 812a1065 (__fput+0x155/0x360)
  #5 812a12de (fput+0x1e/0x30)
  #6 8111708d (task_work_run+0x10d/0x140)
  #7 810ea043 (do_exit+0x433/0x11f0)
  #8 810eaee4 (do_group_exit+0x84/0x130)
  #9 810eafb1 (SyS_exit_group+0x21/0x30)
  #10 81928782 (system_call_fastpath+0x16/0x1b)

Allocated by thread T5167:
  #0 810dc778 (asan_slab_alloc+0x48/0xc0)
  #1 8128337c (__kmalloc+0xbc/0x500)
  #2 811d9d54 (trace_parser_get_init+0x34/0x90)
  #3 811cd7b3 (ftrace_regex_open+0x83/0x2e0)
  #4 811cda7d (ftrace_filter_open+0x2d/0x40)
  #5 8129b4ff (do_dentry_open+0x32f/0x430)
  #6 8129b668 (finish_open+0x68/0xa0)
  #7 812b66ac (do_last+0xb8c/0x1710)
  #8 812b7350 (path_openat+0x120/0xb50)
  #9 812b8884 (do_filp_open+0x54/0xb0)
  #10 8129d36c (do_sys_open+0x1ac/0x2c0)
  #11 8129d4b7 (SyS_open+0x37/0x50)
  #12 81928782 (system_call_fastpath+0x16/0x1b)

Shadow bytes around the buggy address:
  8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
  8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07
  Heap redzone:  fa
  Heap kmalloc redzone:  fb
  Freed heap region: fd
  Shadow gap:fe

The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;'

Although the out-of-bounds write was reported in ftrace_regex_release(),
the real bug is in trace_get_user(), which increments parser->idx
without checking it against the size. It is triggered when userspace
sends in 128 characters (EVENT_BUF_SIZE + 1): the loop that reads the
last character stores it and then breaks out because there are no more
characters. Then the last character is read to determine what to do
next, and the index is incremented without checking the size.

Then the caller of trace_get_user() usually nulls out the last character
with a zero, but since the index is equal to the size, it writes a nul
character after the allocated space, which can corrupt memory.

Luckily, only root user has write access to this file.

Link: http://lkml.kernel.org/r/20131009222323.04fd1...@gandalf.local.home

Reported-by: Andrey Konovalov 
Signed-off-by: Steven Rostedt 
Signed-off-by: Luis Henriques 
---
 kernel/trace/trace.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 09739c6..d570df8 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -578,9 +578,12 @@ int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
if (isspace(ch)) {
parser->buffer[parser->idx] = 0;
parser->cont = false;
-   } else {
+   } else if (parser->idx < parser->size - 1) {
parser->cont = true;
parser->buffer[parser->idx++] = ch;
+   } else {
+   ret = -EINVAL;
+   goto out;
}
 
*ppos += read;
-- 
1.8.3.2
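
The bounds check added to trace_get_user() above can be sketched in isolation
as follows. struct demo_parser and demo_parser_put() are hypothetical and
only model the idx/size bookkeeping; the sketch assumes size is at least 1.

```c
#include <stddef.h>
#include <errno.h>

struct demo_parser {
    char *buffer;
    size_t idx;   /* next free slot */
    size_t size;  /* total bytes in buffer, assumed >= 1 */
};

/*
 * Store one character while always leaving the last slot free, so the
 * caller can later write a terminating NUL at buffer[idx] safely.
 */
int demo_parser_put(struct demo_parser *p, char ch)
{
    if (p->idx < p->size - 1) {
        p->buffer[p->idx++] = ch;
        return 0;
    }
    return -EINVAL;
}
```

With a 4-byte buffer, three characters are accepted and the fourth is
refused, so idx stops at 3 and buffer[idx] is still inside the allocation.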



[PATCH 3.5 71/78] ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ivan Djelic 

commit 455bd4c430b0c0a361f38e8658a0d6cb469942b5 upstream.

Recent GCC versions (e.g. GCC-4.7.2) perform optimizations based on
assumptions about the implementation of memset and similar functions.
The current ARM optimized memset code does not return the value of
its first argument, as is usually expected from standard implementations.

For instance in the following function:

void debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waiter)
{
memset(waiter, MUTEX_DEBUG_INIT, sizeof(*waiter));
waiter->magic = waiter;
INIT_LIST_HEAD(&waiter->list);
}

compiled as:

800554d0 <debug_mutex_lock_common>:
800554d0:   e92d4008    push    {r3, lr}
800554d4:   e1a00001    mov     r0, r1
800554d8:   e3a02010    mov     r2, #16 ; 0x10
800554dc:   e3a01011    mov     r1, #17 ; 0x11
800554e0:   eb04426e    bl      80165ea0 <memset>
800554e4:   e1a03000    mov     r3, r0
800554e8:   e583000c    str     r0, [r3, #12]
800554ec:   e5830000    str     r0, [r3]
800554f0:   e5830004    str     r0, [r3, #4]
800554f4:   e8bd8008    pop     {r3, pc}

GCC assumes memset returns the value of pointer 'waiter' in register r0. Since
the ARM implementation does not, this causes register/memory corruption.

This patch fixes the return value of the assembly version of memset.
It adds a 'mov' instruction and merges an additional load+store into
existing load/store instructions.
For ease of review, here is a breakdown of the patch into 4 simple steps:

Step 1
==
Perform the following substitutions:
ip -> r8, then
r0 -> ip,
and insert 'mov ip, r0' as the first statement of the function.
At this point, we have a memset() implementation returning the proper result,
but corrupting r8 on some paths (the ones that were using ip).

Step 2
==
Make sure r8 is saved and restored when (! CALGN(1)+0) == 1:

save r8:
-   str lr, [sp, #-4]!
+   stmfd   sp!, {r8, lr}

and restore r8 on both exit paths:
-   ldmeqfd sp!, {pc}   @ Now <64 bytes to go.
+   ldmeqfd sp!, {r8, pc}   @ Now <64 bytes to go.
(...)
tst r2, #16
stmneia ip!, {r1, r3, r8, lr}
-   ldr lr, [sp], #4
+   ldmfd   sp!, {r8, lr}

Step 3
==
Make sure r8 is saved and restored when (! CALGN(1)+0) == 0:

save r8:
-   stmfd   sp!, {r4-r7, lr}
+   stmfd   sp!, {r4-r8, lr}

and restore r8 on both exit paths:
bgt 3b
-   ldmeqfd sp!, {r4-r7, pc}
+   ldmeqfd sp!, {r4-r8, pc}
(...)
tst r2, #16
stmneia ip!, {r4-r7}
-   ldmfd   sp!, {r4-r7, lr}
+   ldmfd   sp!, {r4-r8, lr}

Step 4
==
Rewrite register list "r4-r7, r8" as "r4-r8".

Signed-off-by: Ivan Djelic 
Reviewed-by: Nicolas Pitre 
Signed-off-by: Dirk Behme 
Signed-off-by: Russell King 
Cc: Eric Bénard 
Signed-off-by: Luis Henriques 
---
 arch/arm/lib/memset.S | 85 ++-
 1 file changed, 44 insertions(+), 41 deletions(-)

diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
index 650d592..d912e73 100644
--- a/arch/arm/lib/memset.S
+++ b/arch/arm/lib/memset.S
@@ -19,9 +19,9 @@
 1: subs    r2, r2, #4  @ 1 do we have enough
blt 5f  @ 1 bytes to align with?
cmp r3, #2  @ 1
-   strltb  r1, [r0], #1  @ 1
-   strleb  r1, [r0], #1  @ 1
-   strb    r1, [r0], #1  @ 1
+   strltb  r1, [ip], #1  @ 1
+   strleb  r1, [ip], #1  @ 1
+   strb    r1, [ip], #1  @ 1
add r2, r2, r3  @ 1 (r2 = r2 - (4 - r3))
 /*
  * The pointer is now aligned and the length is adjusted.  Try doing the
@@ -29,10 +29,14 @@
  */
 
 ENTRY(memset)
-   ands    r3, r0, #3  @ 1 unaligned?
+/*
+ * Preserve the contents of r0 for the return value.
+ */
+   mov ip, r0
+   ands    r3, ip, #3  @ 1 unaligned?
bne 1b  @ 1
 /*
- * we know that the pointer in r0 is aligned to a word boundary.
+ * we know that the pointer in ip is aligned to a word boundary.
  */
orr r1, r1, r1, lsl #8
orr r1, r1, r1, lsl #16
@@ -43,29 +47,28 @@ ENTRY(memset)
 #if ! CALGN(1)+0
 
 /*
- * We need an extra register for this loop - save the return address and
- * use the LR
+ * We need 2 extra registers for this loop - use r8 and the LR
  */
-   str lr, [sp, #-4]!
-   mov ip, r1
+   stmfd   sp!, {r8, lr}
+   mov r8, r1
mov lr, r1
 
 2: subs    r2, r2, #64
-   stmgeia r0!, {r1, r3, ip, lr}   @ 64 bytes at a time.
-   stmgeia r0!, {r1, r3, ip, lr}
-   stmgeia r0!, {r1, r3, ip, lr}
-   stmgeia r0!, {r1, r3, ip, lr}
+   stmgeia ip!, {r1, r3, r8, lr}   @ 64 bytes at a time.
+   stmgeia ip!, {r1, r3, r8, lr}
+   stmgeia ip!, {r1, r3, r8, lr}
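
The contract that the compiler relies on in the memset patch above can be
sketched in portable C. ISO C specifies that memset() returns its first
argument, so code (or an optimizer) may keep using the return value instead
of the original pointer. struct demo_waiter and demo_init_waiter() are
hypothetical and only loosely modelled on the function quoted in the commit
message.

```c
#include <string.h>

struct demo_waiter {
    void *magic;
    int pad[3];
};

/* Uses memset()'s return value in place of the original pointer. */
struct demo_waiter *demo_init_waiter(struct demo_waiter *w)
{
    struct demo_waiter *ret = memset(w, 0x11, sizeof(*w));

    ret->magic = ret;
    return ret;
}
```

On a conforming memset() the returned pointer equals the argument, so the
store to magic lands in the caller's object. An implementation that returns
anything else would send that store to the wrong address, which is the kind
of corruption the commit message describes.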

[PATCH 3.5 58/78] ahci: Add Device IDs for Intel Wildcat Point-LP

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: James Ralston 

commit 9f961a5f6efc87a79571d7166257b36af28ffcfe upstream.

This patch adds the AHCI-mode SATA Device IDs for the Intel Wildcat Point-LP 
PCH.

Signed-off-by: James Ralston 
Signed-off-by: Tejun Heo 
Signed-off-by: Luis Henriques 
---
 drivers/ata/ahci.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 9270f35..d0f8a93 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -301,6 +301,10 @@ static const struct pci_device_id ahci_pci_tbl[] = {
{ PCI_VDEVICE(INTEL, 0x8d66), board_ahci }, /* Wellsburg RAID */
{ PCI_VDEVICE(INTEL, 0x8d6e), board_ahci }, /* Wellsburg RAID */
{ PCI_VDEVICE(INTEL, 0x23a3), board_ahci }, /* Coleto Creek AHCI */
+   { PCI_VDEVICE(INTEL, 0x9c83), board_ahci }, /* Wildcat Point-LP AHCI */
+   { PCI_VDEVICE(INTEL, 0x9c85), board_ahci }, /* Wildcat Point-LP RAID */
+   { PCI_VDEVICE(INTEL, 0x9c87), board_ahci }, /* Wildcat Point-LP RAID */
+   { PCI_VDEVICE(INTEL, 0x9c8f), board_ahci }, /* Wildcat Point-LP RAID */
 
/* JMicron 360/1/3/5/6, match class to avoid IDE function */
{ PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
-- 
1.8.3.2



[PATCH 3.5 73/78] usb: fix cleanup after failure in hub_configure()

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Krzysztof Mazur 

commit d0308d4b6b02597f39fc31a9bddf7bb3faad5622 upstream.

If hub_configure() fails after setting hdev->maxchild, hub->ports
might be NULL or point to kzalloc'ed memory that was never filled in,
causing a NULL pointer dereference in hub_quiesce() during cleanup.

After such an error, hdev->maxchild is now set to 0 so that cleanup
skips the uninitialized ports.

Signed-off-by: Krzysztof Mazur 
Acked-by: Alan Stern 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Luis Henriques 
---
 drivers/usb/core/hub.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index b5503b0..b79aa83 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1562,6 +1562,7 @@ static int hub_configure(struct usb_hub *hub,
return 0;
 
 fail:
+   hdev->maxchild = 0;
dev_err (hub_dev, "config failed, %s (err %d)\n",
message, ret);
/* hub_disconnect() frees urb and descriptor */
-- 
1.8.3.2



[PATCH 3.5 67/78] backlight: atmel-pwm-bl: fix gpio polarity in remove

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Johan Hovold 

commit ad5066d4c2b1d696749f8d7816357c23b648c4d3 upstream.

Make sure to honour gpio polarity also at remove so that the backlight is
actually disabled on boards with active-low enable pin.

Signed-off-by: Johan Hovold 
Acked-by: Jingoo Han 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 drivers/video/backlight/atmel-pwm-bl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/video/backlight/atmel-pwm-bl.c 
b/drivers/video/backlight/atmel-pwm-bl.c
index 4d2bbd8..dab3a0c 100644
--- a/drivers/video/backlight/atmel-pwm-bl.c
+++ b/drivers/video/backlight/atmel-pwm-bl.c
@@ -211,7 +211,8 @@ static int __exit atmel_pwm_bl_remove(struct 
platform_device *pdev)
struct atmel_pwm_bl *pwmbl = platform_get_drvdata(pdev);
 
if (pwmbl->gpio_on != -1) {
-   gpio_set_value(pwmbl->gpio_on, 0);
+   gpio_set_value(pwmbl->gpio_on,
+   0 ^ pwmbl->pdata->on_active_low);
gpio_free(pwmbl->gpio_on);
}
pwm_channel_disable(&pwmbl->pwmc);
-- 
1.8.3.2
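
For illustration, the polarity handling can be sketched in userspace C. This is a minimal sketch and not the driver's code: gpio_level_for() and backlight_off_level() are hypothetical names, and the only thing carried over from the patch is the XOR with the on_active_low flag.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the driver): translate a logical
 * on/off request into the physical level to drive on the enable pin.
 * XOR with the polarity flag inverts the level for active-low pins.
 */
int gpio_level_for(bool logical_on, bool on_active_low)
{
	return (logical_on ? 1 : 0) ^ (on_active_low ? 1 : 0);
}

/* Level the remove path should drive to switch the backlight off. */
int backlight_off_level(bool on_active_low)
{
	int level = gpio_level_for(false, on_active_low);

	assert(level == 0 || level == 1);
	return level;
}
```

With an active-high pin the "off" level is 0, which is what the old code always wrote; with an active-low pin it is 1, which is the case the patch fixes.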



[PATCH 3.5 68/78] devpts: plug the memory leak in kill_sb

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ilija Hadzic 

commit 66da0e1f9034140ae2f571ef96e254a25083906c upstream.

When devpts is unmounted, there may be a no-longer-used IDR tree hanging
off the superblock we are about to kill.  This needs to be cleaned up
before destroying the SB.

The leak is usually not a big deal because unmounting devpts is typically
done when shutting down the whole machine.  However, shutting down an LXC
container instead of a physical machine exposes the problem (the garbage
is detectable with kmemleak).

Signed-off-by: Ilija Hadzic 
Cc: Sukadev Bhattiprolu 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Luis Henriques 
---
 fs/devpts/inode.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 979c1e3..1ed9d5e 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -483,6 +483,7 @@ static void devpts_kill_sb(struct super_block *sb)
 {
struct pts_fs_info *fsi = DEVPTS_SB(sb);
 
+   ida_destroy(&fsi->allocated_ptys);
kfree(fsi);
kill_litter_super(sb);
 }
-- 
1.8.3.2



[PATCH 3.5 64/78] mm: ensure get_unmapped_area() returns higher address than mmap_min_addr

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Akira Takeuchi 

commit 2afc745f3e3079ab16c826be4860da2529054dd2 upstream.

This patch fixes the problem that get_unmapped_area() can return an illegal
address, causing mmap(2) and similar calls to fail.

When /proc/sys/vm/mmap_min_addr is set to an address higher than PAGE_SIZE,
get_unmapped_area() can return an address lower than mmap_min_addr, even
if you do not pass any virtual address hint (i.e. the second argument).

This is because the current get_unmapped_area() code does not take
mmap_min_addr into account.

This leads to two actual problems as follows:

1. mmap(2) can fail with EPERM in a process without CAP_SYS_RAWIO,
   even though no illegal parameter is passed.

2. The bottom-up search path after the top-down search might not work in
   arch_get_unmapped_area_topdown().

Note: The first and third chunks of my patch, which change the "len" check,
make that check more precise using mmap_min_addr; they do not solve the
above problem.

[How to reproduce]

--- test.c -
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <errno.h>

int main(int argc, char *argv[])
{
void *ret = NULL, *last_map;
size_t pagesize = sysconf(_SC_PAGESIZE);

do {
last_map = ret;
ret = mmap(0, pagesize, PROT_NONE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
//  printf("ret=%p\n", ret);
} while (ret != MAP_FAILED);

if (errno != ENOMEM) {
printf("ERR: unexpected errno: %d (last map=%p)\n",
errno, last_map);
}

return 0;
}
---

$ gcc -m32 -o test test.c
$ sudo sysctl -w vm.mmap_min_addr=65536
vm.mmap_min_addr = 65536
$ ./test  (run as non-privileged user)
ERR: unexpected errno: 1 (last map=0x10000)

Signed-off-by: Akira Takeuchi 
Signed-off-by: Kiyoshi Owada 
Reviewed-by: Naoya Horiguchi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
[ luis: backported to 3.5:
  - dropped changes to struct vm_unmapped_area_info in
  arch_get_unmapped_area_topdown() as this structure does not exist in 3.5
  kernel ]
Signed-off-by: Luis Henriques 
---
 mm/mmap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 7e24763..758ff55 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1443,7 +1443,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long 
addr,
struct vm_area_struct *vma;
unsigned long start_addr;
 
-   if (len > TASK_SIZE)
+   if (len > TASK_SIZE - mmap_min_addr)
return -ENOMEM;
 
if (flags & MAP_FIXED)
@@ -1452,7 +1452,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long 
addr,
if (addr) {
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
-   if (TASK_SIZE - len >= addr &&
+   if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
(!vma || addr + len <= vma->vm_start))
return addr;
}
@@ -1517,7 +1517,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const 
unsigned long addr0,
unsigned long addr = addr0, start_addr;
 
/* requested length too big for entire address space */
-   if (len > TASK_SIZE)
+   if (len > TASK_SIZE - mmap_min_addr)
return -ENOMEM;
 
if (flags & MAP_FIXED)
@@ -1527,7 +1527,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const 
unsigned long addr0,
if (addr) {
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
-   if (TASK_SIZE - len >= addr &&
+   if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
(!vma || addr + len <= vma->vm_start))
return addr;
}
-- 
1.8.3.2
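
The four hunks in this backport can be sketched as plain predicates in userspace C. This is only a sketch under stated assumptions: the SKETCH_* constants are illustrative values (a 3 GiB address space and the 64 KiB floor from the reproducer), the function names are hypothetical, and the VMA overlap test from the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative constants only: a 3 GiB user address space and a 64 KiB
 * floor, matching vm.mmap_min_addr=65536 from the reproducer.
 */
#define SKETCH_TASK_SIZE     0xC0000000UL
#define SKETCH_MMAP_MIN_ADDR 0x10000UL

/* Pre-patch length check: compares against the whole address space. */
bool len_ok_old(unsigned long len)
{
	return len <= SKETCH_TASK_SIZE;
}

/* Post-patch length check: the range below mmap_min_addr is unusable. */
bool len_ok_new(unsigned long len)
{
	return len <= SKETCH_TASK_SIZE - SKETCH_MMAP_MIN_ADDR;
}

/* Pre-patch hint check (VMA overlap test omitted). */
bool hint_ok_old(unsigned long addr, unsigned long len)
{
	return len_ok_old(len) && SKETCH_TASK_SIZE - len >= addr;
}

/* Post-patch hint check: the hint must also be at or above the floor. */
bool hint_ok_new(unsigned long addr, unsigned long len)
{
	return len_ok_new(len) && SKETCH_TASK_SIZE - len >= addr &&
	       addr >= SKETCH_MMAP_MIN_ADDR;
}
```

Under these assumptions a one-page hint at 0x1000 passes the old check but is rejected by the new one, so the caller falls through to the regular search instead of returning an address that mmap(2) would later refuse.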



[PATCH 3.5 66/78] backlight: atmel-pwm-bl: fix reported brightness

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Johan Hovold 

commit 185d91442550110db67a7dc794a32efcea455a36 upstream.

The driver supports 16-bit brightness values, but the value returned
from get_brightness was truncated to eight bits.

Signed-off-by: Johan Hovold 
Cc: Jingoo Han 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Luis Henriques 
---
 drivers/video/backlight/atmel-pwm-bl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/backlight/atmel-pwm-bl.c 
b/drivers/video/backlight/atmel-pwm-bl.c
index 0443a4f..4d2bbd8 100644
--- a/drivers/video/backlight/atmel-pwm-bl.c
+++ b/drivers/video/backlight/atmel-pwm-bl.c
@@ -70,7 +70,7 @@ static int atmel_pwm_bl_set_intensity(struct backlight_device 
*bd)
 static int atmel_pwm_bl_get_intensity(struct backlight_device *bd)
 {
struct atmel_pwm_bl *pwmbl = bl_get_data(bd);
-   u8 intensity;
+   u32 intensity;
 
if (pwmbl->pdata->pwm_active_low) {
intensity = pwm_channel_readl(&pwmbl->pwmc, PWM_CDTY) -
@@ -80,7 +80,7 @@ static int atmel_pwm_bl_get_intensity(struct backlight_device 
*bd)
pwm_channel_readl(&pwmbl->pwmc, PWM_CDTY);
}
 
-   return intensity;
+   return intensity & 0xffff;
 }
 
 static int atmel_pwm_bl_init_pwm(struct atmel_pwm_bl *pwmbl)
-- 
1.8.3.2
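
The truncation is easy to reproduce outside the kernel. The following is a minimal userspace sketch, not the driver code: intensity_old() and intensity_new() are hypothetical names, and the cdty argument stands in for the value computed from the PWM_CDTY register.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: storing the value in a u8 keeps only bits 0-7. */
uint32_t intensity_old(uint32_t cdty)
{
	uint8_t intensity = (uint8_t)cdty;

	return intensity;
}

/* New behaviour: keep the full 16-bit value the driver supports. */
uint32_t intensity_new(uint32_t cdty)
{
	uint32_t intensity = cdty;

	return intensity & 0xffff;
}
```

For a duty value of 0x1234 the old variant reports 0x34, while the new one reports 0x1234.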



[PATCH 3.5 75/78] 8139cp: re-enable interrupts after tx timeout

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: David Woodhouse 

commit 01ffc0a7f1c1801a2354719dedbc32aff45b987d upstream.

Recovery doesn't work too well if we leave interrupts disabled...

Signed-off-by: David Woodhouse 
Acked-by: Francois Romieu 
Signed-off-by: David S. Miller 
Cc: Nathan Williams 
Signed-off-by: Luis Henriques 
---
 drivers/net/ethernet/realtek/8139cp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/realtek/8139cp.c 
b/drivers/net/ethernet/realtek/8139cp.c
index efd3e34..9ac8801 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -1252,6 +1252,7 @@ static void cp_tx_timeout(struct net_device *dev)
cp_clean_rings(cp);
rc = cp_init_rings(cp);
cp_start_hw(cp);
+   cp_enable_irq(cp);
 
netif_wake_queue(dev);
 
-- 
1.8.3.2



[PATCH 3.5 77/78] Fix a few incorrectly checked [io_]remap_pfn_range() calls

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Linus Torvalds 

commit 7314e613d5ff9f0934f7a0f74ed7973b903315d1 upstream.

Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that
really should use the vm_iomap_memory() helper.  This trivially converts
two of them to the helper, and comments about why the third one really
needs to continue to use remap_pfn_range(), and adds the missing size
check.

Reported-by: Nico Golde 
Signed-off-by: Linus Torvalds 
Signed-off-by: Luis Henriques 
---
 drivers/uio/uio.c| 16 +++-
 drivers/video/au1100fb.c | 26 +-
 drivers/video/au1200fb.c | 23 +--
 3 files changed, 17 insertions(+), 48 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index a783d53..7150752 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -650,16 +650,30 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
 {
struct uio_device *idev = vma->vm_private_data;
int mi = uio_find_mem_index(vma);
+   struct uio_mem *mem;
if (mi < 0)
return -EINVAL;
+   mem = idev->info->mem + mi;
+
+   if (vma->vm_end - vma->vm_start > mem->size)
+   return -EINVAL;
 
vma->vm_flags |= VM_IO | VM_RESERVED;
 
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
+   /*
+* We cannot use the vm_iomap_memory() helper here,
+* because vma->vm_pgoff is the map index we looked
+* up above in uio_find_mem_index(), rather than an
+* actual page offset into the mmap.
+*
+* So we just do the physical mmap without a page
+* offset.
+*/
return remap_pfn_range(vma,
   vma->vm_start,
-  idev->info->mem[mi].addr >> PAGE_SHIFT,
+  mem->addr >> PAGE_SHIFT,
   vma->vm_end - vma->vm_start,
   vma->vm_page_prot);
 }
diff --git a/drivers/video/au1100fb.c b/drivers/video/au1100fb.c
index fe3b6ec..2169bc0 100644
--- a/drivers/video/au1100fb.c
+++ b/drivers/video/au1100fb.c
@@ -375,39 +375,15 @@ void au1100fb_fb_rotate(struct fb_info *fbi, int angle)
 int au1100fb_fb_mmap(struct fb_info *fbi, struct vm_area_struct *vma)
 {
struct au1100fb_device *fbdev;
-   unsigned int len;
-   unsigned long start=0, off;
 
fbdev = to_au1100fb_device(fbi);
 
-   if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) {
-   return -EINVAL;
-   }
-
-   start = fbdev->fb_phys & PAGE_MASK;
-   len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len);
-
-   off = vma->vm_pgoff << PAGE_SHIFT;
-
-   if ((vma->vm_end - vma->vm_start + off) > len) {
-   return -EINVAL;
-   }
-
-   off += start;
-   vma->vm_pgoff = off >> PAGE_SHIFT;
-
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
pgprot_val(vma->vm_page_prot) |= (6 << 9); //CCA=6
 
vma->vm_flags |= VM_IO;
 
-   if (io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT,
-   vma->vm_end - vma->vm_start,
-   vma->vm_page_prot)) {
-   return -EAGAIN;
-   }
-
-   return 0;
+   return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len);
 }
 
 static struct fb_ops au1100fb_ops =
diff --git a/drivers/video/au1200fb.c b/drivers/video/au1200fb.c
index 7ca79f0..117be3d 100644
--- a/drivers/video/au1200fb.c
+++ b/drivers/video/au1200fb.c
@@ -1233,36 +1233,15 @@ static int au1200fb_fb_blank(int blank_mode, struct 
fb_info *fbi)
  * method mainly to allow the use of the TLB streaming flag (CCA=6)
  */
 static int au1200fb_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
-
 {
-   unsigned int len;
-   unsigned long start=0, off;
struct au1200fb_device *fbdev = info->par;
 
-   if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) {
-   return -EINVAL;
-   }
-
-   start = fbdev->fb_phys & PAGE_MASK;
-   len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len);
-
-   off = vma->vm_pgoff << PAGE_SHIFT;
-
-   if ((vma->vm_end - vma->vm_start + off) > len) {
-   return -EINVAL;
-   }
-
-   off += start;
-   vma->vm_pgoff = off >> PAGE_SHIFT;
-
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
pgprot_val(vma->vm_page_prot) |= _CACHE_MASK; /* CCA=7 */
 
vma->vm_flags |= VM_IO;
 
-   return io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT,
- vma->vm_end - vma->vm_start,
- vma->vm_page_prot);
+   return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len);
 
return 0;
 }
-- 
1.8.3.2
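
The size check added to uio_mmap_physical() can be sketched in userspace C as follows. This is a sketch under assumptions: struct sketch_vma and struct sketch_mem are hypothetical stand-ins that carry only the fields the check reads, and sketch_check_mmap_size() is not a kernel function.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the fields the check looks at. */
struct sketch_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

struct sketch_mem {
	unsigned long addr;
	unsigned long size;
};

/*
 * Reject a mapping request that is longer than the memory region,
 * mirroring the comparison the patch adds before remap_pfn_range().
 * Returns 0 when the request fits and -EINVAL otherwise.
 */
int sketch_check_mmap_size(const struct sketch_vma *vma,
			   const struct sketch_mem *mem)
{
	if (vma->vm_end - vma->vm_start > mem->size)
		return -EINVAL;
	return 0;
}
```

Without such a check, a caller asking for more than mem->size bytes would get physical pages beyond the end of the region mapped into its address space.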


[PATCH 3.5 76/78] SUNRPC handle EKEYEXPIRED in call_refreshresult

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Andy Adamson 

commit eb96d5c97b0825d542e9c4ba5e0a22b519355166 upstream.

Currently, when an RPCSEC_GSS context has expired or is non-existent
and the user's (Kerberos) credentials have also expired or are non-existent,
the client receives the -EKEYEXPIRED error and tries to refresh the context
forever.  If an application is performing I/O, or other work against the share,
the application hangs, and the user is not prompted to refresh/establish their
credentials. This can result in a denial of service for other users.

Users are expected to manage their Kerberos credential lifetimes to mitigate
this issue.

Move the -EKEYEXPIRED handling into the RPC layer. Try to refresh the
gss_context tk_cred_retry times, and then return -EACCES to the application.

Signed-off-by: Andy Adamson 
Signed-off-by: Trond Myklebust 
Cc: Ben Hutchings 
[ luis: backported to 3.5 based on bwh backport to 3.2:
  - adjusted context
  - dropped changes to nfs4_handle_reclaim_lease_error() ]
Signed-off-by: Luis Henriques 
---
 fs/nfs/nfs3proc.c   |  6 +++---
 fs/nfs/nfs4filelayout.c |  1 -
 fs/nfs/nfs4proc.c   | 18 --
 fs/nfs/nfs4state.c  | 22 --
 fs/nfs/proc.c   | 43 ---
 net/sunrpc/clnt.c   |  1 +
 6 files changed, 4 insertions(+), 87 deletions(-)

diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index fda63e9..c7eb313 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -24,14 +24,14 @@
 
 #define NFSDBG_FACILITY NFSDBG_PROC
 
-/* A wrapper to handle the EJUKEBOX and EKEYEXPIRED error messages */
+/* A wrapper to handle the EJUKEBOX error messages */
 static int
 nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
 {
int res;
do {
res = rpc_call_sync(clnt, msg, flags);
-   if (res != -EJUKEBOX && res != -EKEYEXPIRED)
+   if (res != -EJUKEBOX)
break;
freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
res = -ERESTARTSYS;
@@ -44,7 +44,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message 
*msg, int flags)
 static int
 nfs3_async_handle_jukebox(struct rpc_task *task, struct inode *inode)
 {
-   if (task->tk_status != -EJUKEBOX && task->tk_status != -EKEYEXPIRED)
+   if (task->tk_status != -EJUKEBOX)
return 0;
if (task->tk_status == -EJUKEBOX)
nfs_inc_stats(inode, NFSIOS_DELAY);
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index e134029..8445359 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -169,7 +169,6 @@ static int filelayout_async_handle_error(struct rpc_task 
*task,
break;
case -NFS4ERR_DELAY:
case -NFS4ERR_GRACE:
-   case -EKEYEXPIRED:
rpc_delay(task, FILELAYOUT_POLL_RETRY_MAX);
break;
case -NFS4ERR_RETRY_UNCACHED_REP:
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 594ec86..a89661e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -341,7 +341,6 @@ static int nfs4_handle_exception(struct nfs_server *server, 
int errorcode, struc
}
case -NFS4ERR_GRACE:
case -NFS4ERR_DELAY:
-   case -EKEYEXPIRED:
ret = nfs4_delay(server->client, &exception->timeout);
if (ret != 0)
break;
@@ -1371,13 +1370,6 @@ int nfs4_open_delegation_recall(struct nfs_open_context 
*ctx, struct nfs4_state
nfs_inode_find_state_and_recover(state->inode,
stateid);
nfs4_schedule_stateid_recovery(server, state);
-   case -EKEYEXPIRED:
-   /*
-* User RPCSEC_GSS context has expired.
-* We cannot recover this stateid now, so
-* skip it and allow recovery thread to
-* proceed.
-*/
case -ENOMEM:
err = 0;
goto out;
@@ -3949,7 +3941,6 @@ nfs4_async_handle_error(struct rpc_task *task, const 
struct nfs_server *server,
case -NFS4ERR_DELAY:
nfs_inc_server_stats(server, NFSIOS_DELAY);
case -NFS4ERR_GRACE:
-   case -EKEYEXPIRED:
rpc_delay(task, NFS4_POLL_RETRY_MAX);
task->tk_status = 0;
return -EAGAIN;
@@ -4906,15 +4897,6 @@ int nfs4_lock_delegation_recall(struct nfs4_state 
*state, struct file_lock *fl)

[PATCH 3.5 69/78] netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Patrick McHardy 

commit 3a7b21eaf4fb3c971bdb47a98f570550ddfe4471 upstream.

Some Cisco phones create huge messages that are spread over multiple packets.
After calculating the offset of the SIP body, it is validated to be within
the packet and the packet is dropped otherwise. This breaks operation of
these phones. Since connection tracking is supposed to be passive, just let
those packets pass unmodified and untracked.

Signed-off-by: Patrick McHardy 
Signed-off-by: Pablo Neira Ayuso 
Cc: William Roberts 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 net/netfilter/nf_conntrack_sip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index 93faf6a..4a8c55b 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1468,7 +1468,7 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int 
protoff,
 
msglen = origlen = end - dptr;
if (msglen > datalen)
-   return NF_DROP;
+   return NF_ACCEPT;
 
ret = process_sip_msg(skb, ct, dataoff, &dptr, &datalen);
if (ret != NF_ACCEPT)
-- 
1.8.3.2



[PATCH 3.5 74/78] include/linux/fs.h: disable preempt when acquire i_size_seqcount write lock

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Fan Du 

commit 74e3d1e17b2e11d175970b85acd44f5927000ba2 upstream.

Two rt tasks are bound to one CPU core.

If the higher priority rt task preempts the lower priority rt task after
the latter has taken the write seq lock, and the higher priority task then
tries to acquire the read seq lock, it is doomed to lock up.

rt task A with lower priority: call write
i_size_write                                      rt task B with higher priority: call sync, and preempt task A
  write_seqcount_begin(&inode->i_size_seqcount);  i_size_read
  inode->i_size = i_size;                           read_seqcount_begin <-- lockup here...

So disabling preemption whenever the i_size_seqcount *write* lock is
acquired cures the problem.

Signed-off-by: Fan Du 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Cc: Zhao Hongjiang 
Signed-off-by: Luis Henriques 
---
 include/linux/fs.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 17fd887..65b8b69 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -925,9 +925,11 @@ static inline loff_t i_size_read(const struct inode *inode)
 static inline void i_size_write(struct inode *inode, loff_t i_size)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+   preempt_disable();
write_seqcount_begin(&inode->i_size_seqcount);
inode->i_size = i_size;
write_seqcount_end(&inode->i_size_seqcount);
+   preempt_enable();
 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
preempt_disable();
inode->i_size = i_size;
-- 
1.8.3.2
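
Why the reader can never make progress is easier to see with a stripped-down sequence counter. The following is a userspace sketch only: the sketch_* names are hypothetical, and the memory barriers and retry loop of the real seqcount API are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, barrier-free stand-in for a sequence counter. */
struct sketch_seqcount {
	unsigned int sequence;
};

/* Entering the write section makes the sequence odd. */
void sketch_write_begin(struct sketch_seqcount *s)
{
	s->sequence++;
}

/* Leaving the write section makes it even again. */
void sketch_write_end(struct sketch_seqcount *s)
{
	s->sequence++;
}

/*
 * A reader may only start while no write is in flight, that is, while
 * the sequence is even. While it is odd, the reader keeps spinning.
 */
bool sketch_reader_would_spin(const struct sketch_seqcount *s)
{
	return (s->sequence & 1) != 0;
}
```

If the writer is preempted between begin and end by a higher priority reader on the same CPU, the sequence stays odd and the reader spins forever, because the writer never gets to run again. Disabling preemption around the write section closes that window.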



[PATCH 3.5 60/78] perf/ftrace: Fix paranoid level for enabling function tracer

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Steven Rostedt 

commit 12ae030d54ef250706da5642fc7697cc60ad0df7 upstream.

The current default perf paranoid level is "1", at which
"perf_paranoid_kernel()" returns false, so any operation gated on it is
open to normal users. Unfortunately, this includes function tracing, and
normal users should not be allowed to enable function tracing by default.

The proper level is "-1" (full perf access), the only level at which
"perf_paranoid_tracepoint_raw()" returns false. Use that check instead
for enabling function tracing.

Reported-by: Dave Jones 
Reported-by: Vince Weaver 
Tested-by: Vince Weaver 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Jiri Olsa 
Cc: Frederic Weisbecker 
CVE: CVE-2013-2930
Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in 
perf")
Signed-off-by: Steven Rostedt 
Signed-off-by: Luis Henriques 
---
 kernel/trace/trace_event_perf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index fee3752..d01adb7 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -26,7 +26,7 @@ static int perf_trace_event_perm(struct ftrace_event_call 
*tp_event,
 {
/* The ftrace function trace is allowed only for root. */
if (ftrace_event_is_function(tp_event) &&
-   perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
+   perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
return -EPERM;
 
/* No tracing, just counting, so no obvious leak */
-- 
1.8.3.2
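
The effect of swapping the helper can be sketched in userspace C. This is a sketch under assumptions: the thresholds below follow the description in the changelog (perf_paranoid_kernel() is false at level 1, perf_paranoid_tracepoint_raw() is false only at -1), the level is passed as a parameter instead of being read from the sysctl, and the ftrace_denied_*() names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed thresholds, taken from the changelog's description. */
bool paranoid_tracepoint_raw(int level)
{
	return level > -1;
}

bool paranoid_kernel(int level)
{
	return level > 1;
}

/* Pre-patch permission test for the function trace event. */
bool ftrace_denied_old(int level, bool cap_sys_admin)
{
	return paranoid_kernel(level) && !cap_sys_admin;
}

/* Post-patch permission test. */
bool ftrace_denied_new(int level, bool cap_sys_admin)
{
	return paranoid_tracepoint_raw(level) && !cap_sys_admin;
}
```

At the default level 1, the old test lets an unprivileged user through while the new one returns true (denied); a task with CAP_SYS_ADMIN is allowed in both cases.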



[PATCH 3.5 72/78] ARM: 7670/1: fix the memset fix

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Nicolas Pitre 

commit 418df63adac56841ef6b0f1fcf435bc64d4ed177 upstream.

Commit 455bd4c430b0 ("ARM: 7668/1: fix memset-related crashes caused by
recent GCC (4.7.2) optimizations") attempted to fix a compliance issue
with the memset return value.  However the memset itself became broken
by that patch for misaligned pointers.

This fixes the above by branching over the entry code from the
misaligned fixup code to avoid reloading the original pointer.

Also, because the function entry alignment is wrong in the Thumb mode
compilation, that fixup code is moved to the end.

While at it, the entry instructions are slightly reworked to help dual
issue pipelines.

Signed-off-by: Nicolas Pitre 
Tested-by: Alexander Holler 
Signed-off-by: Russell King 
Cc: Eric Bénard 
Signed-off-by: Luis Henriques 
---
 arch/arm/lib/memset.S | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
index d912e73..94b0650 100644
--- a/arch/arm/lib/memset.S
+++ b/arch/arm/lib/memset.S
@@ -14,31 +14,15 @@
 
.text
.align  5
-   .word   0
-
-1: subs r2, r2, #4  @ 1 do we have enough
-   blt 5f  @ 1 bytes to align with?
-   cmp r3, #2  @ 1
-   strltb  r1, [ip], #1 @ 1
-   strleb  r1, [ip], #1 @ 1
-   strb r1, [ip], #1 @ 1
-   add r2, r2, r3  @ 1 (r2 = r2 - (4 - r3))
-/*
- * The pointer is now aligned and the length is adjusted.  Try doing the
- * memset again.
- */
 
 ENTRY(memset)
-/*
- * Preserve the contents of r0 for the return value.
- */
-   mov ip, r0
-   ands r3, ip, #3  @ 1 unaligned?
-   bne 1b  @ 1
+   ands r3, r0, #3  @ 1 unaligned?
+   mov ip, r0  @ preserve r0 as return value
+   bne 6f  @ 1
 /*
  * we know that the pointer in ip is aligned to a word boundary.
  */
-   orr r1, r1, r1, lsl #8
+1: orr r1, r1, r1, lsl #8
orr r1, r1, r1, lsl #16
mov r3, r1
cmp r2, #16
@@ -127,4 +111,13 @@ ENTRY(memset)
tst r2, #1
strneb  r1, [ip], #1
mov pc, lr
+
+6: subs r2, r2, #4  @ 1 do we have enough
+   blt 5b  @ 1 bytes to align with?
+   cmp r3, #2  @ 1
+   strltb  r1, [ip], #1 @ 1
+   strleb  r1, [ip], #1 @ 1
+   strb r1, [ip], #1 @ 1
+   add r2, r2, r3  @ 1 (r2 = r2 - (4 - r3))
+   b   1b
 ENDPROC(memset)
-- 
1.8.3.2
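
For readers less familiar with ARM assembly, two pieces of the routine translate directly into C. This is an illustrative sketch, not a replacement for the assembly: replicate_byte() mirrors the two orr instructions at label 1, head_bytes() mirrors the conditional byte stores in the fixup code at label 6, and both names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* C equivalent of: orr r1, r1, r1, lsl #8 ; orr r1, r1, r1, lsl #16 */
uint32_t replicate_byte(uint8_t c)
{
	uint32_t w = c;

	w |= w << 8;	/* 0x000000cc -> 0x0000cccc */
	w |= w << 16;	/* 0x0000cccc -> 0xcccccccc */
	return w;
}

/*
 * Number of single-byte stores the fixup code performs to reach a
 * word boundary: 4 - (p & 3) for a misaligned pointer, 0 otherwise.
 */
unsigned int head_bytes(uintptr_t p)
{
	unsigned int misalign = (unsigned int)(p & 3);

	return misalign ? 4 - misalign : 0;
}
```

The fix matters for the second helper's case: after the head bytes are written, the code has to continue with the advanced pointer in ip, which is why the fixup now branches to label 1 instead of falling back into the entry code that reloads ip from r0.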



[PATCH 3.5 78/78] crypto: ansi_cprng - Fix off by one error in non-block size request

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Neil Horman 

commit 714b33d15130cbb5ab426456d4e3de842d6c5b8a upstream.

Stephan Mueller recently reported to me an error in random number generation
in the ansi cprng. If several small requests are made that are each smaller
than the instance's block size, the remainder for-loop code doesn't increment
rand_data_valid in the last iteration, meaning that the last byte handed out
from the rand_data buffer gets reused on the subsequent smaller-than-a-block
request for random data.

The fix is pretty easy: just re-code the for loop to make sure that
rand_data_valid gets incremented appropriately.

Signed-off-by: Neil Horman 
Reported-by: Stephan Mueller 
CC: Stephan Mueller 
CC: Petr Matousek 
CC: Herbert Xu 
CC: "David S. Miller" 
Signed-off-by: Herbert Xu 
Signed-off-by: Luis Henriques 
---
 crypto/ansi_cprng.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/ansi_cprng.c b/crypto/ansi_cprng.c
index 6ddd99e..c21f761 100644
--- a/crypto/ansi_cprng.c
+++ b/crypto/ansi_cprng.c
@@ -230,11 +230,11 @@ remainder:
 */
if (byte_count < DEFAULT_BLK_SZ) {
 empty_rbuf:
-   for (; ctx->rand_data_valid < DEFAULT_BLK_SZ;
-   ctx->rand_data_valid++) {
+   while (ctx->rand_data_valid < DEFAULT_BLK_SZ) {
*ptr = ctx->rand_data[ctx->rand_data_valid];
ptr++;
byte_count--;
+   ctx->rand_data_valid++;
if (byte_count == 0)
goto done;
}
-- 
1.8.3.2
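
The off-by-one is easy to reproduce with the two loop shapes side by side. The following is a userspace sketch and not the kernel code: struct sketch_ctx, BLK_SZ and the drain_*() functions are hypothetical stand-ins, the early exit uses return where the kernel uses goto done, and the buffer is filled with 0..15 instead of random data so that reuse is visible.

```c
#include <assert.h>
#include <stddef.h>

#define BLK_SZ 16	/* stands in for DEFAULT_BLK_SZ */

struct sketch_ctx {
	unsigned char rand_data[BLK_SZ];
	unsigned int rand_data_valid;	/* index of the next unused byte */
};

/* Fill the block with 0..15 and mark all of it as unused. */
void sketch_fill(struct sketch_ctx *ctx)
{
	unsigned int i;

	for (i = 0; i < BLK_SZ; i++)
		ctx->rand_data[i] = (unsigned char)i;
	ctx->rand_data_valid = 0;
}

/* Pre-patch shape: leaving from the body skips the for-increment. */
size_t drain_old(struct sketch_ctx *ctx, unsigned char *ptr, size_t byte_count)
{
	size_t copied = 0;

	assert(byte_count > 0);
	for (; ctx->rand_data_valid < BLK_SZ; ctx->rand_data_valid++) {
		*ptr++ = ctx->rand_data[ctx->rand_data_valid];
		copied++;
		if (--byte_count == 0)
			return copied;	/* last byte is not marked as used */
	}
	return copied;
}

/* Post-patch shape: the index advances before the early exit. */
size_t drain_new(struct sketch_ctx *ctx, unsigned char *ptr, size_t byte_count)
{
	size_t copied = 0;

	assert(byte_count > 0);
	while (ctx->rand_data_valid < BLK_SZ) {
		*ptr++ = ctx->rand_data[ctx->rand_data_valid];
		copied++;
		ctx->rand_data_valid++;
		if (--byte_count == 0)
			return copied;
	}
	return copied;
}
```

Two consecutive 4-byte requests against the old shape hand out byte 3 twice (the last byte of the first request and the first byte of the second); with the new shape the second request starts at byte 4.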



[PATCH 3.5 65/78] vsprintf: check real user/group id for %pK

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ryan Mallon 

commit 312b4e226951f707e120b95b118cbc14f3d162b2 upstream.

Some setuid binaries will allow reading of files which have read
permission by the real user id.  This is problematic with files which
use %pK because the file access permission is checked at open() time,
but the kptr_restrict setting is checked at read() time.  If a setuid
binary opens a %pK file as an unprivileged user, and then elevates
permissions before reading the file, then kernel pointer values may be
leaked.

This happens for example with the setuid pppd application on Ubuntu 12.04:

  $ head -1 /proc/kallsyms
  00000000 T startup_32

  $ pppd file /proc/kallsyms
  pppd: In file /proc/kallsyms: unrecognized option 'c1000000'

This will only leak the pointer value from the first line, but other
setuid binaries may leak more information.

Fix this by checking that, in addition to the current process having
CAP_SYSLOG, the effective user and group ids are equal to the real ids.
If a setuid binary reads the contents of a file which uses %pK then the
pointer values will be printed as NULL if the real user is unprivileged.

Update the sysctl documentation to reflect the changes, and also correct
the documentation to state that kptr_restrict=0 is the default.

This is only a temporary solution to the issue.  The correct solution is
to do the permission check at open() time on files, and to replace %pK
with a function which checks the open() time permission.  %pK uses in
printk should be removed since no sane permission check can be done, and
instead protected by using dmesg_restrict.

Signed-off-by: Ryan Mallon 
Cc: Kees Cook 
Cc: Alexander Viro 
Cc: Joe Perches 
Cc: "Eric W. Biederman" 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 Documentation/sysctl/kernel.txt | 25 ++---
 lib/vsprintf.c  | 33 ++---
 2 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 6d78841..99d8ab9 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -284,13 +284,24 @@ Default value is "/sbin/hotplug".
 kptr_restrict:
 
 This toggle indicates whether restrictions are placed on
-exposing kernel addresses via /proc and other interfaces.  When
-kptr_restrict is set to (0), there are no restrictions.  When
-kptr_restrict is set to (1), the default, kernel pointers
-printed using the %pK format specifier will be replaced with 0's
-unless the user has CAP_SYSLOG.  When kptr_restrict is set to
-(2), kernel pointers printed using %pK will be replaced with 0's
-regardless of privileges.
+exposing kernel addresses via /proc and other interfaces.
+
+When kptr_restrict is set to (0), the default, there are no restrictions.
+
+When kptr_restrict is set to (1), kernel pointers printed using the %pK
+format specifier will be replaced with 0's unless the user has CAP_SYSLOG
+and effective user and group ids are equal to the real ids. This is
+because %pK checks are done at read() time rather than open() time, so
+if permissions are elevated between the open() and the read() (e.g via
+a setuid binary) then %pK will not leak kernel pointers to unprivileged
+users. Note, this is a temporary solution only. The correct long-term
+solution is to do the permission checks at open() time. Consider removing
+world read permissions from files that use %pK, and using dmesg_restrict
+to protect against uses of %pK in dmesg(8) if leaking kernel pointer
+values to unprivileged users is a concern.
+
+When kptr_restrict is set to (2), kernel pointers printed using
+%pK will be replaced with 0's regardless of privileges.
 
 ==
 
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 598a73e..b82f4ba 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include   /* for PAGE_SIZE */
@@ -1036,11 +1037,37 @@ char *pointer(const char *fmt, char *buf, char *end, 
void *ptr,
spec.field_width = default_width;
return string(buf, end, "pK-error", spec);
}
-   if (!((kptr_restrict == 0) ||
- (kptr_restrict == 1 &&
-  has_capability_noaudit(current, CAP_SYSLOG
+
+   switch (kptr_restrict) {
+   case 0:
+   /* Always print %pK values */
+   break;
+   case 1: {
+   /*
+* Only print the real pointer value if the current
+* process has CAP_SYSLOG and is running with the
+* same credentials it 

[PATCH 3.5 46/78] alarmtimer: return EINVAL instead of ENOTSUPP if rtcdev doesn't exist

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: KOSAKI Motohiro 

commit 98d6f4dd84a134d942827584a3c5f67ffd8ec35f upstream.

The Fedora Ruby maintainer reported that the latest Ruby doesn't work on
Fedora Rawhide on ARM. (http://bugs.ruby-lang.org/issues/9008)

This is because commit 1c6b39ad3f (alarmtimers: Return -ENOTSUPP if no
RTC device is present) made clock_get{time,res} return ENOTSUPP when
no RTC device can be found. However, this is incorrect.

First, ENOTSUPP isn't exported to userland (ENOTSUP or EOPNOTSUPP are the
closest userland equivalents).

Second, POSIX and the Linux man pages agree that clock_gettime and
clock_getres should return EINVAL if the clk_id argument is invalid.
While the argument could be made that the clockid is valid but just not
supported on this hardware, this is just a technicality that
doesn't help userspace applications, and only complicates error
handling.

Thus, this patch changes the code to use EINVAL.
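
As a hedged illustration of why this matters to userspace, a caller that
probes an alarm clock can only match on errno values it can name. The
helper below is a hypothetical sketch; KERNEL_ENOTSUPP is a local name
for the kernel-internal value 524, which has no userland errno constant.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal ENOTSUPP; not exported through userland <errno.h>. */
#define KERNEL_ENOTSUPP 524

/*
 * Decide from errno whether an alarm clock is unavailable.
 * EINVAL is what patched kernels report; the 524 case only exists to
 * cope with kernels that still leak the internal value.
 */
bool alarm_clock_unavailable(int err)
{
	return err == EINVAL || err == KERNEL_ENOTSUPP;
}
```

After this patch only the EINVAL branch should be needed, which is the
simplification in error handling the commit message argues for.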

Cc: Thomas Gleixner 
Cc: Frederic Weisbecker 
Reported-by: Vit Ondruch 
Signed-off-by: KOSAKI Motohiro 
[jstultz: Tweaks to commit message to include full rationale]
Signed-off-by: John Stultz 
Signed-off-by: Luis Henriques 
---
 kernel/time/alarmtimer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index aa27d39..2cfe9a5 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -474,7 +474,7 @@ static int alarm_clock_getres(const clockid_t which_clock, 
struct timespec *tp)
clockid_t baseid = alarm_bases[clock2alarm(which_clock)].base_clockid;
 
if (!alarmtimer_get_rtcdev())
-   return -ENOTSUPP;
+   return -EINVAL;
 
return hrtimer_get_res(baseid, tp);
 }
@@ -491,7 +491,7 @@ static int alarm_clock_get(clockid_t which_clock, struct 
timespec *tp)
struct alarm_base *base = &alarm_bases[clock2alarm(which_clock)];
 
if (!alarmtimer_get_rtcdev())
-   return -ENOTSUPP;
+   return -EINVAL;
 
*tp = ktime_to_timespec(base->gettime());
return 0;
-- 
1.8.3.2



[PATCH 3.5 61/78] ALSA: hda - Add support for CX20952

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Takashi Iwai 

commit 8f42d7698751a45cd9f7134a5da49bc5b6206179 upstream.

It's a superset of the existing CX2075x codecs, so we can reuse the
existing parser code.

Signed-off-by: Takashi Iwai 
Signed-off-by: Luis Henriques 
---
 sound/pci/hda/patch_conexant.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
index 5fb90c6..5a48081 100644
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -4594,6 +4594,8 @@ static const struct hda_codec_preset 
snd_hda_preset_conexant[] = {
  .patch = patch_conexant_auto },
{ .id = 0x14f15115, .name = "CX20757",
  .patch = patch_conexant_auto },
+   { .id = 0x14f151d7, .name = "CX20952",
+ .patch = patch_conexant_auto },
{} /* terminator */
 };
 
@@ -4620,6 +4622,7 @@ MODULE_ALIAS("snd-hda-codec-id:14f15111");
 MODULE_ALIAS("snd-hda-codec-id:14f15113");
 MODULE_ALIAS("snd-hda-codec-id:14f15114");
 MODULE_ALIAS("snd-hda-codec-id:14f15115");
+MODULE_ALIAS("snd-hda-codec-id:14f151d7");
 
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("Conexant HD-audio codec");
-- 
1.8.3.2



[PATCH 3.5 56/78] can: c_can: Fix RX message handling, handle lost message before EOB

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Markus Pargmann 

commit 5d0f801a2ccec3b1fdabc3392c8d99ed0413d216 upstream.

If we handle end of block messages with higher priority than a lost message,
we can run into an endless interrupt loop.

This is reproducible with an am335x processor and "cansequence -r" at 1Mbit.
As soon as we lose a packet we can't escape from an interrupt loop.

This patch fixes the problem by handling lost packets before EOB packets.
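
The reordering can be sketched as a small decision function. The bit
positions and names below are illustrative assumptions, not the driver's
definitions; only the order of the checks follows the patch.

```c
#include <assert.h>

/* Illustrative masks; the driver defines the real IF_MCONT_* values. */
#define MCONT_EOB	(1u << 7)
#define MCONT_MSGLST	(1u << 14)
#define MCONT_NEWDAT	(1u << 15)

enum rx_action { RX_HANDLE_LOST, RX_STOP, RX_SKIP, RX_READ };

/* Classify one message object in the order used after this patch. */
enum rx_action rx_classify(unsigned int msg_ctrl)
{
	if (msg_ctrl & MCONT_MSGLST)
		return RX_HANDLE_LOST;	/* lost message is handled first */
	if (msg_ctrl & MCONT_EOB)
		return RX_STOP;		/* end of block ends the poll */
	if (!(msg_ctrl & MCONT_NEWDAT))
		return RX_SKIP;		/* nothing new in this object */
	return RX_READ;
}
```

When both the lost-message and end-of-block bits are set, the lost
message is now handled instead of the poll returning early.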

Signed-off-by: Markus Pargmann 
Signed-off-by: Marc Kleine-Budde 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 drivers/net/can/c_can/c_can.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index 64647d4..91d1b5a 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -764,9 +764,6 @@ static int c_can_do_rx_poll(struct net_device *dev, int 
quota)
msg_ctrl_save = priv->read_reg(priv,
&priv->regs->ifregs[0].msg_cntrl);
 
-   if (msg_ctrl_save & IF_MCONT_EOB)
-   return num_rx_pkts;
-
if (msg_ctrl_save & IF_MCONT_MSGLST) {
c_can_handle_lost_msg_obj(dev, 0, msg_obj);
num_rx_pkts++;
@@ -774,6 +771,9 @@ static int c_can_do_rx_poll(struct net_device *dev, int 
quota)
continue;
}
 
+   if (msg_ctrl_save & IF_MCONT_EOB)
+   return num_rx_pkts;
+
if (!(msg_ctrl_save & IF_MCONT_NEWDAT))
continue;
 
-- 
1.8.3.2



[PATCH 3.5 45/78] rt2800usb: slow down TX status polling

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Stanislaw Gruszka 

commit 36165fd5b00bf8163f89c21bb16a3e9834555b10 upstream.

Polling TX statuses too frequently has two negative effects. The first is
random peaks in CPU usage, which delay overall system functioning.
The second is that the device is not able to fill TX statuses in the
H/W register on some workloads, and we get a lot of timeouts like below:

ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2
ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2
ieee80211 phy4: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping

This not only causes a flood of messages in dmesg, but also bad throughput,
since the rate scaling algorithm cannot work optimally.

In the future, we should probably make the polling interval adjust
automatically, but for now just increase the values; this makes the
mentioned problems go away.
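
For reference, ktime_set() takes seconds and nanoseconds, so the
intervals named in the comments of this patch translate as in the sketch
below. The helper names are hypothetical and only show the unit
arithmetic.

```c
#include <assert.h>

#define NSEC_PER_USEC	1000L
#define NSEC_PER_MSEC	1000000L

/* Convert microseconds to the nanosecond argument of ktime_set(). */
long usecs_to_nsecs(long usecs)
{
	return usecs * NSEC_PER_USEC;
}

/* Convert milliseconds to the nanosecond argument of ktime_set(). */
long msecs_to_nsecs(long msecs)
{
	return msecs * NSEC_PER_MSEC;
}
```

The old 250 us and 500 us delays are 250000 ns and 500000 ns; the new
1 ms and 2 ms delays are 1000000 ns and 2000000 ns, a fourfold increase
in both cases.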

Resolve:
https://bugzilla.kernel.org/show_bug.cgi?id=62781

Signed-off-by: Stanislaw Gruszka 
Signed-off-by: John W. Linville 
Signed-off-by: Luis Henriques 
---
 drivers/net/wireless/rt2x00/rt2800usb.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/rt2x00/rt2800usb.c 
b/drivers/net/wireless/rt2x00/rt2800usb.c
index fe42f76..c30797e 100644
--- a/drivers/net/wireless/rt2x00/rt2800usb.c
+++ b/drivers/net/wireless/rt2x00/rt2800usb.c
@@ -143,6 +143,8 @@ static bool rt2800usb_txstatus_timeout(struct rt2x00_dev 
*rt2x00dev)
return false;
 }
 
+#define TXSTATUS_READ_INTERVAL 1000000
+
 static bool rt2800usb_tx_sta_fifo_read_completed(struct rt2x00_dev *rt2x00dev,
 int urb_status, u32 tx_status)
 {
@@ -170,8 +172,9 @@ static bool rt2800usb_tx_sta_fifo_read_completed(struct rt2x00_dev *rt2x00dev,
queue_work(rt2x00dev->workqueue, &rt2x00dev->txdone_work);
 
if (rt2800usb_txstatus_pending(rt2x00dev)) {
-   /* Read register after 250 us */
-   hrtimer_start(&rt2x00dev->txstatus_timer, ktime_set(0, 250000),
+   /* Read register after 1 ms */
+   hrtimer_start(&rt2x00dev->txstatus_timer,
+ ktime_set(0, TXSTATUS_READ_INTERVAL),
  HRTIMER_MODE_REL);
return false;
}
@@ -196,8 +199,9 @@ static void rt2800usb_async_read_tx_status(struct rt2x00_dev *rt2x00dev)
if (test_and_set_bit(TX_STATUS_READING, &rt2x00dev->flags))
return;
 
-   /* Read TX_STA_FIFO register after 500 us */
-   hrtimer_start(&rt2x00dev->txstatus_timer, ktime_set(0, 500000),
+   /* Read TX_STA_FIFO register after 2 ms */
+   hrtimer_start(&rt2x00dev->txstatus_timer,
+ ktime_set(0, 2*TXSTATUS_READ_INTERVAL),
  HRTIMER_MODE_REL);
 }
 
-- 
1.8.3.2



[PATCH 3.5 34/78] scripts/kallsyms: filter symbols not in kernel address space

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Ming Lei 

commit f6537f2f0eba4eba3354e48dbe3047db6d8b6254 upstream.

This patch uses CONFIG_PAGE_OFFSET to filter out symbols which
are not in the kernel address space. These symbols generally
exist only for code generation purposes and can't be run in
kernel mode, so we needn't keep them in /proc/kallsyms.

For example, on ARM some symbols may be linked into a
relocatable code section, after which perf can no longer parse
symbols from /proc/kallsyms. This patch fixes the problem
(introduced by commit b9b32bf70f2fb710b07c94e13afbc729afe221da).
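
A standalone sketch of the filtering idea follows. The function names
are hypothetical; only the option string, the base-16 strtoull() parse
and the address comparison are taken from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_OFFSET_OPT "--page-offset="

/* Parse "--page-offset=<hex>"; returns false for any other option. */
bool parse_page_offset(const char *arg, unsigned long long *start)
{
	size_t len = strlen(PAGE_OFFSET_OPT);

	assert(arg != NULL && start != NULL);
	if (strncmp(arg, PAGE_OFFSET_OPT, len) != 0)
		return false;
	/* Base 16 also accepts an optional "0x" prefix. */
	*start = strtoull(arg + len, NULL, 16);
	return true;
}

/* A symbol is kept only if it lies at or above the kernel start address. */
bool symbol_in_kernel_space(unsigned long long addr, unsigned long long start)
{
	return addr >= start;
}
```

With a page offset of 0xC0000000, a symbol at 0x1000 would be dropped
while one at 0xC0008000 would be kept.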

Cc: Russell King 
Cc: linux-arm-ker...@lists.infradead.org
Cc: Michal Marek 
Signed-off-by: Ming Lei 
Signed-off-by: Rusty Russell 
[ luis: backported to 3.5: adjusted context ]
Signed-off-by: Luis Henriques 
---
 scripts/kallsyms.c  | 12 +++-
 scripts/link-vmlinux.sh |  2 ++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 487ac6f..9a11f9f 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -55,6 +55,7 @@ static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
 static char symbol_prefix_char = '\0';
+static unsigned long long kernel_start_addr = 0;
 
 int token_profit[0x10000];
 
@@ -65,7 +66,10 @@ unsigned char best_table_len[256];
 
 static void usage(void)
 {
-   fprintf(stderr, "Usage: kallsyms [--all-symbols] [--symbol-prefix=<prefix char>] < in.map > out.S\n");
+   fprintf(stderr, "Usage: kallsyms [--all-symbols] "
+   "[--symbol-prefix=<prefix char>] "
+   "[--page-offset=<CONFIG_PAGE_OFFSET>] "
+   "< in.map > out.S\n");
exit(1);
 }
 
@@ -194,6 +198,9 @@ static int symbol_valid(struct sym_entry *s)
int i;
int offset = 1;
 
+   if (s->addr < kernel_start_addr)
+   return 0;
+
/* skip prefix char */
if (symbol_prefix_char && *(s->sym + 1) == symbol_prefix_char)
offset++;
@@ -646,6 +653,9 @@ int main(int argc, char **argv)
if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\''))
p++;
symbol_prefix_char = *p;
+   } else if (strncmp(argv[i], "--page-offset=", 14) == 0) {
+   const char *p = &argv[i][14];
+   kernel_start_addr = strtoull(p, NULL, 16);
} else
usage();
}
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index cd9c6c6..7a9f7f9 100644
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -78,6 +78,8 @@ kallsyms()
kallsymopt=--all-symbols
fi
 
+   kallsymopt="${kallsymopt} --page-offset=$CONFIG_PAGE_OFFSET"
+
local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL}   \
  ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
 
-- 
1.8.3.2



[PATCH 3.5 42/78] usb: hub: Clear Port Reset Change during init/resume

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Julius Werner 

commit e92aee330837e4911553761490a8fb843f2053a6 upstream.

This patch adds the Port Reset Change flag to the set of bits that are
preemptively cleared on init/resume of a hub. In theory this bit should
never be set unexpectedly... in practice it can still happen if BIOS,
SMM or ACPI code plays around with USB devices without cleaning up
correctly. This is especially dangerous for XHCI root hubs, which don't
generate any more Port Status Change Events until all change bits are
cleared, so this is a good precaution to have (similar to how it's
already done for the Warm Port Reset Change flag).
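
The effect on the set of pending change bits can be sketched as below.
The mask names and values are illustrative assumptions modelled on the
port change field; the function is hypothetical and not part of the hub
driver.

```c
#include <assert.h>

/* Illustrative change-bit masks; assumed values, not copied from a header. */
#define PORT_C_ENABLE	0x0002u
#define PORT_C_RESET	0x0010u
#define PORT_C_BH_RESET	0x0020u

/* Change bits still pending after the bits in 'handled' are cleared. */
unsigned int pending_after_clear(unsigned int portchange, unsigned int handled)
{
	return portchange & ~handled;
}
```

If firmware leaves the reset-change bit set, a handler that does not
cover it leaves that bit pending forever; adding it to the handled set
brings the pending mask back to zero.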

Signed-off-by: Julius Werner 
Acked-by: Alan Stern 
Signed-off-by: Greg Kroah-Hartman 
[ luis: backported to 3.5:
  - adjusted context
  - replaced usb_clear_port_feature() by clear_port_feature() ]
Signed-off-by: Luis Henriques 
---
 drivers/usb/core/hub.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 86c7421..b5503b0 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1142,6 +1142,11 @@ static void hub_activate(struct usb_hub *hub, enum 
hub_activation_type type)
clear_port_feature(hub->hdev, port1,
USB_PORT_FEAT_C_ENABLE);
}
+   if (portchange & USB_PORT_STAT_C_RESET) {
+   need_debounce_delay = true;
+   clear_port_feature(hub->hdev, port1,
+   USB_PORT_FEAT_C_RESET);
+   }
if ((portchange & USB_PORT_STAT_C_BH_RESET) &&
hub_is_superspeed(hub->hdev)) {
need_debounce_delay = true;
-- 
1.8.3.2



[PATCH 3.5 16/78] md: Fix skipping recovery for read-only arrays.

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Lukasz Dorau 

commit 61e4947c99c4494336254ec540c50186d186150b upstream.

Since:
commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86
md: Allow devices to be re-added to a read-only array.

spares are activated on a read-only array. For the raid1 and raid10
personalities, this causes not-in-sync devices to be marked in-sync
without checking whether recovery has finished.

If a read-only array is degraded and one of its devices is not in-sync
(because the array has been only partially recovered), recovery will be skipped.

This patch adds a check that recovery has finished before marking a device
in-sync for the raid1 and raid10 personalities. For the raid5 personality,
such a condition is already present (at raid5.c:6029).

The bug was introduced in 3.10 and causes data corruption.
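
The added condition can be modelled in isolation as follows. The struct
and function are hypothetical simplifications; MAX_SECTOR stands in for
the kernel's MaxSector, and the flag bits are reduced to plain booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;
#define MAX_SECTOR (~(sector_t)0)	/* recovery_offset value meaning "fully recovered" */

/* Hypothetical reduction of the per-device state used by the check. */
struct rdev_model {
	sector_t recovery_offset;
	bool faulty;
	bool in_sync;
};

/* Mark the device in-sync only if recovery finished; true if state changed. */
bool spare_activate(struct rdev_model *rdev)
{
	if (rdev->recovery_offset != MAX_SECTOR || rdev->faulty || rdev->in_sync)
		return false;
	rdev->in_sync = true;
	return true;
}
```

A device with recovery_offset of 1024 stays not-in-sync in this model,
so recovery is not skipped for it.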

Signed-off-by: Pawel Baldysiak 
Signed-off-by: Lukasz Dorau 
Signed-off-by: NeilBrown 
Signed-off-by: Luis Henriques 
---
 drivers/md/raid1.c  | 1 +
 drivers/md/raid10.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index aa58c02..0d15abe 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1354,6 +1354,7 @@ static int raid1_spare_active(struct mddev *mddev)
}
}
if (rdev
+   && rdev->recovery_offset == MaxSector
&& !test_bit(Faulty, &rdev->flags)
&& !test_and_set_bit(In_sync, &rdev->flags)) {
count++;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 5ad042c..2070e9c 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1630,6 +1630,7 @@ static int raid10_spare_active(struct mddev *mddev)
}
sysfs_notify_dirent_safe(tmp->replacement->sysfs_state);
} else if (tmp->rdev
+  && tmp->rdev->recovery_offset == MaxSector
   && !test_bit(Faulty, &tmp->rdev->flags)
   && !test_and_set_bit(In_sync, &tmp->rdev->flags)) {
count++;
-- 
1.8.3.2



[PATCH 3.5 02/78] jfs: fix error path in ialloc

2013-11-25 Thread Luis Henriques
3.5.7.26 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Dave Kleikamp 

commit 8660998608cfa1077e560034db81885af8e1e885 upstream.

If insert_inode_locked() fails, we shouldn't be calling
unlock_new_inode().
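
The goto-based unwinding can be sketched with a small model. Everything
here is hypothetical: the enum, the trace struct and the function only
imitate the label order in the diff below, on the assumption that
fail_put releases the inode reference.

```c
#include <assert.h>

enum ialloc_fail { FAIL_NONE, FAIL_AT_INSERT, FAIL_LATER };

/* Records which cleanup steps ran. */
struct trace {
	int unlocked;	/* models unlock_new_inode() */
	int put;	/* models dropping the inode reference */
};

int ialloc_model(enum ialloc_fail fail, struct trace *t)
{
	int rc = 0;

	t->unlocked = t->put = 0;

	if (fail == FAIL_AT_INSERT) {	/* models insert_inode_locked() < 0 */
		rc = -1;
		goto fail_put;		/* insertion failed: skip the unlock */
	}
	if (fail == FAIL_LATER) {	/* a later step fails after a successful insert */
		rc = -1;
		goto fail_unlock;
	}
	return 0;

fail_unlock:
	t->unlocked = 1;
fail_put:
	t->put = 1;
	return rc;
}
```

Each failure jumps to the label that undoes only what has already been
done, which is why the insert failure must not pass through the unlock.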

Signed-off-by: Dave Kleikamp 
Tested-by: Michael L. Semon 
Signed-off-by: Luis Henriques 
---
 fs/jfs/jfs_inode.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/jfs/jfs_inode.c b/fs/jfs/jfs_inode.c
index c1a3e60..7f464c5 100644
--- a/fs/jfs/jfs_inode.c
+++ b/fs/jfs/jfs_inode.c
@@ -95,7 +95,7 @@ struct inode *ialloc(struct inode *parent, umode_t mode)
 
if (insert_inode_locked(inode) < 0) {
rc = -EINVAL;
-   goto fail_unlock;
+   goto fail_put;
}
 
inode_init_owner(inode, parent, mode);
@@ -156,7 +156,6 @@ struct inode *ialloc(struct inode *parent, umode_t mode)
 fail_drop:
dquot_drop(inode);
inode->i_flags |= S_NOQUOTA;
-fail_unlock:
clear_nlink(inode);
unlock_new_inode(inode);
 fail_put:
-- 
1.8.3.2



Re: [PATCH 3/3] dt: binding documentation for bq2415x charger

2013-11-25 Thread Pavel Machek
On Sun 2013-11-24 17:49:31, Sebastian Reichel wrote:
> Add devicetree binding documentation for bq2415x charger.
> 
> Signed-off-by: Sebastian Reichel 

Thanks!

> +- reg: integer, i2c address of the device
> +- ti,current-limit: integer, current limit in mA

Does this need to be "ti," specific? Most of the fields are likely to be
needed for other li-ion chargers.

"Specifies maximum current charger can pull from power supply"

?

> +- ti,charge-current: integer, charging current in mA

...why/how is it different from current-limit? Is the current-limit on
5V, while the charge-current relative to battery voltage... so that
charge-current can be > current-limit when battery voltage is low?

"Maximum current that will be supplied to the battery, as determined
by voltage on current sense resistor"?

> +- ti,weak-battery-voltage: integer, weak battery voltage threshold in mV

It would be good to explain what this threshold means. Voltage so low
that system needs to be shut down?

Hmm, it looks to me like it is "as long as battery is below this
voltage, fast charge is not started. Instead, slow 'precharge' is
performed."

> +- ti,battery-regulation-voltage: integer, battery regulation
>  voltage in mV

"Selects maximum voltage for charging."

> +- ti,termination-current: integer, termination current in mA

"When current in constant-voltage phase drops below this value, charge
is terminated".


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Avi Kivity
On Mon, Nov 25, 2013 at 8:11 AM, Xiao Guangrong
 wrote:
>
> On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti  wrote:



I'm not really following, but note that parent_pte predates EPT (and
the use of rcu in kvm), so all the complexity that is the result of
trying to pack as many list entries into a cache line can be dropped.
Most setups now would have exactly one list entry, which is handled
specially anyway.

Alternatively, the trick of storing multiple entries in one list entry
can be moved to generic code, it may be useful to others.


Re: [PATCH] powerpc/4xx: Fix warning in kilauea.dtb

2013-11-25 Thread Josh Boyer
On Mon, Nov 25, 2013 at 4:40 AM, Ian Campbell  wrote:
> Currently I see:
>   DTC arch/powerpc/boot/kilauea.dtb
> Warning (reg_format): "reg" property in /plb/ppc4xx-msi@C1000 has invalid 
> length (12 bytes) (#address-cells == 1, #size-cells == 1)
>
> It appears that unlike the other platforms handled by 3fb7933850fa
> "powerpc/4xx: Adding PCIe MSI support" this platform does not use 
> address-cells=2.
>
> Signed-off-by: Ian Campbell 
> Acked-by: Josh Boyer 
> Cc: Rupjyoti Sarmah 
> Cc: Tirumala R Marri 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: devicet...@vger.kernel.org (open list:OPEN FIRMWARE AND...)
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-kernel@vger.kernel.org
> ---
> Resending, this hasn't been picked up since June
> http://patchwork.ozlabs.org/patch/248234/

Ben, please pick this up.

josh

> ---
>  arch/powerpc/boot/dts/kilauea.dts |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/boot/dts/kilauea.dts 
> b/arch/powerpc/boot/dts/kilauea.dts
> index 1613d6e..5ba7f01 100644
> --- a/arch/powerpc/boot/dts/kilauea.dts
> +++ b/arch/powerpc/boot/dts/kilauea.dts
> @@ -406,7 +406,7 @@
>
> MSI: ppc4xx-msi@C1000 {
> compatible = "amcc,ppc4xx-msi", "ppc4xx-msi";
> -   reg = < 0x0 0xEF62 0x100>;
> +   reg = <0xEF62 0x100>;
> sdr-base = <0x4B0>;
> msi-data = <0x>;
> msi-mask = <0x>;
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clk: tegra: use pll_ref as the pll_e parent

2013-11-25 Thread Peter De Schrijver
On Fri, Nov 22, 2013 at 02:40:35PM +0100, Peter De Schrijver wrote:
> On Wed, Oct 30, 2013 at 01:41:29AM +0100, Peter De Schrijver wrote:
> > Use pll_ref instead of pll_re_vco as the pll_e parent on Tegra114 and
> > Tegra124. Also add a pll_ref table entry for pll_e for Tegra114.
> > 
> > Signed-off-by: Peter De Schrijver 
> 
> I will squash this into the next tegra-clk-next as the previous pull request
> has never made it.
> 

Looking again at this, I think the Tegra114 and generic PLL part of the patch
better stays as a separate patch. The Tegra124 bit will be squashed into the
Tegra124 support patch.

Cheers,

Peter.


Re: Fwd: [PATCH 7/8] watchdog: davinci: add "clocks" property

2013-11-25 Thread ivan.khoronzhuk

On 11/25/2013 02:15 PM, Mark Rutland wrote:
> On Mon, Nov 25, 2013 at 10:59:45AM +0000, ivan.khoronzhuk wrote:
>> On 11/23/2013 07:57 PM, Arnd Bergmann wrote:
>>> On Wednesday 06 November 2013, ivan.khoronzhuk wrote:
>>>> @@ -7,6 +7,10 @@ Required properties:
>>>>
>>>>    - reg : Should contain WDT registers location and length
>>>>
>>>> +- clocks: phandle reference to the controller clock.
>>>> + Required only for Keystone arch.
>>>> + See clock-bindings.txt
>>>> +
>>>>    Optional properties:
>>>>
>>>>    - timeout-sec: Contains the watchdog timeout in seconds
>>>
>>> I think it should really be listed under "Optional properties" and the
>>> reference to Keystone removed. Note how the binding would need
>>> to change otherwise if another platform started to use the clock, which
>>> is a little silly.
>>>
>>> Arnd
>>
>> Ok, I'll move clocks property under "Optional properties" and describe it
>> as following:
>>
>> Optional properties:
>> - timeout-sec : Contains the watchdog timeout in seconds
>> - clocks: phandle reference to the controller clock.
>>   Needed if platform uses clocks.
>>   See clock-bindings.txt
>
> Nit: clocks aren't just phandles, they have a clock-specifier too (which
> might be 0 cells).
>
> Otherwise this looks fine to me.
>
> Mark.

I will replace it with:
Optional properties:
- timeout-sec : Contains the watchdog timeout in seconds
- clocks: the clock feeding the watchdog timer.
  Needed if platform uses clocks.
  See clock-bindings.txt

Is it OK?

--
Regards,
Ivan Khoronzhuk


Re: [PATCH 2/3] bq2415x_charger: add DT support

2013-11-25 Thread Pavel Machek
On Sun 2013-11-24 17:49:30, Sebastian Reichel wrote:
> This adds DT support to the bq2415x driver.
> 
> Signed-off-by: Sebastian Reichel 

Reviewed-by: Pavel Machek 


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH 1/2] mfd: cros ec: spi: Use consistent function names

2013-11-25 Thread Lee Jones
Both patches applied, thanks.

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: WARNING: CPU: 0 PID: 1 at init/main.c:711 do_one_initcall()

2013-11-25 Thread Boaz Harrosh
On 11/22/2013 08:02 AM, Yuanhan Liu wrote:
> Greetings,
> 
> I got the below dmesg and the first bad commit is
> 
> commit 20545536cd8ea949c61527b6395ec8c0d2c237b1
> Author: Boaz Harrosh 
> Date:   Thu Jul 19 15:22:37 2012 +0300
> 
> RFC: do_xor_speed Broken on UML do to jiffies
> 

Hi Sir Yuanhan.

I understand that you are running the exofs_ioctl branch of linux-open-osd.git.
Please tell me more about why you chose to run this branch. It is an
experimental pNFS+Ganesha+exofs branch that we are working on around here.
It might have problems.

Yes, this patch has problems, I know. I have it in my tree because I need
it if I want to use the XOR engine with a UML system. If you do need to run
this branch *exofs_ioctl* on your system then it is best you revert this
patch.

Thanks for the report. I think I'll just remove that patch and run with it
locally.

Cheers
Boaz

> Remember that hang I reported a while back on UML. Well
> I'm at it again, and it still hangs and I found why.
> 
> I have dprinted jiffies and it never advances during the
> loop at do_xor_speed. Therefore it is stuck in an endless
> loop. I have also dprinted current_kernel_time() and it
> returns the same constant value as well.
> 
> Note that it does usually work on UML, only during
> the modprobe of xor.ko while that test is running. It looks
> like some locking is preventing the clock from ticking.
> 
> However ktime_get_ts does work for me so I changed the code
> as below, so I can work. See how I put several safety
> guards, to never get hangs again.
> And I think my time based approach is more accurate than
> the previous system.
> 
> UML guys please investigate the jiffies issue? what is
> xor.ko not doing right?
> 
> Signed-off-by: Boaz Harrosh 
> 
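> +------------------------------------------------------------------+----+
> |                                                                  |    |
> +------------------------------------------------------------------+----+
> | boot_successes                                                   | 0  |
> | boot_failures                                                    | 29 |
> | WARNING:CPU:PID:at_init/main.c:do_one_initcall()                 | 29 |
> | initcall_calibrate_xor_blocks_returned_with_preemption_imbalance | 29 |
> +------------------------------------------------------------------+----+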
> 
> [0.127025]generic_sse:   148.363 MB/sec
> [0.127478] xor: using function: prefetch64-sse (152.727 MB/sec)
> [0.128017] [ cut here ]
> [0.128531] WARNING: CPU: 0 PID: 1 at init/main.c:711 
> do_one_initcall+0x105/0x115()
> [0.129018] initcall calibrate_xor_blocks+0x0/0x144 returned with 
> preemption imbalance 
> [0.130013] Modules linked in:
> [0.130357] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 3.12.0-11285-gb242bff #91
> [0.131013]   88000d0dde00 8161acc5 
> 88000d0dde48
> [0.132554]  88000d0dde38 81052de9 81000316 
> 81a77cfd
> [0.133380]     
> 88000d0dde98
> [0.134213] Call Trace:
> [0.134493]  [] dump_stack+0x4e/0x7a
> [0.135017]  [] warn_slowpath_common+0x75/0x8e
> [0.135654]  [] ? do_one_initcall+0x105/0x115
> [0.136015]  [] ? do_xor_speed+0xdd/0xdd
> [0.137016]  [] warn_slowpath_fmt+0x47/0x49
> [0.137628]  [] ? free_pages+0x51/0x53
> [0.138015]  [] ? do_xor_speed+0xdd/0xdd
> [0.138623]  [] do_one_initcall+0x105/0x115
> [0.139017]  [] kernel_init_freeable+0x115/0x19b
> [0.140016]  [] ? do_early_param+0x88/0x88
> [0.140630]  [] ? rest_init+0xbd/0xbd
> [0.141016]  [] kernel_init+0x9/0xfa
> [0.141567]  [] ret_from_fork+0x7c/0xb0
> [0.142016]  [] ? rest_init+0xbd/0xbd
> [0.143028] ---[ end trace 19b4eab334350767 ]---
> [0.143530] atomic64 test passed for x86-64 platform with CX8 and with SSE
> 
> git bisect start b242bff548c34510fd9b7f0e29b885263dfb8903 
> 5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52 --
> git bisect good 5cbb3d216e2041700231bcfc383ee5f8b7fc8b74  # 09:25 20+ 
>  0  Merge branch 'akpm' (patches from Andrew Morton)
> git bisect good 7e1a1e9378018aeea2c7e8a3dd2ceb1db1523b0b  # 09:42 20+ 
>  0  Merge tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfs
> git bisect good 4937e2a6f939a41bf811378e80d71f68aa0950c6  # 10:08 20+ 
>  0  Merge branch 'for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
> git bisect good 210e812f036736aeda097d9a6ef84b1f2b334bae  # 10:31 20+ 
>  0  perf header: Fix bogus group name
> git bisect good d5bdaf4f68f0590fc481bca54bcaffeb27b75fca  # 10:54 20+ 
>  0  Merge branch 'for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
> git bisect good e630a6bcf18079b2ab6b03d55c9757e8ef6656b6  # 11:03 20+ 
>  0  staging: lustre: fix 

Re: [PATCH 1/3] power_supply: add power_supply_get_by_phandle

2013-11-25 Thread Pavel Machek
On Sun 2013-11-24 17:49:29, Sebastian Reichel wrote:
> Add method to get power supply by device tree phandle.
> 
> Signed-off-by: Sebastian Reichel 

Reviewed-by: Pavel Machek 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH] sysfs: handle duplicate removal attempts in sysfs_remove_group()

2013-11-25 Thread Rafael J. Wysocki
On Monday, November 25, 2013 10:29:00 AM James Bottomley wrote:
> On Fri, 2013-11-22 at 11:02 -0500, Tejun Heo wrote:
> > Hello,
> > 
> > On Fri, Nov 22, 2013 at 08:43:55AM -0700, Bjorn Helgaas wrote:
> > > > So, we do have cases where the parent is removed before the child.  I
> > > > suppose the parent pci bridge is removed already?  AFAICS this
> > > > shouldn't break anything but people did seem to expect the removals to
> > > > be ordered from child to parent.  Bjorn, is this something you expect
> > > > to happen?
> > > 
> > > I do not expect a PCI bridge to be removed before the devices below
> > > it.  We should be removing all the children before removing the parent
> > > bridge.
> > > 
> > > But is this related to PCI?  I don't see the connection yet.  I tried
> > 
> > I'm not sure.  It was from thunderbolt and nobody is reporting it on
> > other interconnects, so it could be.
> > 
> > > to look into this a bit (my notes are at
> > > https://bugzilla.kernel.org/show_bug.cgi?id=65281), but I haven't
> > > figured out the big-picture problem yet.
> > > 
> > > I don't have warm fuzzies that adding a "have we already removed this"
> > > check is the best resolution, but maybe that's just because I don't
> > > understand the problem.
> > 
> > Yeah, the whole thing is sorta pointless.  Just issuing removal and
> > continuing on should do, IMHO.
> 
> I'd go for that as well.  We have huge problems with the _del calls
> because visibility is strict hierarchy and it's not always easy to work
> out who's underneath us.
> 
> It's going to be really annoying when refcounting works perfectly for
> objects, so you can just do puts in any order, but you have to have
> _del() called to remove subordinate objects before their parent.

Well, that would be fine and dandy, but device_del() calls bus_remove_device()
which in turn runs device_release_driver() and the order in which *these*
things are done actually matters in general.

So yes, after we have released all of the drivers in question, we can do _del()
right before the final _put() in any order just fine.

Thanks,
Rafael



Re: [PATCH] Make the mtdblock read/write skip the bad nand sector

2013-11-25 Thread Ezequiel Garcia
David,

On Mon, Nov 25, 2013 at 12:16:10PM +0000, David Woodhouse wrote:
> On Mon, 2013-11-25 at 08:52 -0300, Ezequiel Garcia wrote:
> > 
> > Your understanding is correct: NAND *must* be erased explicitly in
> > userspace
> > before writing. However, keep in mind the following additional
> > constraints:
> > 
> > * Writing should be always performed using 'nandwrite',
> >   not tools such as 'cat' or 'dd'.
> > 
> > * An mtdblock shouldn't be used to access the NAND directly from
> >   userspace. AFAICS, the primary usage of mtdblock is to be able to
> >   mount JFFS2.
> 
> No. You don't need mtdblock to mount JFFS2 at all.
> 
> The mtdblock driver was used in the *very* early days of the MTD system,
> on NOR flash with a "traditional" file system. Either in read-only mode
> for something like cramfs, or in a very unsafe writeable mode. We
> actually put ext2 on it for the Compaq iPaq for a while, before we had
> JFFS.
> 
> It was used as a shortcut for mounting JFFS2, and still is by a lot of
> people, but it's certainly not necessary. You can turn off CONFIG_BLOCK
> entirely and still use JFFS2.
> 
> You should consider mtdblock to be the most basic, primitive, "flash
> translation layer" that can possibly exist. And thus, you should basically
> never use it. I certainly don't approve of trying to extend it.
> 

Thanks a lot for the insight. After reading this, I'm wondering what's
preventing us from killing MTD block support altogether. Artem already
suggested it a while back...
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com


Re: Preventing IPI sending races in arch code

2013-11-25 Thread Peter Zijlstra
On Mon, Nov 25, 2013 at 05:00:18PM +0530, Vineet Gupta wrote:
> While we are at it, I wanted to confirm another potential race
> (ARC/blackfin..)
> The IPI handler clears the interrupt before atomically-read-n-clear the msg
> word.
> 
> do_IPI
>plat_smp_ops.ipi_clear(irq);
>    while ((pending = xchg(&ipi_data->bits, 0)) != 0)
>   find_next_bit()
>   switch(next-msg)
> 
> Depending on arch this could lead to an immediate IPI interrupt, and again
> ipi_data->bits could get out of sync with IPI senders.

I'm obviously lacking in platform knowledge here, what does that
ipi_clear() actually do? Tell the platform the interrupt has arrived and
it can stop asserting the line?

So sure, then someone can again assert the interrupt, but given we just
established a protocol for raising the thing; namely something like
this:

void arch_send_ipi(int cpu, int type)
{
  u32 *pending_ptr = per_cpu_ptr(ipi_bits, cpu);
  u32 new, old;

  do {
    new = old = *pending_ptr;
    new |= 1U << type;
  } while (cmpxchg(pending_ptr, old, new) != old);

  if (!old) /* only raise the actual IPI if we set the first bit */
    raise_ipi(cpu);
}

Who would re-assert it if we have !0 pending?

Also, the above can be thought of as a memory ordering issue:

  STORE pending
  MB /* implied by cmpxchg */
  STORE ipi /* raise the actual thing */

In that case the other end must be:

  LOAD ipi
  MB /* implied by xchg */
  LOAD pending

Which is what your code seems to do.
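
The send/receive protocol above can be modelled in userspace with C11 atomics. This is only a sketch under stated assumptions: `ipi_pending`, `ipi_raised`, `send_ipi()` and `handle_ipi()` are hypothetical names, a single pending word stands in for the per-CPU one, and `atomic_fetch_or()` replaces the `cmpxchg()` loop (both give the old value plus a full barrier).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pending word for one CPU; one bit per message type. */
static _Atomic uint32_t ipi_pending;

/* Counts how often the simulated interrupt line was raised. */
static unsigned int ipi_raised;

/*
 * Sender: publish the message bit, then raise the IPI only when the
 * word went from zero to non-zero. Returns true if the IPI was raised.
 */
static bool send_ipi(unsigned int type)
{
	uint32_t old;

	assert(type < 32);
	/* Sequentially consistent RMW: STORE pending, then the barrier. */
	old = atomic_fetch_or(&ipi_pending, UINT32_C(1) << type);
	if (old == 0) {
		ipi_raised++;	/* stand-in for raise_ipi(cpu) */
		return true;
	}
	return false;
}

/* Receiver: take every pending bit in a single atomic exchange. */
static uint32_t handle_ipi(void)
{
	return atomic_exchange(&ipi_pending, 0);
}
```

With this shape a second sender that finds a non-zero word does not raise the interrupt again, which is why the receiver has to consume the whole word once the interrupt arrives.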

> IMO the while loop is
> completely useless especially if IPIs are not coalesced in h/w.

Agreed, the while loop seems superfluous.

> And we need to move
> the xchg ahead of ACK'ing the IPI
> 
> do_IPI
>    pending = xchg(&ipi_data->bits, 0);
>plat_smp_ops.ipi_clear(irq);
>while (ffs)
>   switch(next-msg)
>   ...
> 
> Does that look sane to you?

This I'm not at all certain of; continuing with the memory order analogy
this would allow for the case where we see 0 pending, set a bit, try and
raise the interrupt but then do not because it's already asserted.

And since you just removed the while() loop, we'll be left with a !0
pending vector and nobody processing it.


[PATCH 2/2] mfd: cros ec: i2c: Use consistent function names

2013-11-25 Thread Thierry Reding
Rename cros_ec_{probe,remove}_i2c() to cros_ec_i2c_{probe,remove}() for
consistency.

Signed-off-by: Thierry Reding 
---
 drivers/mfd/cros_ec_i2c.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mfd/cros_ec_i2c.c b/drivers/mfd/cros_ec_i2c.c
index 123044608b63..4f71be99a183 100644
--- a/drivers/mfd/cros_ec_i2c.c
+++ b/drivers/mfd/cros_ec_i2c.c
@@ -120,7 +120,7 @@ static int cros_ec_command_xfer(struct cros_ec_device *ec_dev,
return ret;
 }
 
-static int cros_ec_probe_i2c(struct i2c_client *client,
+static int cros_ec_i2c_probe(struct i2c_client *client,
 const struct i2c_device_id *dev_id)
 {
struct device *dev = &client->dev;
@@ -150,7 +150,7 @@ static int cros_ec_probe_i2c(struct i2c_client *client,
return 0;
 }
 
-static int cros_ec_remove_i2c(struct i2c_client *client)
+static int cros_ec_i2c_remove(struct i2c_client *client)
 {
struct cros_ec_device *ec_dev = i2c_get_clientdata(client);
 
@@ -190,8 +190,8 @@ static struct i2c_driver cros_ec_driver = {
.owner  = THIS_MODULE,
.pm = &cros_ec_i2c_pm_ops,
},
-   .probe  = cros_ec_probe_i2c,
-   .remove = cros_ec_remove_i2c,
+   .probe  = cros_ec_i2c_probe,
+   .remove = cros_ec_i2c_remove,
.id_table   = cros_ec_i2c_id,
 };
 
-- 
1.8.4.2



[PATCH 1/2] mfd: cros ec: spi: Use consistent function names

2013-11-25 Thread Thierry Reding
Rename cros_ec_{probe,remove}_spi() to cros_ec_spi_{probe,remove}() for
consistency.

Signed-off-by: Thierry Reding 
---
 drivers/mfd/cros_ec_spi.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 27be73523b9c..5658ec48838f 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -284,7 +284,7 @@ static int cros_ec_command_spi_xfer(struct cros_ec_device *ec_dev,
return 0;
 }
 
-static int cros_ec_probe_spi(struct spi_device *spi)
+static int cros_ec_spi_probe(struct spi_device *spi)
 {
struct device *dev = &spi->dev;
struct cros_ec_device *ec_dev;
@@ -326,7 +326,7 @@ static int cros_ec_probe_spi(struct spi_device *spi)
return 0;
 }
 
-static int cros_ec_remove_spi(struct spi_device *spi)
+static int cros_ec_spi_remove(struct spi_device *spi)
 {
struct cros_ec_device *ec_dev;
 
@@ -367,8 +367,8 @@ static struct spi_driver cros_ec_driver_spi = {
.owner  = THIS_MODULE,
.pm = &cros_ec_spi_pm_ops,
},
-   .probe  = cros_ec_probe_spi,
-   .remove = cros_ec_remove_spi,
+   .probe  = cros_ec_spi_probe,
+   .remove = cros_ec_spi_remove,
.id_table   = cros_ec_spi_id,
 };
 
-- 
1.8.4.2



Re: [PATCH] Make the mtdblock read/write skip the bad nand sector

2013-11-25 Thread David Woodhouse
On Mon, 2013-11-25 at 08:52 -0300, Ezequiel Garcia wrote:
> 
> Your understanding is correct: NAND *must* be erased explicitly in
> userspace
> before writing. However, keep in mind the following additional
> constraints:
> 
> * Writing should be always performed using 'nandwrite',
>   not tools such as 'cat' or 'dd'.
> 
> * An mtdblock shouldn't be used to access the NAND directly from
>   userspace. AFAICS, the primary usage of mtdblock is to be able to
>   mount JFFS2.

No. You don't need mtdblock to mount JFFS2 at all.

The mtdblock driver was used in the *very* early days of the MTD system,
on NOR flash with a "traditional" file system. Either in read-only mode
for something like cramfs, or in a very unsafe writeable mode. We
actually put ext2 on it for the Compaq iPaq for a while, before we had
JFFS.

It was used as a shortcut for mounting JFFS2, and still is by a lot of
people, but it's certainly not necessary. You can turn off CONFIG_BLOCK
entirely and still use JFFS2.

You should consider mtdblock to be the most basic, primitive, "flash
translation layer" that can possibly exist. And thus, you should basically
never use it. I certainly don't approve of trying to extend it.

-- 
dwmw2


Re: Fwd: [PATCH 7/8] watchdog: davinci: add "clocks" property

2013-11-25 Thread Mark Rutland
On Mon, Nov 25, 2013 at 10:59:45AM +0000, ivan.khoronzhuk wrote:
> On 11/23/2013 07:57 PM, Arnd Bergmann wrote:
> > On Wednesday 06 November 2013, ivan.khoronzhuk wrote:
> >> @@ -7,6 +7,10 @@ Required properties:
> >>   
> >>   - reg : Should contain WDT registers location and length
> >>   
> >> +- clocks: phandle reference to the controller clock.
> >> + Required only for Keystone arch.
> >> + See clock-bindings.txt
> >> +
> >>   Optional properties:
> >>   
> >>   - timeout-sec: Contains the watchdog timeout in seconds
> > 
> > I think it should really be listed under "Optional properties" and the
> > reference to Keystone removed. Note how the binding would need
> > to change otherwise if another platform started to use the clock, which
> > is a little silly.
> > 
> > Arnd
> > 
> 
> Ok, I'll move clocks property under "Optional properties" and describe it
> as following:
> 
> Optional properties:
> - timeout-sec : Contains the watchdog timeout in seconds
> - clocks: phandle reference to the controller clock.
> Needed if platform uses clocks.
> See clock-bindings.txt

Nit: clocks aren't just phandles, they have a clock-specifier too (which
might be 0 cells).

Otherwise this looks fine to me.

Mark.


Re: 答复: Re: [PATCH 1/1] workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY

2013-11-25 Thread Oleg Nesterov
On 11/25, zhang.y...@zte.com.cn wrote:
>
> hte...@gmail.com wrote on 2013/11/23 07:13:43:
>
> >
> > Re: [PATCH 1/1] workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY
> >
> > On Thu, Nov 14, 2013 at 6:56 AM, Oleg Nesterov  wrote:
> > > Move the setting of PF_NO_SETAFFINITY up before set_cpus_allowed()
> > > in create_worker(). Otherwise userland can change ->cpus_allowed
> > > in between.
> > >
> > > Signed-off-by: Oleg Nesterov 
> >
> > Applied to wq/for-3.13-fixes.
> >
> > Thanks!
> >
> > --
> > tejun
>
> How about the first patch?

Let me quote myself:

Looks like Zhang is right... But I'd suggest changing flush_old_exec()
instead (see "current->flags &= ...").

Do you agree? If yes, could you make v2? flush_old_exec() already clears
the unwanted PF_ bits, I do not think we should add another place.
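
As a rough illustration of clearing the unwanted bits in one place, here is a userspace sketch. It is not the kernel's code: `struct task`, `clear_kernel_only_flags()` and `PF_EXAMPLE_KEEP` are hypothetical, and the flag values are illustrative stand-ins for the real `PF_*` constants.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values; the real PF_* constants live in the kernel headers. */
#define PF_KTHREAD		0x00200000u
#define PF_NO_SETAFFINITY	0x04000000u
#define PF_EXAMPLE_KEEP		0x00000100u	/* hypothetical flag that survives exec */

/* Hypothetical stand-in for struct task_struct. */
struct task {
	unsigned int flags;
};

/*
 * Clear, in a single statement, the flags that a user task must not
 * inherit across exec. Adding one more bit to the mask is the whole change.
 */
static void clear_kernel_only_flags(struct task *t)
{
	assert(t != NULL);
	t->flags &= ~(PF_KTHREAD | PF_NO_SETAFFINITY);
}
```

Keeping the mask in one statement means a new unwanted flag only has to be added there, which is the argument against introducing a second clearing site.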

Oleg.



[PATCH v11 00/15] kmemcg shrinkers

2013-11-25 Thread Vladimir Davydov
This patchset implements targeted shrinking for memcg when kmem limits are
present. So far, we've been accounting kernel objects but failing allocations
when short of memory. This is because our only option would be to call the
global shrinker, depleting objects from all caches and breaking isolation.

The main idea is to associate per-memcg lists with each of the LRUs. The main
LRU still provides a single entry point and when adding or removing an element
from the LRU, we use the page information to figure out which memcg it belongs
to and relay it to the right list.
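
A minimal userspace sketch of that idea follows. Everything in it is an assumption made for illustration: the types `struct item`, `struct simple_lru` and `struct memcg_lru`, the fixed `MAX_MEMCGS` bound, and the `memcg_id` stored directly in the item (the patchset derives the memcg from page information instead).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MEMCGS 4	/* hypothetical fixed bound, for the sketch only */

struct item {
	struct item *next;
	int memcg_id;	/* -1: not charged to any memcg, goes to the global list */
};

struct simple_lru {
	struct item *head;
	unsigned long nr_items;
};

/* One global list plus one list per kmem-active memcg. */
struct memcg_lru {
	struct simple_lru global;
	struct simple_lru per_memcg[MAX_MEMCGS];
};

/* Map a memcg id to the list that holds its objects. */
static struct simple_lru *lru_for(struct memcg_lru *lru, int memcg_id)
{
	if (memcg_id < 0)
		return &lru->global;
	assert(memcg_id < MAX_MEMCGS);
	return &lru->per_memcg[memcg_id];
}

/* Single entry point: callers never pick a list themselves. */
static void memcg_lru_add(struct memcg_lru *lru, struct item *it)
{
	struct simple_lru *l = lru_for(lru, it->memcg_id);

	it->next = l->head;
	l->head = it;
	l->nr_items++;
}

/* Number of objects visible to reclaim targeted at one memcg. */
static unsigned long memcg_lru_count(struct memcg_lru *lru, int memcg_id)
{
	return lru_for(lru, memcg_id)->nr_items;
}
```

Because the routing happens inside the add path, targeted reclaim can walk only the list of the memcg under pressure and leave other groups' objects alone.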

The bulk of the code is written by Glauber Costa. The only change I introduced
myself in this iteration is reworking per-memcg LRU lists. Instead of extending
the existing list_lru structure, which seems to be neat as is, I introduced a
new one, memcg_list_lru, which aggregates list_lru objects for each kmem-active
memcg and keeps them up to date as memcgs are created/destroyed. I hope this
simplified call paths and made the code easier to read.

The patchset is based on top of Linux 3.13-rc1.

Any comments and proposals are appreciated.

== Known issues ==

 * In case kmem limit is less than sum mem limit, reaching memcg kmem limit
   will result in an attempt to shrink all memcg slabs (see
   try_to_free_mem_cgroup_kmem()). Although this is better than simply failing
   allocation as it works now, it is still to be improved.

 * Since FS shrinkers can't be executed on __GFP_FS allocations, such
   allocations will fail if memcg kmem limit is less than sum mem limit and the
   memcg kmem usage is close to its limit. Glauber proposed to schedule a
   worker which would shrink kmem in the background on such allocations.
   However, this approach does not eliminate failures completely, it just makes
   them rarer. I'm thinking of implementing soft limits for memcg kmem so that
   striking the soft limit will trigger the reclaimer, but won't fail the
   allocation. I would appreciate any other proposals on how this can be fixed.

 * Only dcache and icache are reclaimed on memcg pressure. Other FS objects are
   left for global pressure only. However, it should not be a serious problem
   to make them reclaimable too by passing on memcg to the FS-layer and letting
   each FS decide if its internal objects are shrinkable on memcg pressure.

== Changes from v10 ==

 * Rework per-memcg list_lru infrastructure.

Previous iteration (with full changelog) can be found here:

http://www.spinics.net/lists/linux-fsdevel/msg66632.html

Glauber Costa (12):
  memcg: make cache index determination more robust
  memcg: consolidate callers of memcg_cache_id
  vmscan: also shrink slab in memcg pressure
  memcg: move initialization to memcg creation
  memcg: move stop and resume accounting functions
  memcg: per-memcg kmem shrinking
  memcg: scan cache objects hierarchically
  vmscan: take at least one pass with shrinkers
  memcg: allow kmem limit to be resized down
  vmpressure: in-kernel notifications
  memcg: reap dead memcgs upon global memory pressure
  memcg: flush memcg items upon memcg destruction

Vladimir Davydov (3):
  memcg,list_lru: add per-memcg LRU list infrastructure
  memcg,list_lru: add function walking over all lists of a per-memcg
LRU
  super: make icache, dcache shrinkers memcg-aware

 fs/dcache.c                |   25 +-
 fs/inode.c                 |   16 +-
 fs/internal.h              |    9 +-
 fs/super.c                 |   45 +--
 include/linux/fs.h         |    4 +-
 include/linux/list_lru.h   |   77 +
 include/linux/memcontrol.h |   23 ++
 include/linux/shrinker.h   |    6 +-
 include/linux/swap.h       |    2 +
 include/linux/vmpressure.h |    5 +
 mm/memcontrol.c            |  709 +++-
 mm/vmpressure.c            |   53 +++-
 mm/vmscan.c                |  178 +--
 13 files changed, 1018 insertions(+), 134 deletions(-)

-- 
1.7.10.4



[PATCH v11 04/15] memcg: move initialization to memcg creation

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

Those structures are only used for memcgs that are effectively using
kmemcg. However, in a later patch I intend to scan that list
unconditionally (list empty meaning no kmem caches present), which
simplifies the code a lot.

So move the initialization to early kmem creation.

Signed-off-by: Glauber Costa 
Cc: Dave Chinner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 mm/memcontrol.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8924ff1..9ba9975 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3136,8 +3136,6 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
}
 
memcg->kmemcg_id = num;
-   INIT_LIST_HEAD(&memcg->memcg_slab_caches);
-   mutex_init(&memcg->slab_caches_mutex);
return 0;
 }
 
@@ -5923,6 +5921,8 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
int ret;
 
+   INIT_LIST_HEAD(&memcg->memcg_slab_caches);
+   mutex_init(&memcg->slab_caches_mutex);
memcg->kmemcg_id = -1;
ret = memcg_propagate_kmem(memcg);
if (ret)
-- 
1.7.10.4



[PATCH v11 03/15] vmscan: also shrink slab in memcg pressure

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

Without the surrounding infrastructure, this patch is a bit of a hammer:
it will basically shrink objects from all memcgs under memcg pressure.
At least, however, we will keep the scan limited to the shrinkers marked
as per-memcg.

Future patches will implement the in-shrinker logic to filter objects
based on its memcg association.
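
The filtering described above (only shrinkers marked per-memcg run under memcg pressure) can be sketched in plain C. The flag values are the ones this patch defines; `struct shrinker_stub`, `shrinker_eligible()` and `count_eligible()` are hypothetical helpers, and the boolean `targeted` models a non-NULL `target_mem_cgroup`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SHRINKER_NUMA_AWARE	(1 << 0)
#define SHRINKER_MEMCG_AWARE	(1 << 1)

/* Reduced, hypothetical shrinker: only the flags matter for this sketch. */
struct shrinker_stub {
	unsigned long flags;
};

/*
 * Decide whether a shrinker takes part in a reclaim pass.
 * Global reclaim runs every shrinker; targeted reclaim skips the
 * ones that are not memcg aware.
 */
static bool shrinker_eligible(const struct shrinker_stub *s, bool targeted)
{
	if (targeted && !(s->flags & SHRINKER_MEMCG_AWARE))
		return false;
	return true;
}

/* Count the shrinkers in an array that would run for this pass. */
static size_t count_eligible(const struct shrinker_stub *v, size_t n,
			     bool targeted)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (shrinker_eligible(&v[i], targeted))
			hits++;
	return hits;
}
```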

Signed-off-by: Glauber Costa 
Cc: Dave Chinner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 include/linux/memcontrol.h |   17 +++
 include/linux/shrinker.h   |    6 +-
 mm/memcontrol.c            |   16 +-
 mm/vmscan.c                |   50 +++-
 4 files changed, 82 insertions(+), 7 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index b3e7a66..d16ba51 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -231,6 +231,9 @@ void mem_cgroup_split_huge_fixup(struct page *head);
 bool mem_cgroup_bad_page_check(struct page *page);
 void mem_cgroup_print_bad_page(struct page *page);
 #endif
+
+unsigned long
+memcg_zone_reclaimable_pages(struct mem_cgroup *memcg, struct zone *zone);
 #else /* CONFIG_MEMCG */
 struct mem_cgroup;
 
@@ -427,6 +430,12 @@ static inline void mem_cgroup_replace_page_cache(struct page *oldpage,
struct page *newpage)
 {
 }
+
+static inline unsigned long
+memcg_zone_reclaimable_pages(struct mem_cgroup *memcg, struct zone *zone)
+{
+   return 0;
+}
 #endif /* CONFIG_MEMCG */
 
 #if !defined(CONFIG_MEMCG) || !defined(CONFIG_DEBUG_VM)
@@ -479,6 +488,8 @@ static inline bool memcg_kmem_enabled(void)
return static_key_false(&memcg_kmem_enabled_key);
 }
 
+bool memcg_kmem_is_active(struct mem_cgroup *memcg);
+
 /*
  * In general, we'll do everything in our power to not incur in any overhead
  * for non-memcg users for the kmem functions. Not even a function call, if we
@@ -612,6 +623,12 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
return __memcg_kmem_get_cache(cachep, gfp);
 }
 #else
+
+static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
+{
+   return false;
+}
+
 #define for_each_memcg_cache_index(_idx)   \
for (; NULL; )
 
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 68c0970..7d462b1 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -22,6 +22,9 @@ struct shrink_control {
nodemask_t nodes_to_scan;
/* current node being shrunk (for NUMA aware shrinkers) */
int nid;
+
+   /* reclaim from this memcg only (if not NULL) */
+   struct mem_cgroup *target_mem_cgroup;
 };
 
 #define SHRINK_STOP (~0UL)
@@ -63,7 +66,8 @@ struct shrinker {
 #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
 
 /* Flags */
-#define SHRINKER_NUMA_AWARE (1 << 0)
+#define SHRINKER_NUMA_AWARE    (1 << 0)
+#define SHRINKER_MEMCG_AWARE   (1 << 1)
 
 extern int register_shrinker(struct shrinker *);
 extern void unregister_shrinker(struct shrinker *);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 144cb4c..8924ff1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -358,7 +358,7 @@ static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
 }
 
-static bool memcg_kmem_is_active(struct mem_cgroup *memcg)
+bool memcg_kmem_is_active(struct mem_cgroup *memcg)
 {
return test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
 }
@@ -958,6 +958,20 @@ mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *memcg, int nid, int zid,
return ret;
 }
 
+unsigned long
+memcg_zone_reclaimable_pages(struct mem_cgroup *memcg, struct zone *zone)
+{
+   int nid = zone_to_nid(zone);
+   int zid = zone_idx(zone);
+   unsigned long val;
+
+   val = mem_cgroup_zone_nr_lru_pages(memcg, nid, zid, LRU_ALL_FILE);
+   if (do_swap_account)
+   val += mem_cgroup_zone_nr_lru_pages(memcg, nid, zid,
+   LRU_ALL_ANON);
+   return val;
+}
+
 static unsigned long
 mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
int nid, unsigned int lru_mask)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eea668d..652dfa3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -140,11 +140,41 @@ static bool global_reclaim(struct scan_control *sc)
 {
return !sc->target_mem_cgroup;
 }
+
+/*
+ * kmem reclaim should usually not be triggered when we are doing targetted
+ * reclaim. It is only valid when global reclaim is triggered, or when the
+ * underlying memcg has kmem objects.
+ */
+static bool has_kmem_reclaim(struct scan_control *sc)
+{
+   return !sc->target_mem_cgroup ||
+   memcg_kmem_is_active(sc->target_mem_cgroup);
+}
+
+static unsigned long
+zone_nr_reclaimable_pages(struct scan_control *sc, struct zone *zone)
+{

[PATCH v11 07/15] memcg: scan cache objects hierarchically

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

When reaching shrink_slab, we should descend into children memcgs searching
for objects that could be shrunk. This is true even if the memcg does
not have kmem limits on, since the kmem res_counter will also be billed
against the user res_counter of the parent.
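
A simplified model of that descent is sketched below, assuming a hypothetical `struct group` with first-child/next-sibling links. The patch itself iterates with `for_each_mem_cgroup_tree()`; plain recursion is swapped in here only to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup. */
struct group {
	bool kmem_active;
	struct group *first_child;
	struct group *next_sibling;
};

/*
 * Return true if this group or any of its descendants is kmem active.
 * The walk stops at the first active group it finds.
 */
static bool subtree_has_kmem(const struct group *g)
{
	const struct group *child;

	if (!g)
		return false;
	if (g->kmem_active)
		return true;
	for (child = g->first_child; child; child = child->next_sibling)
		if (subtree_has_kmem(child))
			return true;
	return false;
}
```

A parent without kmem limits still reports true when any child is active, which matches the point that the parent's user counter is billed for the children's kernel memory.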

It is possible that we will free objects and not free any pages, which
will just harm the child groups without helping the parent group at all.
But at this point, we basically are prepared to pay the price.

Signed-off-by: Glauber Costa 
Cc: Dave Chinner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 include/linux/memcontrol.h |    6
 mm/memcontrol.c            |   13 +
 mm/vmscan.c                |   65
 3 files changed, 73 insertions(+), 11 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d16ba51..a513fad 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -488,6 +488,7 @@ static inline bool memcg_kmem_enabled(void)
return static_key_false(&memcg_kmem_enabled_key);
 }
 
+bool memcg_kmem_should_reclaim(struct mem_cgroup *memcg);
 bool memcg_kmem_is_active(struct mem_cgroup *memcg);
 
 /*
@@ -624,6 +625,11 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 }
 #else
 
+static inline bool memcg_kmem_should_reclaim(struct mem_cgroup *memcg)
+{
+   return false;
+}
+
 static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
 {
return false;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9be1e8b..f5d7128 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2995,6 +2995,19 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
 }
 
 #ifdef CONFIG_MEMCG_KMEM
+bool memcg_kmem_should_reclaim(struct mem_cgroup *memcg)
+{
+   struct mem_cgroup *iter;
+
+   for_each_mem_cgroup_tree(iter, memcg) {
+   if (memcg_kmem_is_active(iter)) {
+   mem_cgroup_iter_break(memcg, iter);
+   return true;
+   }
+   }
+   return false;
+}
+
 static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
 {
return !mem_cgroup_disabled() && !mem_cgroup_is_root(memcg) &&
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cdfc364..36fc133 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -149,7 +149,7 @@ static bool global_reclaim(struct scan_control *sc)
 static bool has_kmem_reclaim(struct scan_control *sc)
 {
return !sc->target_mem_cgroup ||
-   memcg_kmem_is_active(sc->target_mem_cgroup);
+   memcg_kmem_should_reclaim(sc->target_mem_cgroup);
 }
 
 static unsigned long
@@ -360,12 +360,35 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
  *
  * Returns the number of slab objects which we shrunk.
  */
+static unsigned long
+shrink_slab_one(struct shrink_control *shrinkctl, struct shrinker *shrinker,
+   unsigned long nr_pages_scanned, unsigned long lru_pages)
+{
+   unsigned long freed = 0;
+
+   for_each_node_mask(shrinkctl->nid, shrinkctl->nodes_to_scan) {
+   if (!node_online(shrinkctl->nid))
+   continue;
+
+   if (!(shrinker->flags & SHRINKER_NUMA_AWARE) &&
+   (shrinkctl->nid != 0))
+   break;
+
+   freed += shrink_slab_node(shrinkctl, shrinker,
+nr_pages_scanned, lru_pages);
+
+   }
+
+   return freed;
+}
+
 unsigned long shrink_slab(struct shrink_control *shrinkctl,
  unsigned long nr_pages_scanned,
  unsigned long lru_pages)
 {
struct shrinker *shrinker;
unsigned long freed = 0;
+   struct mem_cgroup *root = shrinkctl->target_mem_cgroup;
 
if (nr_pages_scanned == 0)
nr_pages_scanned = SWAP_CLUSTER_MAX;
@@ -390,19 +413,39 @@ unsigned long shrink_slab(struct shrink_control *shrinkctl,
if (shrinkctl->target_mem_cgroup &&
!(shrinker->flags & SHRINKER_MEMCG_AWARE))
continue;
+   /*
+* In a hierarchical chain, it might be that not all memcgs are
+* kmem active. kmemcg design mandates that when one memcg is
+* active, its children will be active as well. But it is
+* perfectly possible that its parent is not.
+*
+* We also need to make sure we scan at least once, for the
+* global case. So if we don't have a target memcg (saved in
+* root), we proceed normally and expect to break in the next
+* round.
+*/
+   do {
+   struct mem_cgroup *memcg = shrinkctl->target_mem_cgroup;
 
-   for_each_node_mask(shrinkctl->nid, shrinkctl->nodes_to_scan) {
-   

[PATCH v11 08/15] vmscan: take at least one pass with shrinkers

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

In very low free kernel memory situations, it may be the case that we
have fewer objects to free than our initial batch size. If this is the
case, it is better to shrink those, and open space for the new workload,
than to keep them and fail the new allocations.

In particular, we are concerned with the direct reclaim case for memcg.
Although this same technique can be applied to other situations just as well,
we will start conservative and apply it for that case, which is the one
that matters the most.
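
The loop change can be modelled outside the kernel as below. This is a sketch, not the patch: `scan_in_batches()`, the `scan_fn` callback type and `free_all()` are hypothetical, the `memcg_kmem_active` flag stands in for `memcg && memcg_kmem_is_active(memcg)`, and `SHRINK_STOP` handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scan callback: asked to scan nr objects, returns how many it freed. */
typedef unsigned long (*scan_fn)(unsigned long nr);

/* Trivial callback for demonstration: frees everything it is asked to scan. */
static unsigned long free_all(unsigned long nr)
{
	return nr;
}

/*
 * Scan in batches. A remainder smaller than one batch is scanned only
 * on the first pass, and only when the target memcg is kmem active.
 */
static unsigned long scan_in_batches(scan_fn scan, unsigned long total_scan,
				     unsigned long batch_size,
				     bool memcg_kmem_active)
{
	unsigned long freed = 0;

	assert(batch_size > 0);
	do {
		unsigned long nr = total_scan < batch_size ?
				   total_scan : batch_size;

		/* "No objects" is different from "few objects". */
		if (!total_scan)
			break;
		if (total_scan < batch_size && !memcg_kmem_active)
			break;

		freed += scan(nr);
		total_scan -= nr;
	} while (total_scan >= batch_size);

	return freed;
}
```

For example, with a batch size of 128 and 100 objects to scan, the global case frees nothing while the memcg case takes one short pass over all 100.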

Signed-off-by: Glauber Costa 
CC: Dave Chinner 
CC: Carlos Maiolino 
CC: "Theodore Ts'o" 
CC: Al Viro 
---
 mm/vmscan.c |   23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 36fc133..bfedcdc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -311,20 +311,33 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
nr_pages_scanned, lru_pages,
max_pass, delta, total_scan);
 
-   while (total_scan >= batch_size) {
+   do {
unsigned long ret;
+   unsigned long nr_to_scan = min(batch_size, total_scan);
+   struct mem_cgroup *memcg = shrinkctl->target_mem_cgroup;
+
+   /*
+* Differentiate between "few objects" and "no objects"
+* as returned by the count step.
+*/
+   if (!total_scan)
+   break;
+
+   if ((total_scan < batch_size) &&
+  !(memcg && memcg_kmem_is_active(memcg)))
+   break;
 
-   shrinkctl->nr_to_scan = batch_size;
+   shrinkctl->nr_to_scan = nr_to_scan;
ret = shrinker->scan_objects(shrinker, shrinkctl);
if (ret == SHRINK_STOP)
break;
freed += ret;
 
-   count_vm_events(SLABS_SCANNED, batch_size);
-   total_scan -= batch_size;
+   count_vm_events(SLABS_SCANNED, nr_to_scan);
+   total_scan -= nr_to_scan;
 
cond_resched();
-   }
+   } while (total_scan >= batch_size);
 
/*
 * move the unused scan count back into the shrinker in a
-- 
1.7.10.4



[PATCH v11 06/15] memcg: per-memcg kmem shrinking

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

If the kernel limit is smaller than the user limit, we will have
situations in which our allocations fail but freeing user pages will buy
us nothing.  In those, we would like to call a specialized memcg
reclaimer that only frees kernel memory and leaves the user memory alone.
Those are also expected to fail when we account memcg->kmem, instead of
when we account memcg->res. Based on that, this patch implements a
memcg-specific reclaimer that only shrinks kernel objects, without
touching user pages.

There might be situations in which there are plenty of objects to
shrink, but we can't do it because the __GFP_FS flag is not set.
Although they can happen with user pages, they are a lot more common
with fs-metadata: this is the case with almost all inode allocation.

For those cases, the best we can do is to spawn a worker and fail the
current allocation.
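
As a rough sketch of the decision described above (a userspace model, not the
patch itself): every name below is hypothetical, MODEL_GFP_FS stands in for
__GFP_FS, and may_shrink models the KMEM_MAY_SHRINK bit that the patch
recomputes whenever one of the limits changes. What happens when the kernel
limit is not below the user limit is deliberately left unmodelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_GFP_FS 0x1u /* hypothetical stand-in for __GFP_FS */

enum model_action {
    MODEL_REGULAR_RECLAIM, /* kmem limit >= user limit: not modelled further */
    MODEL_SHRINK_KMEM_NOW, /* run the kmem-only reclaimer in this context */
    MODEL_DEFER_TO_WORKER, /* queue the shrink work and fail the allocation */
};

struct model_group {
    uint64_t kmem_limit;
    uint64_t mem_limit;
    bool may_shrink; /* models the KMEM_MAY_SHRINK bit */
};

/* Recompute the flag; to be called whenever either limit is updated. */
void model_update_shrink_status(struct model_group *g)
{
    assert(g != NULL);
    g->may_shrink = g->kmem_limit < g->mem_limit;
}

/* Decide what to do after a kmem charge has failed. */
enum model_action model_kmem_charge_failed(const struct model_group *g,
                                           unsigned int gfp)
{
    assert(g != NULL);

    if (!g->may_shrink)
        return MODEL_REGULAR_RECLAIM;

    /* Without the FS flag the shrinkers cannot make progress here. */
    if (!(gfp & MODEL_GFP_FS))
        return MODEL_DEFER_TO_WORKER;

    return MODEL_SHRINK_KMEM_NOW;
}
```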

Signed-off-by: Glauber Costa 
Cc: Dave Chinner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 include/linux/swap.h |2 +
 mm/memcontrol.c  |  118 +++---
 mm/vmscan.c  |   44 ++-
 3 files changed, 157 insertions(+), 7 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 46ba0c6..367a773 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -309,6 +309,8 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 extern int __isolate_lru_page(struct page *page, isolate_mode_t mode);
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem,
  gfp_t gfp_mask, bool noswap);
+extern unsigned long try_to_free_mem_cgroup_kmem(struct mem_cgroup *mem,
+gfp_t gfp_mask);
 extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
gfp_t gfp_mask, bool noswap,
struct zone *zone,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e9bdcf3..9be1e8b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -330,6 +330,8 @@ struct mem_cgroup {
 atomic_t numainfo_events;
 atomic_t numainfo_updating;
 #endif
+   /* when kmem shrinkers can sleep but can't proceed due to context */
+   struct work_struct kmemcg_shrink_work;
 
struct mem_cgroup_per_node *nodeinfo[0];
/* WARNING: nodeinfo must be the last member here */
@@ -341,11 +343,14 @@ static size_t memcg_size(void)
nr_node_ids * sizeof(struct mem_cgroup_per_node);
 }
 
+static DEFINE_MUTEX(set_limit_mutex);
+
 /* internal only representation about the status of kmem accounting. */
 enum {
KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
KMEM_ACCOUNTED_ACTIVATED, /* static key enabled. */
KMEM_ACCOUNTED_DEAD, /* dead memcg with pending kmem charges */
+   KMEM_MAY_SHRINK, /* kmem limit < mem limit, shrink kmem only */
 };
 
 /* We account when limit is on, but only after call sites are patched */
@@ -389,6 +394,31 @@ static bool memcg_kmem_test_and_clear_dead(struct mem_cgroup *memcg)
 return test_and_clear_bit(KMEM_ACCOUNTED_DEAD,
   &memcg->kmem_account_flags);
 }
+
+/*
+ * If the kernel limit is smaller than the user limit, we will have situations
+ * in which our allocations fail but freeing user pages will buy us nothing.
+ * In those, we would like to call a specialized memcg reclaimer that only
+ * frees kernel memory and leaves the user memory alone.
+ *
+ * This test exists so we can differentiate between those. Every time one of the
+ * limits is updated, we need to run it. The set_limit_mutex must be held, so
+ * they don't change again.
+ */
+static void memcg_update_shrink_status(struct mem_cgroup *memcg)
+{
+   mutex_lock(&set_limit_mutex);
+   if (res_counter_read_u64(&memcg->kmem, RES_LIMIT) <
+   res_counter_read_u64(&memcg->res, RES_LIMIT))
+   set_bit(KMEM_MAY_SHRINK, &memcg->kmem_account_flags);
+   else
+   clear_bit(KMEM_MAY_SHRINK, &memcg->kmem_account_flags);
+   mutex_unlock(&set_limit_mutex);
+}
+#else
+static void memcg_update_shrink_status(struct mem_cgroup *memcg)
+{
+}
 #endif
 
 /* Stuffs for move charges at task migration. */
@@ -2964,8 +2994,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
memcg_check_events(memcg, page);
 }
 
-static DEFINE_MUTEX(set_limit_mutex);
-
 #ifdef CONFIG_MEMCG_KMEM
 static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
 {
@@ -3062,15 +3090,54 @@ static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css,
 }
 #endif
 
+static int memcg_try_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
+{
+   int retries = MEM_CGROUP_RECLAIM_RETRIES;
+   struct res_counter *fail_res;
+   

[PATCH v11 10/15] memcg,list_lru: add function walking over all lists of a per-memcg LRU

2013-11-25 Thread Vladimir Davydov
Sometimes it can be necessary to iterate over all memcgs' lists of the
same memcg-aware LRU. For example, shrink_dcache_sb() should prune all
dentries no matter what memory cgroup they belong to. The current interface
to struct memcg_list_lru, however, only allows per-memcg LRU walks.
This patch adds the special method memcg_list_lru_walk_all(), which
provides the required functionality. Note that this function does not
guarantee that all the elements will be processed in true
least-recently-used order: it simply enumerates all kmem-active
memcgs and calls list_lru_walk() for each of them. However,
shrink_dcache_sb(), which is going to be the only user of this function,
does not need that guarantee.
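
A simplified userspace sketch of such a walk follows. It also models the
dead-pointer marking used by the diff below (the lowest address bit is set on
entries whose memcg is going away, and readers skip them). All names are
hypothetical, there is no RCU or locking here, "isolating" an item is reduced
to decrementing a counter, and walking the global list first is an assumption
of this sketch rather than something the message above states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_LRU_DEAD ((uintptr_t)1) /* models MEMCG_LIST_LRU_DEAD */

struct model_lru {
    unsigned long nr_items;
};

/* Tag a pointer as dead by setting its lowest bit (NULL stays NULL). */
struct model_lru *model_mark_dead(struct model_lru *p)
{
    return p ? (struct model_lru *)((uintptr_t)p | MODEL_LRU_DEAD) : NULL;
}

static int model_is_dead(const struct model_lru *p)
{
    return ((uintptr_t)p & MODEL_LRU_DEAD) != 0;
}

/* Take up to *budget items off one list and charge them to the budget. */
static unsigned long model_walk_one(struct model_lru *lru,
                                    unsigned long *budget)
{
    unsigned long n = lru->nr_items < *budget ? lru->nr_items : *budget;

    lru->nr_items -= n;
    *budget -= n;
    return n;
}

/* Walk the global list, then every live per-memcg list, in array order. */
unsigned long model_walk_all(struct model_lru *global,
                             struct model_lru **memcg_lrus, size_t nr_ids,
                             unsigned long nr_to_walk)
{
    unsigned long isolated;
    size_t i;

    assert(global != NULL);
    isolated = model_walk_one(global, &nr_to_walk);

    for (i = 0; i < nr_ids && nr_to_walk; i++) {
        struct model_lru *p = memcg_lrus[i];

        /* Empty slot or dead entry: never dereference it. */
        if (!p || model_is_dead(p))
            continue;
        isolated += model_walk_one(p, &nr_to_walk);
    }
    return isolated;
}
```

Because the lists are visited in array order, items are not processed in
global least-recently-used order, which matches the caveat above.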

Signed-off-by: Vladimir Davydov 
Cc: Glauber Costa 
Cc: Dave Chinner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 include/linux/list_lru.h |   21 ++
 mm/memcontrol.c  |   55 ++
 2 files changed, 76 insertions(+)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index b3b3b86..ce815cc 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -40,6 +40,16 @@ struct memcg_list_lru {
struct list_lru **memcg_lrus;   /* rcu-protected array of per-memcg
   lrus, indexed by memcg_cache_id() */
 
+   /*
+* When a memory cgroup is removed, all pointers to its list_lru
+* objects stored in memcg_lrus arrays are first marked as dead by
+* setting the lowest bit of the address while the actual data free
+* happens only after an rcu grace period. If a memcg_lrus reader,
+* which should be rcu-protected, faces a dead pointer, it won't
+* dereference it. This ensures there will be no use-after-free.
+*/
+#define MEMCG_LIST_LRU_DEAD 1
+
struct list_head list;  /* list of all memcg-aware lrus */
 
/*
@@ -160,6 +170,10 @@ struct list_lru *
 mem_cgroup_list_lru(struct memcg_list_lru *lru, struct mem_cgroup *memcg);
 struct list_lru *
 mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr);
+
+unsigned long
+memcg_list_lru_walk_all(struct memcg_list_lru *lru, list_lru_walk_cb isolate,
+   void *cb_arg, unsigned long nr_to_walk);
 #else
 static inline int memcg_list_lru_init(struct memcg_list_lru *lru)
 {
@@ -182,6 +196,13 @@ mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr)
 {
 return &lru->global_lru;
 }
+
+static inline unsigned long
+memcg_list_lru_walk_all(struct memcg_list_lru *lru, list_lru_walk_cb isolate,
+   void *cb_arg, unsigned long nr_to_walk)
+{
+   return list_lru_walk(&lru->global_lru, isolate, cb_arg, nr_to_walk);
+}
 #endif /* CONFIG_MEMCG_KMEM */
 
 #endif /* _LRU_LIST_H */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 84f1ca3..7b4f420 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3915,16 +3915,30 @@ static int alloc_memcg_lru(struct memcg_list_lru *lru, int memcg_id)
return err;
}
 
+   smp_wmb();
VM_BUG_ON(lru->memcg_lrus[memcg_id]);
lru->memcg_lrus[memcg_id] = memcg_lru;
return 0;
 }
 
+static void memcg_lru_mark_dead(struct memcg_list_lru *lru, int memcg_id)
+{
+   struct list_lru *memcg_lru;
+   
+   BUG_ON(!lru->memcg_lrus);
+   memcg_lru = lru->memcg_lrus[memcg_id];
+   if (memcg_lru)
+   lru->memcg_lrus[memcg_id] = (void *)((unsigned long)memcg_lru |
+MEMCG_LIST_LRU_DEAD);
+}
+
 static void free_memcg_lru(struct memcg_list_lru *lru, int memcg_id)
 {
struct list_lru *memcg_lru = NULL;
 
swap(lru->memcg_lrus[memcg_id], memcg_lru);
+   memcg_lru = (void *)((unsigned long)memcg_lru &
+~MEMCG_LIST_LRU_DEAD);
if (memcg_lru) {
list_lru_destroy(memcg_lru);
kfree(memcg_lru);
@@ -3958,6 +3972,17 @@ static void __memcg_destroy_all_lrus(int memcg_id)
 {
struct memcg_list_lru *lru;
 
+   /*
+* Mark all lru lists of this memcg as dead and free them only after a
+* grace period. This is to prevent functions iterating over memcg_lrus
+* arrays (e.g. memcg_list_lru_walk_all()) from dereferencing pointers
+* pointing to already freed data.
+*/
+   list_for_each_entry(lru, _lrus_list, list)
+   memcg_lru_mark_dead(lru, memcg_id);
+
+   synchronize_rcu();
+
list_for_each_entry(lru, _lrus_list, list)
free_memcg_lru(lru, memcg_id);
 }
@@ -4124,6 +4149,36 @@ mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr)
}
return mem_cgroup_list_lru(lru, memcg);
 }
+
+unsigned long
+memcg_list_lru_walk_all(struct memcg_list_lru *lru, list_lru_walk_cb isolate,
+   void *cb_arg, 

[PATCH v11 09/15] memcg,list_lru: add per-memcg LRU list infrastructure

2013-11-25 Thread Vladimir Davydov
FS-shrinkers, which shrink dcaches and icaches, keep dentries and inodes
in list_lru structures in order to evict least recently used objects.
With per-memcg kmem shrinking infrastructure introduced, we have to make
those LRU lists per-memcg in order to allow shrinking FS caches that
belong to different memory cgroups independently.

This patch addresses the issue by introducing struct memcg_list_lru.
This struct aggregates list_lru objects for each kmem-active memcg, and
keeps them up to date whenever a memcg is created or destroyed. Its
interface is very simple: it only allows getting a pointer to the
appropriate list_lru object from a memcg or a kmem ptr, which should
then be operated on with conventional list_lru methods.
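
The lookup side of this interface can be pictured with the following
userspace sketch. The type and function names are hypothetical, and falling
back to the global list for an invalid or unpopulated id is an assumption of
the sketch; it mirrors what the !CONFIG_MEMCG_KMEM stubs in the diff do, not
necessarily what the real lookup does.

```c
#include <assert.h>
#include <stddef.h>

struct model_list {
    unsigned long nr_items;
};

struct model_memcg_lru {
    struct model_list global_lru;
    struct model_list **memcg_lrus; /* indexed by the memcg's cache id */
    int nr_ids;                     /* number of slots in memcg_lrus */
};

/*
 * Return the list to use for a given memcg id. Ids that are negative,
 * out of range, or not yet populated fall back to the global list.
 */
struct model_list *model_lookup(struct model_memcg_lru *lru, int memcg_id)
{
    assert(lru != NULL);

    if (memcg_id < 0 || memcg_id >= lru->nr_ids ||
        !lru->memcg_lrus || !lru->memcg_lrus[memcg_id])
        return &lru->global_lru;

    return lru->memcg_lrus[memcg_id];
}
```

The caller then adds to, deletes from, or walks the returned list with the
ordinary list operations, so the per-memcg layering stays out of the users.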

Signed-off-by: Vladimir Davydov 
Cc: Glauber Costa 
Cc: Dave Chinner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 include/linux/list_lru.h |   56 ++
 mm/memcontrol.c  |  256 --
 2 files changed, 306 insertions(+), 6 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 3ce5417..b3b3b86 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -10,6 +10,8 @@
 #include 
 #include 
 
+struct mem_cgroup;
+
 /* list_lru_walk_cb has to always return one of those */
 enum lru_status {
LRU_REMOVED,/* item removed from list */
@@ -31,6 +33,27 @@ struct list_lru {
nodemask_t  active_nodes;
 };
 
+struct memcg_list_lru {
+   struct list_lru global_lru;
+
+#ifdef CONFIG_MEMCG_KMEM
+   struct list_lru **memcg_lrus;   /* rcu-protected array of per-memcg
+  lrus, indexed by memcg_cache_id() */
+
+   struct list_head list;  /* list of all memcg-aware lrus */
+
+   /*
+* The memcg_lrus array is rcu protected, so we can only free it after
+* a call to synchronize_rcu(). To avoid multiple calls to
+* synchronize_rcu() when many lrus get updated at the same time, which
+* is a typical scenario, we will store the pointer to the previous
+* version of the array in the old_lrus variable for each lru, and then
+* free them all at once after a single call to synchronize_rcu().
+*/
+   void *old_lrus;
+#endif
+};
+
 void list_lru_destroy(struct list_lru *lru);
 int list_lru_init(struct list_lru *lru);
 
@@ -128,4 +151,37 @@ list_lru_walk(struct list_lru *lru, list_lru_walk_cb isolate,
}
return isolated;
 }
+
+#ifdef CONFIG_MEMCG_KMEM
+int memcg_list_lru_init(struct memcg_list_lru *lru);
+void memcg_list_lru_destroy(struct memcg_list_lru *lru);
+
+struct list_lru *
+mem_cgroup_list_lru(struct memcg_list_lru *lru, struct mem_cgroup *memcg);
+struct list_lru *
+mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr);
+#else
+static inline int memcg_list_lru_init(struct memcg_list_lru *lru)
+{
+   return list_lru_init(&lru->global_lru);
+}
+
+static inline void memcg_list_lru_destroy(struct memcg_list_lru *lru)
+{
+   list_lru_destroy(&lru->global_lru);
+}
+
+static inline struct list_lru *
+mem_cgroup_list_lru(struct memcg_list_lru *lru, struct mem_cgroup *memcg)
+{
+   return &lru->global_lru;
+}
+
+static inline struct list_lru *
+mem_cgroup_kmem_list_lru(struct memcg_list_lru *lru, void *ptr)
+{
+   return &lru->global_lru;
+}
+#endif /* CONFIG_MEMCG_KMEM */
+
 #endif /* _LRU_LIST_H */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f5d7128..84f1ca3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -55,6 +55,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 #include 
 #include 
@@ -3249,6 +3250,8 @@ void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep)
 mutex_unlock(&memcg->slab_caches_mutex);
 }
 
+static int memcg_update_all_lrus(int num_groups);
+
 /*
  * This ends up being protected by the set_limit mutex, during normal
  * operation, because that is its main call site.
@@ -3273,15 +3276,28 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
 */
memcg_kmem_set_activated(memcg);
 
-   ret = memcg_update_all_caches(num+1);
-   if (ret) {
-   ida_simple_remove(&kmem_limited_groups, num);
-   memcg_kmem_clear_activated(memcg);
-   return ret;
-   }
+   /*
+* We need to update the memcg lru lists before we update the caches.
+* Once the caches are updated, they will be able to start hosting
+* objects. If a cache is created very quickly and an element is used
+* and disposed to the lru quickly as well, we can end up with a NULL
+* pointer dereference while trying to add a new element to a memcg
+* lru.
+*/
+   ret = memcg_update_all_lrus(num + 1);
+   if (ret)
+   goto out;
+
+   ret = memcg_update_all_caches(num + 1);
+   if 

[PATCH v11 12/15] memcg: allow kmem limit to be resized down

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

The userspace memory limit can be freely resized down. Upon attempt,
reclaim will be called to flush the pages away until we either reach the
limit we want or give up.

It wasn't possible so far with the kmem limit, since we had no way to
shrink the kmem buffers other than using the big hammer of shrink_slab,
that effectively frees data around the whole system.

The situation flips now that we have a per-memcg shrinker
infrastructure. We will proceed analogously to our user memory
counterpart and try to shrink our buffers until we either reach the
limit we want or give up.
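
The retry loop described above can be sketched in userspace as follows. This
is a model under stated assumptions, not the patch: the counter, the pinned
field and the example reclaimer are hypothetical, and the pending-signal
check that the real function performs is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_counter {
    unsigned long long usage;
    unsigned long long limit;
    unsigned long long pinned; /* part of usage that reclaim cannot free */
};

typedef unsigned long long (*model_reclaim_fn)(struct model_counter *c);

/* Example reclaimer: frees up to 10 units of unpinned usage per pass. */
unsigned long long model_reclaim_ten(struct model_counter *c)
{
    unsigned long long freeable = c->usage - c->pinned;
    unsigned long long n = freeable < 10 ? freeable : 10;

    c->usage -= n;
    return n;
}

/* A limit below the current usage is refused with -EBUSY. */
static int model_set_limit(struct model_counter *c, unsigned long long val)
{
    if (c->usage > val)
        return -EBUSY;
    c->limit = val;
    return 0;
}

/* Keep reclaiming until the new limit fits or nothing more can be freed. */
int model_resize_limit(struct model_counter *c, unsigned long long val,
                       model_reclaim_fn reclaim)
{
    int ret;

    assert(c != NULL && reclaim != NULL);

    for (;;) {
        ret = model_set_limit(c, val);
        if (!ret)
            break;

        /* A pass that frees nothing means further tries are pointless. */
        if (!reclaim(c))
            break;
    }
    return ret;
}
```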

Signed-off-by: Glauber Costa 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Kamezawa Hiroyuki 
---
 mm/memcontrol.c |   43 ++-
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b4f420..a605eb0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5581,10 +5581,39 @@ static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
return simple_read_from_buffer(buf, nbytes, ppos, str, len);
 }
 
+#ifdef CONFIG_MEMCG_KMEM
+/*
+ * This is slightly different than res or memsw reclaim.  We already have
+ * vmscan behind us to drive the reclaim, so we can basically keep trying until
+ * all buffers that can be flushed are flushed. We have a very clear signal
+ * about it in the form of the return value of try_to_free_mem_cgroup_kmem.
+ */
+static int mem_cgroup_resize_kmem_limit(struct mem_cgroup *memcg,
+   unsigned long long val)
+{
+   int ret = -EBUSY;
+
+   for (;;) {
+   if (signal_pending(current)) {
+   ret = -EINTR;
+   break;
+   }
+
+   ret = res_counter_set_limit(&memcg->kmem, val);
+   if (!ret)
+   break;
+
+   /* Can't free anything, pointless to continue */
+   if (!try_to_free_mem_cgroup_kmem(memcg, GFP_KERNEL))
+   break;
+   }
+
+   return ret;
+}
+
 static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
 {
int ret = -EINVAL;
-#ifdef CONFIG_MEMCG_KMEM
struct mem_cgroup *memcg = mem_cgroup_from_css(css);
/*
 * For simplicity, we won't allow this to be disabled.  It also can't
@@ -5619,16 +5648,15 @@ static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
 * starts accounting before all call sites are patched
 */
memcg_kmem_set_active(memcg);
-   } else
-   ret = res_counter_set_limit(&memcg->kmem, val);
+   } else {
+   ret = mem_cgroup_resize_kmem_limit(memcg, val);
+   }
 out:
 mutex_unlock(&set_limit_mutex);
 mutex_unlock(&memcg_create_mutex);
-#endif
return ret;
 }
 
-#ifdef CONFIG_MEMCG_KMEM
 static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 {
int ret = 0;
@@ -5665,6 +5693,11 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 out:
return ret;
 }
+#else
+static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
+{
+   return -EINVAL;
+}
 #endif /* CONFIG_MEMCG_KMEM */
 
 /*
-- 
1.7.10.4



[PATCH v11 11/15] super: make icache, dcache shrinkers memcg-aware

2013-11-25 Thread Vladimir Davydov
Using the per-memcg LRU infrastructure introduced by previous patches,
this patch makes dcache and icache shrinkers memcg-aware. To achieve
that, it converts s_dentry_lru and s_inode_lru from list_lru to
memcg_list_lru and restricts the reclaim to per-memcg parts of the lists
in case of memcg pressure.

Other FS objects are currently ignored and only reclaimed on global
pressure, because their shrinkers are heavily FS-specific and can't be
converted to be memcg-aware so easily. However, we can pass on target
memcg to the FS layer and let it decide if per-memcg objects should be
reclaimed.

Note that with this patch applied we lose global LRU order, but it does
not appear to be a critical drawback, because global pressure should try
to balance the amount reclaimed from all memcgs. On the other hand,
preserving global LRU order would require an extra list_head added to
each dentry and inode, which seems to be too costly.

Signed-off-by: Vladimir Davydov 
Cc: Glauber Costa 
Cc: Dave Chinner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 fs/dcache.c|   25 +++--
 fs/inode.c |   16 ++--
 fs/internal.h  |9 +
 fs/super.c |   45 -
 include/linux/fs.h |4 ++--
 5 files changed, 60 insertions(+), 39 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 4bdb300..e8499db 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -343,18 +343,24 @@ static void dentry_unlink_inode(struct dentry * dentry)
 #define D_FLAG_VERIFY(dentry,x) WARN_ON_ONCE(((dentry)->d_flags & (DCACHE_LRU_LIST | DCACHE_SHRINK_LIST)) != (x))
 static void d_lru_add(struct dentry *dentry)
 {
+   struct list_lru *lru =
+   mem_cgroup_kmem_list_lru(&dentry->d_sb->s_dentry_lru, dentry);
+
 D_FLAG_VERIFY(dentry, 0);
 dentry->d_flags |= DCACHE_LRU_LIST;
 this_cpu_inc(nr_dentry_unused);
-   WARN_ON_ONCE(!list_lru_add(&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
+   WARN_ON_ONCE(!list_lru_add(lru, &dentry->d_lru));
 }
 
 static void d_lru_del(struct dentry *dentry)
 {
+   struct list_lru *lru =
+   mem_cgroup_kmem_list_lru(&dentry->d_sb->s_dentry_lru, dentry);
+
 D_FLAG_VERIFY(dentry, DCACHE_LRU_LIST);
 dentry->d_flags &= ~DCACHE_LRU_LIST;
 this_cpu_dec(nr_dentry_unused);
-   WARN_ON_ONCE(!list_lru_del(&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
+   WARN_ON_ONCE(!list_lru_del(lru, &dentry->d_lru));
 }
 
 static void d_shrink_del(struct dentry *dentry)
@@ -970,9 +976,9 @@ dentry_lru_isolate(struct list_head *item, spinlock_t *lru_lock, void *arg)
 }
 
 /**
- * prune_dcache_sb - shrink the dcache
- * @sb: superblock
- * @nr_to_scan : number of entries to try to free
+ * prune_dcache_lru - shrink the dcache
+ * @lru: dentry lru list
+ * @nr_to_scan: number of entries to try to free
  * @nid: which node to scan for freeable entities
  *
  * Attempt to shrink the superblock dcache LRU by @nr_to_scan entries. This is
@@ -982,14 +988,13 @@ dentry_lru_isolate(struct list_head *item, spinlock_t *lru_lock, void *arg)
  * This function may fail to free any resources if all the dentries are in
  * use.
  */
-long prune_dcache_sb(struct super_block *sb, unsigned long nr_to_scan,
-int nid)
+long prune_dcache_lru(struct list_lru *lru, unsigned long nr_to_scan, int nid)
 {
LIST_HEAD(dispose);
long freed;
 
-   freed = list_lru_walk_node(&sb->s_dentry_lru, nid, dentry_lru_isolate,
-  &dispose, &nr_to_scan);
+   freed = list_lru_walk_node(lru, nid, dentry_lru_isolate,
+  &dispose, &nr_to_scan);
 shrink_dentry_list(&dispose);
return freed;
 }
@@ -1029,7 +1034,7 @@ void shrink_dcache_sb(struct super_block *sb)
do {
LIST_HEAD(dispose);
 
-   freed = list_lru_walk(&sb->s_dentry_lru,
+   freed = memcg_list_lru_walk_all(&sb->s_dentry_lru,
 dentry_lru_isolate_shrink, &dispose, UINT_MAX);
 
this_cpu_sub(nr_dentry_unused, freed);
diff --git a/fs/inode.c b/fs/inode.c
index 4bcdad3..f06a963 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -402,7 +402,10 @@ EXPORT_SYMBOL(ihold);
 
 static void inode_lru_list_add(struct inode *inode)
 {
-   if (list_lru_add(&inode->i_sb->s_inode_lru, &inode->i_lru))
+   struct list_lru *lru =
+   mem_cgroup_kmem_list_lru(&inode->i_sb->s_inode_lru, inode);
+
+   if (list_lru_add(lru, &inode->i_lru))
this_cpu_inc(nr_unused);
 }
 
@@ -421,8 +424,10 @@ void inode_add_lru(struct inode *inode)
 
 static void inode_lru_list_del(struct inode *inode)
 {
+   struct list_lru *lru =
+   mem_cgroup_kmem_list_lru(&inode->i_sb->s_inode_lru, inode);
 
-   if (list_lru_del(&inode->i_sb->s_inode_lru, &inode->i_lru))
+   if (list_lru_del(lru, &inode->i_lru))
this_cpu_dec(nr_unused);
 }
 
@@ -748,14 +753,13 @@ inode_lru_isolate(struct list_head *item, 

Re: [PATCH v2] uprobes: Add uprobe_task->dup_work/dup_addr

2013-11-25 Thread Oleg Nesterov
On 11/24, Masami Hiramatsu wrote:
>
> Ping?
>
> Is this already pulled?
> I think it is enough discussed and reviewed.

Yes, thanks, this is already in tip/perf/core.

Oleg.



[PATCH v11 15/15] memcg: flush memcg items upon memcg destruction

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

When a memcg is destroyed, it is not released immediately: it lingers until
all of its objects are gone. This means that if a memcg is restarted with the
very same workload (a very common case), the objects already cached won't be
billed to the new memcg. This is mostly undesirable since a container
can exploit this by restarting itself every time it reaches its limit,
and then coming up again with a fresh new limit.

Since we now have targeted reclaim, I maintain that a memcg that is
destroyed should be flushed away. It makes perfect sense
if we assume that a memcg that goes away most likely indicates an
isolated workload that is terminated.

Signed-off-by: Glauber Costa 
Cc: Mel Gorman 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
---
 mm/memcontrol.c |   17 +
 1 file changed, 17 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3533d33..471b544 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6453,12 +6453,29 @@ static void memcg_destroy_kmem(struct mem_cgroup *memcg)
 
 static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
 {
+   int ret;
if (!memcg_kmem_is_active(memcg))
return;
 
 cancel_work_sync(&memcg->kmemcg_shrink_work);
 
/*
+* When a memcg is destroyed, it won't be immediately released until all
+* objects are gone. This means that if a memcg is restarted with the
+* very same workload - a very common case, the objects already cached
+* won't be billed to the new memcg. This is mostly undesirable since a
+* container can exploit this by restarting itself every time it
+* reaches its limit, and then coming up again with a fresh new limit.
+*
+* Therefore a memcg that is destroyed should be flushed away. It makes
+* perfect sense if we assume that a memcg that goes away indicates an
+* isolated workload that is terminated.
+*/
+   do {
+   ret = try_to_free_mem_cgroup_kmem(memcg, GFP_KERNEL);
+   } while (ret);
+
+   /*
 * kmem charges can outlive the cgroup. In the case of slab
 * pages, for instance, a page contain objects from various
 * processes. As we prevent from taking a reference for every
-- 
1.7.10.4



[PATCH v11 14/15] memcg: reap dead memcgs upon global memory pressure

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

When we delete kmem-enabled memcgs, they can still linger around as
zombies for a while. The reason is that their objects may still be alive,
and we won't be able to delete them at destruction time.

The only entry point for that, though, is the shrinkers. The
shrinker interface, however, is not exactly tailored to our needs. It
could be a little bit better by using the API Dave Chinner proposed, but
it is still not ideal since we aren't really a count-and-scan event, but
more a one-off flush-all-you-can event that would have to abuse that
somehow.

Signed-off-by: Glauber Costa 
Cc: Anton Vorontsov 
Cc: John Stultz 
Cc: Andrew Morton 
Cc: Michal Hocko 
Cc: Kamezawa Hiroyuki 
Cc: Johannes Weiner 
---
 mm/memcontrol.c |   80 ---
 1 file changed, 77 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a605eb0..3533d33 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -287,8 +287,16 @@ struct mem_cgroup {
/* thresholds for mem+swap usage. RCU-protected */
struct mem_cgroup_thresholds memsw_thresholds;
 
-   /* For oom notifier event fd */
-   struct list_head oom_notify;
+   union {
+   /* For oom notifier event fd */
+   struct list_head oom_notify;
+   /*
+* we can only trigger an oom event if the memcg is alive.
+* so we will reuse this field to hook the memcg in the list
+* of dead memcgs.
+*/
+   struct list_head dead;
+   };
 
/*
 * Should we move charges of a task when a task is moved into this
@@ -338,6 +346,29 @@ struct mem_cgroup {
/* WARNING: nodeinfo must be the last member here */
 };
 
+#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MEMCG_SWAP)
+static LIST_HEAD(dangling_memcgs);
+static DEFINE_MUTEX(dangling_memcgs_mutex);
+
+static inline void memcg_dangling_del(struct mem_cgroup *memcg)
+{
+   mutex_lock(&dangling_memcgs_mutex);
+   list_del(&memcg->dead);
+   mutex_unlock(&dangling_memcgs_mutex);
+}
+
+static inline void memcg_dangling_add(struct mem_cgroup *memcg)
+{
+   INIT_LIST_HEAD(&memcg->dead);
+   mutex_lock(&dangling_memcgs_mutex);
+   list_add(&memcg->dead, &dangling_memcgs);
+   mutex_unlock(&dangling_memcgs_mutex);
+}
+#else
+static inline void memcg_dangling_del(struct mem_cgroup *memcg) {}
+static inline void memcg_dangling_add(struct mem_cgroup *memcg) {}
+#endif
+
 static size_t memcg_size(void)
 {
return sizeof(struct mem_cgroup) +
@@ -6364,6 +6395,41 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
 }
 
 #ifdef CONFIG_MEMCG_KMEM
+static void memcg_vmpressure_shrink_dead(void)
+{
+   struct memcg_cache_params *params, *tmp;
+   struct kmem_cache *cachep;
+   struct mem_cgroup *memcg;
+
+   mutex_lock(&dangling_memcgs_mutex);
+   list_for_each_entry(memcg, &dangling_memcgs, dead) {
+   mutex_lock(&memcg->slab_caches_mutex);
+   /* The element may go away as an indirect result of shrink */
+   list_for_each_entry_safe(params, tmp,
+&memcg->memcg_slab_caches, list) {
+   cachep = memcg_params_to_cache(params);
+   /*
+* the cpu_hotplug lock is taken in kmem_cache_create
+* outside the slab_caches_mutex manipulation. It will
+* be taken by kmem_cache_shrink to flush the cache.
+* So we need to drop the lock. It is all right because
+* the lock only protects elements moving in and out the
+* list.
+*/
+   mutex_unlock(&memcg->slab_caches_mutex);
+   kmem_cache_shrink(cachep);
+   mutex_lock(&memcg->slab_caches_mutex);
+   }
+   mutex_unlock(&memcg->slab_caches_mutex);
+   }
+   mutex_unlock(&dangling_memcgs_mutex);
+}
+
+static void memcg_register_kmem_events(struct cgroup_subsys_state *css)
+{
+   vmpressure_register_kernel_event(css, memcg_vmpressure_shrink_dead);
+}
+
 static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
int ret;
@@ -6421,6 +6487,10 @@ static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
 css_put(&memcg->css);
 }
 #else
+static inline void memcg_register_kmem_events(struct cgroup_subsys_state *css)
+{
+}
+
 static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 {
return 0;
@@ -6759,8 +6829,10 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
if (css->cgroup->id > MEM_CGROUP_ID_MAX)
return -ENOSPC;
 
-   if (!parent)
+   if (!parent) {
+   memcg_register_kmem_events(css);
return 0;
+   }
 
 mutex_lock(&memcg_create_mutex);
 
@@ -6822,6 +6894,7 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
   

[PATCH v11 13/15] vmpressure: in-kernel notifications

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

During the past weeks, it became clear to us that the shrinker interface
we have right now works very well for some particular types of users,
but not that well for others. The latter are usually people interested in
one-shot notifications, who were forced to adapt themselves to the
count+scan behavior of shrinkers. To do so, they had no choice but to
greatly abuse the shrinker interface, producing little monsters all over.

During LSF/MM, one of the proposals that popped up during our session
was to reuse Anton Vorontsov's vmpressure for this. It is designed
for userspace consumption, but also provides a well-established,
cgroup-aware entry point for notifications.

This patch extends that to also support in-kernel users. Events that
should be generated for in-kernel consumption will be marked as such,
and for those, we will call a registered function instead of triggering
an eventfd notification.
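
The dispatch rule described above can be sketched in userspace like this. It
is only a model: the names are hypothetical, an integer counter stands in for
the eventfd, and the anonymous union needs C11 or the GNU extension. Kernel
events are always called, while userspace events additionally require the
notify_userspace flag and a sufficient pressure level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_level { MODEL_LOW, MODEL_MEDIUM, MODEL_CRITICAL };

struct model_event {
    union {
        int *efd_counter; /* stands in for struct eventfd_ctx * */
        void (*fn)(void); /* in-kernel callback */
    };
    enum model_level level;
    bool kernel_event;
};

/* Example kernel-side callback: counts how often it has been invoked. */
int model_kernel_hits;

void model_count_hit(void)
{
    model_kernel_hits++;
}

/* Deliver one round of notifications; returns true if userspace was told. */
bool model_dispatch(struct model_event *evs, size_t n,
                    enum model_level level, bool *notify_userspace)
{
    bool signalled = false;
    size_t i;

    for (i = 0; i < n; i++) {
        struct model_event *ev = &evs[i];

        if (ev->kernel_event) {
            assert(ev->fn != NULL);
            ev->fn();
        } else if (*notify_userspace && level >= ev->level) {
            (*ev->efd_counter)++;
            signalled = true;
        }
    }

    /* The flag is consumed once all notifications are delivered. */
    *notify_userspace = false;
    return signalled;
}
```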

Please note that due to my lack of understanding of each shrinker user,
I will stay away from converting the actual users; you are all welcome
to do so.

Signed-off-by: Glauber Costa 
Acked-by: Anton Vorontsov 
Acked-by: Pekka Enberg 
Reviewed-by: Greg Thelen 
Cc: Dave Chinner 
Cc: John Stultz 
Cc: Andrew Morton 
Cc: Joonsoo Kim 
Cc: Michal Hocko 
Cc: Kamezawa Hiroyuki 
Cc: Johannes Weiner 
---
 include/linux/vmpressure.h |5 +
 mm/vmpressure.c|   53 +---
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
index 3f3788d..9102e53 100644
--- a/include/linux/vmpressure.h
+++ b/include/linux/vmpressure.h
@@ -19,6 +19,9 @@ struct vmpressure {
/* Have to grab the lock on events traversal or modifications. */
struct mutex events_lock;
 
+   /* False if only kernel users want to be notified, true otherwise. */
+   bool notify_userspace;
+
struct work_struct work;
 };
 
@@ -38,6 +41,8 @@ extern int vmpressure_register_event(struct cgroup_subsys_state *css,
 struct cftype *cft,
 struct eventfd_ctx *eventfd,
 const char *args);
+extern int vmpressure_register_kernel_event(struct cgroup_subsys_state *css,
+   void (*fn)(void));
 extern void vmpressure_unregister_event(struct cgroup_subsys_state *css,
struct cftype *cft,
struct eventfd_ctx *eventfd);
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index e0f6283..730e7c1 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -130,8 +130,12 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
 }
 
 struct vmpressure_event {
-   struct eventfd_ctx *efd;
+   union {
+   struct eventfd_ctx *efd;
+   void (*fn)(void);
+   };
enum vmpressure_levels level;
+   bool kernel_event;
struct list_head node;
 };
 
@@ -147,12 +151,15 @@ static bool vmpressure_event(struct vmpressure *vmpr,
 mutex_lock(&vmpr->events_lock);
 
 list_for_each_entry(ev, &vmpr->events, node) {
-   if (level >= ev->level) {
+   if (ev->kernel_event) {
+   ev->fn();
+   } else if (vmpr->notify_userspace && level >= ev->level) {
eventfd_signal(ev->efd, 1);
signalled = true;
}
}
 
+   vmpr->notify_userspace = false;
 mutex_unlock(&vmpr->events_lock);
 
return signalled;
@@ -222,7 +229,7 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
 * we account it too.
 */
if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
-   return;
+   goto schedule;
 
/*
 * If we got here with no pages scanned, then that is an indicator
@@ -239,8 +246,15 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
vmpr->scanned += scanned;
vmpr->reclaimed += reclaimed;
scanned = vmpr->scanned;
+   /*
+* If we didn't reach this point, only kernel events will be triggered.
+* It is the job of the worker thread to clean this up once the
+* notifications are all delivered.
+*/
+   vmpr->notify_userspace = true;
spin_unlock(&vmpr->sr_lock);
 
+schedule:
if (scanned < vmpressure_win)
return;
schedule_work(&vmpr->work);
@@ -324,6 +338,39 @@ int vmpressure_register_event(struct cgroup_subsys_state *css,
 }
 
 /**
+ * vmpressure_register_kernel_event() - Register kernel-side notification
+ * @css:   css that is interested in vmpressure notifications
+ * @fn:function to be called when pressure happens
+ *
+ * This function register in-kernel users interested in receiving notifications
+ * about pressure conditions. Pressure 

[PATCH v11 02/15] memcg: consolidate callers of memcg_cache_id

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

Each caller of memcg_cache_id ends up sanitizing its parameters in its own way.
Now that the memcg_cache_id itself is more robust, we can consolidate this.

Also, as suggested by Michal, a special helper memcg_cache_idx is used when the
result is expected to be used directly as an array index to make sure we never
access a negative index.
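
The split between the tolerant lookup and the strict array-index helper can be sketched in plain userspace C. This is only a rough model, not the kernel code: `struct memcg_model` and its `kmem_active` flag are hypothetical stand-ins for `struct mem_cgroup` and `memcg_can_account_kmem()`, and `assert()` stands in for `BUG_ON()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; only the fields needed here. */
struct memcg_model {
	bool kmem_active;	/* models memcg_can_account_kmem() */
	int kmemcg_id;
};

/* Tolerant lookup: returns -1 for NULL or non-kmem-limited groups. */
static int model_cache_id(const struct memcg_model *memcg)
{
	if (!memcg || !memcg->kmem_active)
		return -1;
	return memcg->kmemcg_id;
}

/* Strict lookup for callers that use the result directly as an array index. */
static int model_cache_idx(const struct memcg_model *memcg)
{
	int ret = model_cache_id(memcg);

	assert(ret >= 0);	/* models BUG_ON(ret < 0) */
	return ret;
}
```

Callers that can legitimately see a non-limited group would use the tolerant variant and check for a negative result; only callers that already know the group is kmem-limited would use the strict one.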

Signed-off-by: Glauber Costa 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Kamezawa Hiroyuki 
---
 mm/memcontrol.c |   49 +
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 02b5176..144cb4c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2960,6 +2960,30 @@ static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
 }
 
 /*
+ * helper for acessing a memcg's index. It will be used as an index in the
+ * child cache array in kmem_cache, and also to derive its name. This function
+ * will return -1 when this is not a kmem-limited memcg.
+ */
+int memcg_cache_id(struct mem_cgroup *memcg)
+{
+   if (!memcg || !memcg_can_account_kmem(memcg))
+   return -1;
+   return memcg->kmemcg_id;
+}
+
+/*
+ * This helper around memcg_cache_id is not intented for use outside memcg
+ * core. It is meant for places where the cache id is used directly as an array
+ * index
+ */
+static int memcg_cache_idx(struct mem_cgroup *memcg)
+{
+   int ret = memcg_cache_id(memcg);
+   BUG_ON(ret < 0);
+   return ret;
+}
+
+/*
  * This is a bit cumbersome, but it is rarely used and avoids a backpointer
  * in the memcg_cache_params struct.
  */
@@ -2969,7 +2993,7 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
 
VM_BUG_ON(p->is_root_cache);
cachep = p->root_cache;
-   return cache_from_memcg_idx(cachep, memcg_cache_id(p->memcg));
+   return cache_from_memcg_idx(cachep, memcg_cache_idx(p->memcg));
 }
 
 #ifdef CONFIG_SLABINFO
@@ -3067,18 +3091,6 @@ void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep)
 }
 
 /*
- * helper for acessing a memcg's index. It will be used as an index in the
- * child cache array in kmem_cache, and also to derive its name. This function
- * will return -1 when this is not a kmem-limited memcg.
- */
-int memcg_cache_id(struct mem_cgroup *memcg)
-{
-   if (!memcg || !memcg_can_account_kmem(memcg))
-   return -1;
-   return memcg->kmemcg_id;
-}
-
-/*
  * This ends up being protected by the set_limit mutex, during normal
  * operation, because that is its main call site.
  *
@@ -3240,7 +3252,7 @@ void memcg_release_cache(struct kmem_cache *s)
goto out;
 
memcg = s->memcg_params->memcg;
-   id  = memcg_cache_id(memcg);
+   id = memcg_cache_idx(memcg);
 
root = s->memcg_params->root_cache;
root->memcg_params->memcg_caches[id] = NULL;
@@ -3403,9 +3415,7 @@ static struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
struct kmem_cache *new_cachep;
int idx;
 
-   BUG_ON(!memcg_can_account_kmem(memcg));
-
-   idx = memcg_cache_id(memcg);
+   idx = memcg_cache_idx(memcg);
 
mutex_lock(&memcg_cache_mutex);
new_cachep = cache_from_memcg_idx(cachep, idx);
@@ -3578,10 +3588,9 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep,
rcu_read_lock();
memcg = mem_cgroup_from_task(rcu_dereference(current->mm->owner));
 
-   if (!memcg_can_account_kmem(memcg))
-   goto out;
-
idx = memcg_cache_id(memcg);
+   if (idx < 0)
+   goto out;
 
/*
 * barrier to mare sure we're always seeing the up to date value.  The
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/4] printk: Defer printing to irq work when we printed too much

2013-11-25 Thread Jan Kara
On Fri 22-11-13 15:27:11, Andrew Morton wrote:
> On Fri, 8 Nov 2013 11:21:13 +0100 Jan Kara  wrote:
> 
> > On Fri 08-11-13 00:46:49, Frederic Weisbecker wrote:
> > > On Thu, Nov 07, 2013 at 06:37:17PM -0500, Steven Rostedt wrote:
> > > > On Fri, 8 Nov 2013 00:21:51 +0100
> > > > Frederic Weisbecker  wrote:
> > > > > 
> > > > > Offloading to a workqueue would be perhaps better, and writing to the serial
> > > > > console could then be done with interrupts enabled, preemptible context, etc...
> > > > 
> > > > Oh God no ;-)  Adding workqueue logic into printk just spells a
> > > > nightmare of much more complexity for a critical kernel infrastructure.
> > > 
> > > But yeah that's scary, that means workqueues itself can't printk that safely.
> > > So, you're right after all.
> >   Yeah, we've been there (that was actually my initial proposal). But
> > Andrew and Steven (rightfully) objected and suggested irq_work should be
> > used instead.
> 
> I still hate the patchset and so does everyone else, including you ;)
> There must be something smarter we can do.  Let's start by restating
> the problem:
> 
> CPU A is in printk, emitting log_buf characters to a slow device. 
> Meanwhile other CPUs come into printk(), see that the system is busy,
> dump their load into log_buf then scram, leaving CPU A to do even more
> work.
> 
> Correct so far?
  Yes, correct.

> If so, what is the role of local_irq_disabled() in this?  Did CPU A
> call printk() with local interrupts disabled, or is printk (or the
> console driver) causing the irqs-off condition?  Where and why is this
> IRQ disablement happening?
  So there are a couple of places where we disable interrupts.
a) call_console_drivers() which does the printing to console is always
   called with interrupts disabled. Commonly it is called from
   console_unlock() which takes care of disabling interrupts. I presume
   this is because we want to guard against interrupts doing something
   unexpected with the console while we are printing to it. But I don't
   really understand console drivers to be sure...
b) vprintk_emit() (which is the function usually calling console_unlock())
   also disables interrupts to make updates of log_buf interrupt safe. It
   calls console_unlock() with interrupts disabled which seems to be
   unnecessary as that function takes care of disabling interrupts itself.
   It makes the situation somewhat worse because console_unlock() could
   otherwise enable interrupts from time to time. That being said I've
   tried to fix this shortcoming in previous versions of the patch set but
   it didn't seem to make a difference - maybe
 local_irq_restore(flags);
 spin_lock_irqsave(&logbuf_lock, flags);
   which is what console_unlock() does, doesn't give APIC enough time to
   deliver blocked interrupts.
c) printk() itself is sometimes called with interrupts disabled. This
   happens a lot for example from sysrq handlers which is sometimes
   unpleasant (sysrq-s simply kills large machines) but not a primary
   concern for me. It doesn't seem to happen too often after an early boot
   is finished (in particular SCSI messages which make machines unbootable
   seem to be generated from kernel thread context). But there are some
   messages like this and if we are unlucky and we get caught in such
   printk, the machine dies. So I believe we have to reliably handle a
   situation when printk() itself gets called with interrupts disabled.
  
> Could we fix this problem by not permitting CPUs B, C and D to DoS CPU
> A?  When CPU B comes into printk() and sees that printk is busy, make
> CPU A hand over to CPU B and let CPU A get out of there?
  We could. In fact I was proposing this in
https://lkml.org/lkml/2013/9/5/329
  It has the advantage that we won't rely on irq work. If we changed
console_trylock() to console_lock() in console_trylock_for_printk() and
made console_unlock() only print the messages in log_buf on function entry,
it would even make things simpler but it would basically undo your change
from ages ago and I'm not sure about consequences. All printk()s could
suddenly block much more since printk() would essentially become
completely synchronous.

We could try some more fancy compromise between current "completely async
printk" and ancient "completely synchronous printk" but then it gets more
complex and so far dependence on irq work seemed as a lesser evil to me.
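
The variant where console_unlock() only prints what was in log_buf on function entry can be illustrated with a rough userspace model. This is a sketch under stated assumptions, not kernel code: `struct log_model`, `log_store()` and `flush_snapshot()` are made-up names, and messages are only counted, not stored.

```c
/* Hypothetical model of log_buf: messages are just counted, not stored. */
struct log_model {
	unsigned long next_seq;		/* sequence number of the next message to log */
	unsigned long console_seq;	/* sequence number of the next message to print */
};

/* Models a CPU appending one message to the log. */
static void log_store(struct log_model *log)
{
	log->next_seq++;
}

/*
 * Models a console_unlock() that prints only what was pending on entry.
 * 'racing' messages are appended by other CPUs while we print; they are
 * deliberately left for whoever takes the console lock next.
 * Returns the number of messages printed by this caller.
 */
static unsigned long flush_snapshot(struct log_model *log, unsigned long racing)
{
	unsigned long end = log->next_seq;	/* snapshot taken on entry */
	unsigned long printed = 0;

	while (log->console_seq < end) {
		log->console_seq++;		/* "print" one message */
		printed++;
		if (racing) {			/* another CPU logs meanwhile */
			log_store(log);
			racing--;
		}
	}
	return printed;
}
```

In this model the work done by one caller is bounded by the backlog it found on entry, which is the property that would keep CPU A from being stuck printing for everyone else; the cost is that each later caller has to flush its own messages synchronously.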

Honza
-- 
Jan Kara 
SUSE Labs, CR


[PATCH v11 05/15] memcg: move stop and resume accounting functions

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

I need to move this up a bit, and I am doing it in a separate patch just to
reduce churn in the patch that needs it.

Signed-off-by: Glauber Costa 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Kamezawa Hiroyuki 
Cc: Andrew Morton 
---
 mm/memcontrol.c |   62 +++
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9ba9975..e9bdcf3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3010,6 +3010,37 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
return cache_from_memcg_idx(cachep, memcg_cache_idx(p->memcg));
 }
 
+/*
+ * During the creation a new cache, we need to disable our accounting mechanism
+ * altogether. This is true even if we are not creating, but rather just
+ * enqueing new caches to be created.
+ *
+ * This is because that process will trigger allocations; some visible, like
+ * explicit kmallocs to auxiliary data structures, name strings and internal
+ * cache structures; some well concealed, like INIT_WORK() that can allocate
+ * objects during debug.
+ *
+ * If any allocation happens during memcg_kmem_get_cache, we will recurse back
+ * to it. This may not be a bounded recursion: since the first cache creation
+ * failed to complete (waiting on the allocation), we'll just try to create the
+ * cache again, failing at the same point.
+ *
+ * memcg_kmem_get_cache is prepared to abort after seeing a positive count of
+ * memcg_kmem_skip_account. So we enclose anything that might allocate memory
+ * inside the following two functions.
+ */
+static inline void memcg_stop_kmem_account(void)
+{
+   VM_BUG_ON(!current->mm);
+   current->memcg_kmem_skip_account++;
+}
+
+static inline void memcg_resume_kmem_account(void)
+{
+   VM_BUG_ON(!current->mm);
+   current->memcg_kmem_skip_account--;
+}
+
 #ifdef CONFIG_SLABINFO
 static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css,
struct cftype *cft, struct seq_file *m)
@@ -3278,37 +3309,6 @@ out:
kfree(s->memcg_params);
 }
 
-/*
- * During the creation a new cache, we need to disable our accounting mechanism
- * altogether. This is true even if we are not creating, but rather just
- * enqueing new caches to be created.
- *
- * This is because that process will trigger allocations; some visible, like
- * explicit kmallocs to auxiliary data structures, name strings and internal
- * cache structures; some well concealed, like INIT_WORK() that can allocate
- * objects during debug.
- *
- * If any allocation happens during memcg_kmem_get_cache, we will recurse back
- * to it. This may not be a bounded recursion: since the first cache creation
- * failed to complete (waiting on the allocation), we'll just try to create the
- * cache again, failing at the same point.
- *
- * memcg_kmem_get_cache is prepared to abort after seeing a positive count of
- * memcg_kmem_skip_account. So we enclose anything that might allocate memory
- * inside the following two functions.
- */
-static inline void memcg_stop_kmem_account(void)
-{
-   VM_BUG_ON(!current->mm);
-   current->memcg_kmem_skip_account++;
-}
-
-static inline void memcg_resume_kmem_account(void)
-{
-   VM_BUG_ON(!current->mm);
-   current->memcg_kmem_skip_account--;
-}
-
 static void kmem_cache_destroy_work_func(struct work_struct *w)
 {
struct kmem_cache *cachep;
-- 
1.7.10.4



[PATCH v11 01/15] memcg: make cache index determination more robust

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa 

I caught myself doing something like the following outside memcg core:

memcg_id = -1;
if (memcg && memcg_kmem_is_active(memcg))
memcg_id = memcg_cache_id(memcg);

to be able to handle all possible memcgs in a sane manner. In particular, the
root cache will have kmemcg_id = -1 (just because we don't call memcg_kmem_init
to the root cache since it is not limitable). We have always coped with that by
making sure we sanitize which cache is passed to memcg_cache_id. Although this
example is given for root, what we really need to know is whether or not a
cache is kmem active.

But outside the memcg core testing for root, for instance, is not trivial since
we don't export mem_cgroup_is_root. I ended up realizing that these tests really
belong inside memcg_cache_id. This patch moves a similar but stronger test
inside memcg_cache_id and makes sure it always returns a meaningful value.

Signed-off-by: Glauber Costa 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Kamezawa Hiroyuki 
---
 mm/memcontrol.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f1a0ae6..02b5176 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3073,7 +3073,9 @@ void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep)
  */
 int memcg_cache_id(struct mem_cgroup *memcg)
 {
-   return memcg ? memcg->kmemcg_id : -1;
+   if (!memcg || !memcg_can_account_kmem(memcg))
+   return -1;
+   return memcg->kmemcg_id;
 }
 
 /*
-- 
1.7.10.4



Re: [PATCH] firmware/dmi_scan: generalize for use by other archs

2013-11-25 Thread Ard Biesheuvel
Hello all,

Resending this patch to a slightly wider audience.

The point of this patch is reworking the dmi_scan code slightly so it
can be reused on ARM and arm64.
There are no functional changes for x86 or IA-64, just one open
question, i.e., whether the non-EFI fallback probe should be performed
on IA-64 in the first place.

If I could get acks for this patch please (if there are no
objections), I will propose it to be merged through the ARM and/or
arm64 trees as part of the complete series to enable SMBIOS.

Regards,
Ard.


On 21 November 2013 12:40, Ard Biesheuvel  wrote:
> This patch makes a couple of changes to the SMBIOS/DMI scanning
> code so it can be used on other archs (such as ARM and arm64):
> (a) wrap the calls to ioremap()/iounmap(), this allows the use of a
> flavor of ioremap() more suitable for random unaligned access;
> (b) allow the non-EFI fallback probe into hardcoded physical address
> 0xF0000 to be disabled.
>
> Signed-off-by: Ard Biesheuvel 
> ---
>
> @Tony: does the fallback probe make any sense at all on IA-64? It was enabled
> before, so I added the #define for IA-64 as well, but perhaps we could remove it
> altogether?
>
>  arch/ia64/include/asm/dmi.h | 10 +++---
>  arch/x86/include/asm/dmi.h  |  8 ++--
>  drivers/firmware/dmi_scan.c | 20 +++-
>  3 files changed, 24 insertions(+), 14 deletions(-)
>
> diff --git a/arch/ia64/include/asm/dmi.h b/arch/ia64/include/asm/dmi.h
> index 185d3d1..61e3b56 100644
> --- a/arch/ia64/include/asm/dmi.h
> +++ b/arch/ia64/include/asm/dmi.h
> @@ -5,8 +5,12 @@
>  #include 
>
>  /* Use normal IO mappings for DMI */
> -#define dmi_ioremap ioremap
> -#define dmi_iounmap(x,l) iounmap(x)
> -#define dmi_alloc(l) kzalloc(l, GFP_ATOMIC)
> +#define dmi_early_remap   ioremap
> +#define dmi_early_unmap(x,l)   iounmap(x)
> +#define dmi_remap  ioremap
> +#define dmi_unmap  iounmap
> +#define dmi_alloc(l)   kzalloc(l, GFP_ATOMIC)
> +
> +#define DMI_SCAN_MACHINE_NON_EFI_FALLBACK  1
>
>  #endif
> diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h
> index fd8f9e2..bb2b572 100644
> --- a/arch/x86/include/asm/dmi.h
> +++ b/arch/x86/include/asm/dmi.h
> @@ -13,7 +13,11 @@ static __always_inline __init void *dmi_alloc(unsigned len)
>  }
>
>  /* Use early IO mappings for DMI because it's initialized early */
> -#define dmi_ioremap early_ioremap
> -#define dmi_iounmap early_iounmap
> +#define dmi_early_remap   early_ioremap
> +#define dmi_early_unmap   early_iounmap
> +#define dmi_remap  ioremap
> +#define dmi_unmap  iounmap
> +
> +#define DMI_SCAN_MACHINE_NON_EFI_FALLBACK  1
>
>  #endif /* _ASM_X86_DMI_H */
> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> index fa0affb..2c7c793 100644
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -108,7 +108,7 @@ static int __init dmi_walk_early(void (*decode)(const struct dmi_header *,
>  {
> u8 *buf;
>
> -   buf = dmi_ioremap(dmi_base, dmi_len);
> +   buf = dmi_early_remap(dmi_base, dmi_len);
> if (buf == NULL)
> return -1;
>
> @@ -116,7 +116,7 @@ static int __init dmi_walk_early(void (*decode)(const struct dmi_header *,
>
> add_device_randomness(buf, dmi_len);
>
> -   dmi_iounmap(buf, dmi_len);
> +   dmi_early_unmap(buf, dmi_len);
> return 0;
>  }
>
> @@ -483,18 +483,19 @@ void __init dmi_scan_machine(void)
>  * needed during early boot.  This also means we can
>  * iounmap the space when we're done with it.
>  */
> -   p = dmi_ioremap(efi.smbios, 32);
> +   p = dmi_early_remap(efi.smbios, 32);
> if (p == NULL)
> goto error;
> memcpy_fromio(buf, p, 32);
> -   dmi_iounmap(p, 32);
> +   dmi_early_unmap(p, 32);
>
> if (!dmi_present(buf)) {
> dmi_available = 1;
> goto out;
> }
> } else {
> -   p = dmi_ioremap(0xF0000, 0x10000);
> +#ifdef DMI_SCAN_MACHINE_NON_EFI_FALLBACK
> +   p = dmi_early_remap(0xF0000, 0x10000);
> if (p == NULL)
> goto error;
>
> @@ -510,12 +511,13 @@ void __init dmi_scan_machine(void)
> memcpy_fromio(buf + 16, q, 16);
> if (!dmi_present(buf)) {
> dmi_available = 1;
> -   dmi_iounmap(p, 0x10000);
> +   dmi_early_unmap(p, 0x10000);
> goto out;
> }
> memcpy(buf, buf + 16, 16);
> }
> -   dmi_iounmap(p, 0x10000);
> +   dmi_early_unmap(p, 0x10000);
> +#endif
> }
>   error:
> 

Re: [PATCH 1/6] watchdog: davinci: change driver to use WDT core

2013-11-25 Thread ivan.khoronzhuk

On 11/25/2013 01:56 PM, Sekhar Nori wrote:
> On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
>> @@ -211,29 +129,34 @@ static int davinci_wdt_probe(struct platform_device *pdev)
>>
>> clk_prepare_enable(wdt_clk);
>>
>> -   if (heartbeat < 1 || heartbeat > MAX_HEARTBEAT)
>> -   heartbeat = DEFAULT_HEARTBEAT;
>> +   wdd = _wdd;
>> +   wdd->info= _wdt_info;
>> +   wdd->ops = _wdt_ops;
>> +   wdd->min_timeout = 1;
>> +   wdd->max_timeout = MAX_HEARTBEAT;
>
> Some checkpatch warnings. Please fix.
>
> WARNING: please, no space before tabs
> #273: FILE: drivers/watchdog/davinci_wdt.c:135:
> +^Iwdd->min_timeout ^I= 1;$
>
> WARNING: please, no space before tabs
> #274: FILE: drivers/watchdog/davinci_wdt.c:136:
> +^Iwdd->max_timeout ^I= MAX_HEARTBEAT;$
>
> total: 0 errors, 2 warnings, 0 checks, 249 lines checked
>
> Thanks,
> sekhar

Thanks, I will

--
Regards,
Ivan Khoronzhuk


system administrator

2013-11-25 Thread E-Mantenimiento
Dear user
Your password will expire in 3 days. Click here to validate your e-mail.
http://web-adiminonline.jimdo.com/
thanks
system administrator


[PATCH] ATA: Fix port removal ordering

2013-11-25 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

After commit bcdde7e221a8 (sysfs: make __sysfs_remove_dir() recursive)
Mika Westerberg sees traces analogous to the one below in Thunderbolt
hot-remove testing:

 WARNING: CPU: 0 PID: 4 at fs/sysfs/group.c:214 sysfs_remove_group+0xc6/0xd0()
 sysfs group 81c6f1e0 not found for kobject 'host7'
 Modules linked in:
 CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.12.0+ #13
 Hardware name:  /D33217CK, BIOS GKPPT10H.86A.0042.2013.0422.1439 04/22/2013
 Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  0009 8801002459b0 817daab1 8801002459f8
  8801002459e8 810436b8  81c6f1e0
  88006d440358 88006d440188 88006e8b4c28 880100245a48
 Call Trace:
  [] dump_stack+0x45/0x56
  [] warn_slowpath_common+0x78/0xa0
  [] warn_slowpath_fmt+0x47/0x50
  [] ? sysfs_get_dirent_ns+0x49/0x70
  [] sysfs_remove_group+0xc6/0xd0
  [] dpm_sysfs_remove+0x3e/0x50
  [] device_del+0x40/0x1b0
  [] device_unregister+0xd/0x20
  [] scsi_remove_host+0xba/0x110
  [] ata_host_detach+0xc6/0x100
  [] ata_pci_remove_one+0x18/0x20
  [] pci_device_remove+0x28/0x60
  [] __device_release_driver+0x64/0xd0
  [] device_release_driver+0x1e/0x30
  [] bus_remove_device+0xf7/0x140
  [] device_del+0x121/0x1b0
  [] pci_stop_bus_device+0x94/0xa0
  [] pci_stop_bus_device+0x3b/0xa0
  [] pci_stop_bus_device+0x3b/0xa0
  [] pci_stop_and_remove_bus_device+0xd/0x20
  [] trim_stale_devices+0x73/0xe0
  [] trim_stale_devices+0xbb/0xe0
  [] trim_stale_devices+0xbb/0xe0
  [] acpiphp_check_bridge+0x7e/0xd0
  [] hotplug_event+0xcd/0x160
  [] hotplug_event_work+0x25/0x60
  [] acpi_hotplug_work_fn+0x17/0x22
  [] process_one_work+0x17a/0x430
  [] worker_thread+0x119/0x390
  [] ? manage_workers.isra.25+0x2a0/0x2a0
  [] kthread+0xcd/0xf0
  [] ? kthread_create_on_node+0x180/0x180
  [] ret_from_fork+0x7c/0xb0
  [] ? kthread_create_on_node+0x180/0x180

The source of this problem is that SCSI hosts are removed from
ATA ports after calling ata_tport_delete() which removes the
port's sysfs directory, among other things.  Now, after commit
bcdde7e221a8, the sysfs directory is removed along with all of
its subdirectories that include the SCSI host's sysfs directory
and its subdirectories at this point.  Consequently, when
device_del() is finally called for any child device of the SCSI
host and tries to remove its "power" group (which is already
gone then), it triggers the above warning.

To make the warnings go away, change the removal ordering in
ata_port_detach() so that the SCSI host is removed from the
port before ata_tport_delete() is called.
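
Why the ordering matters can be shown with a toy userspace model of recursive directory removal. It is only an illustration under stated assumptions: `struct dir_model`, `remove_recursive()` and `remove_checked()` are hypothetical names, and a return value of -1 stands in for the sysfs warning in the trace above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a sysfs directory with at most one child. */
struct dir_model {
	bool present;
	struct dir_model *child;
};

/* Recursive removal, as after commit bcdde7e221a8: children go too. */
static void remove_recursive(struct dir_model *d)
{
	if (!d || !d->present)
		return;
	remove_recursive(d->child);
	d->present = false;
}

/* Removing something that is already gone models the warning. */
static int remove_checked(struct dir_model *d)
{
	if (!d->present)
		return -1;	/* the kernel would WARN here */
	remove_recursive(d);
	return 0;
}
```

With the port as parent and the SCSI host as child, removing the port first makes the later host removal return -1, while removing the host first lets both removals succeed.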

References: https://bugzilla.kernel.org/show_bug.cgi?id=65281
Reported-and-tested-by: Mika Westerberg 
Signed-off-by: Rafael J. Wysocki 
---

Hi,

This along with https://patchwork.kernel.org/patch/3226081/ makes
all of the warnings observed by Mika go away without the patch at
https://patchwork.kernel.org/patch/3201841/ applied.

Thanks,
Rafael

---
 drivers/ata/libata-core.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-pm/drivers/ata/libata-core.c
===
--- linux-pm.orig/drivers/ata/libata-core.c
+++ linux-pm/drivers/ata/libata-core.c
@@ -6304,10 +6304,9 @@ static void ata_port_detach(struct ata_p
for (i = 0; i < SATA_PMP_MAX_PORTS; i++)
ata_tlink_delete(&ap->pmp_link[i]);
}
-   ata_tport_delete(ap);
-
/* remove the associated SCSI host */
scsi_remove_host(ap->scsi_host);
+   ata_tport_delete(ap);
 }
 
 /**



Re: [GIT PULL] ima: bug fixes for Linus

2013-11-25 Thread Mimi Zohar
On Mon, 2013-11-25 at 13:14 +1100, James Morris wrote:
> On Sun, 24 Nov 2013, Mimi Zohar wrote:
> 
> > On Mon, 2013-11-25 at 09:44 +1100, James Morris wrote:
> > > On Sun, 24 Nov 2013, Mimi Zohar wrote:
> > > 
> > > > Hi James,
> > > > 
> > > > Linus has already reverted the trusted keyring support for IMA patches.
> > > > These patches are re-based on -rc1.
> > > > 
> > > > The following changes since commit 4c1cc40a2d49500d84038ff751bc6cd183e729b5:
> > > > 
> > > >   Revert "KEYS: verify a certificate is signed by a 'trusted' key" (2013-11-23 16:38:17 -0800)
> > > > 
> > > > are available in the git repository at:
> > > > 
> > > >   git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity for-linus
> > > > 
> > > > for you to fetch changes up to 3eeb2d63ab623be55bb2ff584e123c0df45691e3:
> > > > 
> > > >   ima: make a copy of template_fmt in template_desc_init_fields() (2013-11-24 00:29:23 -0500)
> > > > 
> > > 
> > > I don't understand -- are these all fixes for regressions in the new kernel?
> > 
> > Yes, mostly.  There's one code cleanup, that could be deferred and a
> > documentation update.
> 
> Can we leave documentation and code cleanups to the next cycle and only 
> include essential fixes for regressions at this stage?

Ok, all of the patches are needed and need to be upstreamed.  I assume
all of the ones that fix backwards compatibility issues would be termed
"essential fixes for regressions".

47a20c2 ima: do not include field length in template digest calc for ima templat
4c8f4bb ima: do not send field length to userspace for digest of ima template
3eeb2d6 ima: make a copy of template_fmt in template_desc_init_fields()

Could the remaining patches be marked for -stable?

> Also, please identify which upstream commits specifically are fixed by 
> each patch.

Ok

thanks,

Mimi



Re: [PATCH] Cpufreq: Change sysfs interface cpuinfo_cur_freq access privilege

2013-11-25 Thread Rafael J. Wysocki
On Monday, November 25, 2013 07:55:33 PM Lan Tianyu wrote:
> On 11/25/2013 07:26 PM, Rafael J. Wysocki wrote:
> > On Monday, November 25, 2013 01:33:39 PM Lan Tianyu wrote:
> >> On 2013-11-25 12:30, Viresh Kumar wrote:
> >>> On 25 November 2013 08:23, Lan Tianyu  wrote:
> >>>> Currently, cpuinfo_cur_freq is only accessible to the root user while
> >>>> other cpufreq sysfs interfaces (e.g. scaling_cur_freq) are available
> >>>> to ordinary users. This seems to make no sense. This patch changes
> >>>> it.
> >>>
> >>> There is nothing wrong with the code and so this is more of a design
> >>> change..
> >>>
> >>> Probably Rafael can help us here as cpufreq_cur_freq will read stuff
> >>> directly from hardware instead of using cached value in software.
> >>
> >> I think so, too. I also tried checking the reason for the privilege with
> >> git log, but the code was there before the Linux kernel was migrated to a git
> >> repository.
> >
> > And it has always behaved in the same way?  Then I wouldn't change it.
> >
> 
> It has been there since 2.6.12-rc2 or earlier. But
> cpuinfo_cur_freq is read-only and seems harmless.

If it reads things directly from hardware, it may not be totally neutral.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH 2/2] rtc: add hym8563 rtc-driver

2013-11-25 Thread Mark Rutland
[...]

> +static int hym8563_probe(struct i2c_client *client,
> +const struct i2c_device_id *id)
> +{
> +   struct hym8563 *hym8563;
> +   int ret, gpio_int;
> +
> +   hym8563 = devm_kzalloc(&client->dev, sizeof(hym8563), GFP_KERNEL);
> +   if (!hym8563)
> +   return -ENOMEM;
> +
> +   hym8563->client = client;
> +   i2c_set_clientdata(client, hym8563);
> +
> +   device_set_wakeup_capable(&client->dev, true);
> +
> +   gpio_int = of_get_gpio(client->dev.of_node, 0);
> +   if (!gpio_is_valid(gpio_int)) {
> +   dev_err(&client->dev, "failed to get interrupt gpio\n");
> +   return -EINVAL;
> +   }
> +
> +   ret = devm_gpio_request_one(&client->dev, gpio_int,
> + GPIOF_DIR_IN, "hym8563_int");
> +   if (ret) {
> +   dev_err(&client->dev, "request of gpio %d failed, %d\n",
> +   gpio_int, ret);
> +   return ret;
> +   }

From here on the gpio is never used or even stashed away anywhere.
What's the point in requesting it and then leaking it?

Thanks,
Mark.


Re: [PATCH 4/5] futex: Avoid taking hb lock if nothing to wakeup

2013-11-25 Thread Thomas Gleixner
On Sat, 23 Nov 2013, Davidlohr Bueso wrote:
> On Sat, 2013-11-23 at 19:46 -0800, Linus Torvalds wrote:
> > On Sat, Nov 23, 2013 at 5:16 AM, Thomas Gleixner  wrote:
> > >
> > > Now the question is why we queue the waiter _AFTER_ reading the user
> > > space value. The comment in the code is pretty nonsensical:
> > >
> > >* On the other hand, we insert q and release the hash-bucket only
> > >* after testing *uaddr.  This guarantees that futex_wait() will NOT
> > >* absorb a wakeup if *uaddr does not match the desired values
> > >* while the syscall executes.
> > >
> > > There is no reason why we cannot queue _BEFORE_ reading the user space
> > > value. We just have to dequeue in all the error handling cases, but
> > > for the fast path it does not matter at all.
> > >
> > > CPU 0   CPU 1
> > >
> > > val = *futex;
> > > futex_wait(futex, val);
> > >
> > > > spin_lock(&hb->lock);
> > >
> > > plist_add(hb, self);
> > > smp_wmb();
> > >
> > > uval = *futex;
> > > *futex = newval;
> > > futex_wake();
> > >
> > > smp_rmb();
> > > if (plist_empty(hb))
> > >return;
> > > ...
> > 
> > This would seem to be a nicer approach indeed, without needing the
> > extra atomics.
> 
> Yep, I think we can all agree that doing this optimization without atomic
> ops is a big plus.
> 
> > 
> > Davidlohr, mind trying Thomas' approach?
> 
> I just took a quick look and it seems pretty straightforward, but not
> without some details to consider. We basically have to redo/reorder
> futex_wait_setup(), which checks that uval == val, and
> futex_wait_queue_me(), which adds the task to the list and blocks. Now,
> both futex_wait() and futex_wait_requeue_pi() have this logic, but since
> we don't use futex_wake() to wakeup tasks on pi futex_qs, I believe it's
> ok to only change futex_wait(), while the order of the uval checking
> doesn't matter for futex_wait_requeue_pi() so it can stay as is.

There is no mechanism which prevents a futex_wake() call on the inner
futex of the wait_requeue_pi mechanism. So no, we have to change both.
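
The queue-before-read ordering from the diagram quoted above can be sketched with C11 atomics. This is a userspace model under stated assumptions, not the kernel implementation: `struct futex_model`, `model_wait_setup()` and `model_wake_needed()` are hypothetical names, a waiter counter stands in for the hash bucket's plist, and the waker's store to the futex word is folded into the wake helper. The sketch uses sequentially consistent fences rather than the smp_wmb()/smp_rmb() pair shown in the diagram; that is a deliberate substitution, since ordering a store before a later load needs a full fence in the C11 model.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: one futex word plus a count of queued waiters. */
struct futex_model {
	atomic_int word;	/* the user space futex value */
	atomic_int waiters;	/* models "hash bucket list is not empty" */
};

/*
 * Waiter side: queue first, then read the futex word.
 * Returns true if the caller should go to sleep (it stays queued),
 * false if the value already changed (it has dequeued itself).
 */
static bool model_wait_setup(struct futex_model *f, int expected)
{
	atomic_fetch_add_explicit(&f->waiters, 1, memory_order_relaxed);
	/* Full fence: the enqueue must be visible before the word is read. */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&f->word, memory_order_relaxed) != expected) {
		/* Error path: undo the enqueue. */
		atomic_fetch_sub_explicit(&f->waiters, 1, memory_order_relaxed);
		return false;
	}
	return true;
}

/*
 * Waker side: publish the new value, then check for waiters.
 * Returns true if the slow path (take the lock, wake someone) is needed.
 */
static bool model_wake_needed(struct futex_model *f, int newval)
{
	atomic_store_explicit(&f->word, newval, memory_order_relaxed);
	/* Full fence: the store must be visible before waiters are checked. */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&f->waiters, memory_order_relaxed) != 0;
}
```

With both fences in place, either the waiter sees the new value and dequeues itself, or the waker sees the queued waiter and takes the slow path; the case where both miss each other is excluded.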

Futexes are no place for belief. Either you understand them
completely or you just leave them alone.

Thanks,

tglx


Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-25 Thread Alex Shi
On 11/25/2013 04:36 PM, Daniel Lezcano wrote:
> On 11/25/2013 01:58 AM, Alex Shi wrote:
>> On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
>>>
>>> Hi Alex,
>>>
>>> I tried your patchset on my Xeon server (2 x 4 cores) and got the
>>> following result:
>>>
>>> kernel a5d6e63323fe7799eb0e6  / + patchset
>>>
>>> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>>>27.604  38.556
>>
>> Hi Daniel, would you give the detailed server info? 2 sockets * 4
>> cores sounds like it isn't a modern machine.
> 
> Well, it is several years old now, that's true, but it still competes with
> some recent processors :)
> 
> Bi-Xeon E5345 2.33GHz / 8Mb L2 cache / 7GB FB-DIMM Memory 667 MHz /
> 300GB SSD 3Gb/s
> 
> 


It is a Core 2 CPU, quite old.
Fengguang, do you have a similar box in your system?

-- 
Thanks
Alex


Re: [PATCH 1/6] watchdog: davinci: change driver to use WDT core

2013-11-25 Thread Sekhar Nori
On Monday 18 November 2013 10:48 PM, Ivan Khoronzhuk wrote:
> @@ -211,29 +129,34 @@ static int davinci_wdt_probe(struct platform_device *pdev)
>  
>   clk_prepare_enable(wdt_clk);
>  
> - if (heartbeat < 1 || heartbeat > MAX_HEARTBEAT)
> - heartbeat = DEFAULT_HEARTBEAT;
> + wdd = _wdd;
> + wdd->info   = _wdt_info;
> + wdd->ops= _wdt_ops;
> + wdd->min_timeout= 1;
> + wdd->max_timeout= MAX_HEARTBEAT;

Some checkpatch warnings. Please fix.

WARNING: please, no space before tabs
#273: FILE: drivers/watchdog/davinci_wdt.c:135:
+^Iwdd->min_timeout ^I= 1;$

WARNING: please, no space before tabs
#274: FILE: drivers/watchdog/davinci_wdt.c:136:
+^Iwdd->max_timeout ^I= MAX_HEARTBEAT;$

total: 0 errors, 2 warnings, 0 checks, 249 lines checked

Thanks,
sekhar


Re: [PATCH] Cpufreq: Change sysfs interface cpuinfo_cur_freq access privilege

2013-11-25 Thread Lan Tianyu

On 11/25/2013 07:26 PM, Rafael J. Wysocki wrote:
> On Monday, November 25, 2013 01:33:39 PM Lan Tianyu wrote:
>> On 2013-11-25 12:30, Viresh Kumar wrote:
>>> On 25 November 2013 08:23, Lan Tianyu wrote:
>>>> Currently, cpuinfo_cur_freq is only accessible to the root user while
>>>> other cpufreq sysfs interfaces (e.g. scaling_cur_freq) are available
>>>> to ordinary users. This seems to make no sense. This patch changes
>>>> it.
>>>
>>> There is nothing wrong with the code and so this is more of a design
>>> change.
>>>
>>> Probably Rafael can help us here as cpuinfo_cur_freq will read stuff
>>> directly from hardware instead of using the cached value in software.
>>
>> I think so, too. I also tried to find the reason for the privilege in
>> git log, but the code was there before the Linux kernel was migrated
>> to a git repository.
>
> And it has always behaved in the same way?  Then I wouldn't change it.

It has been there since 2.6.12-rc2 or earlier. But cpuinfo_cur_freq is
read-only and seems harmless.


Request from bug 65611.
https://bugzilla.kernel.org/show_bug.cgi?id=65611

Thanks!

--
Best Regards
Tianyu Lan


Re: [PATCH] Make the mtdblock read/write skip the bad nand sector

2013-11-25 Thread Ezequiel Garcia
On Mon, Nov 25, 2013 at 07:30:33PM +0800, Hans Zhang wrote:
> On 2013/11/25 18:23, Richard Genoud wrote:
> >
> > Well, yes, write through the char device would be a solution.
> >> But, *why* are you writing through mtdblock instead?
> >>
> >>> I think that maybe it's an optional approach through mtdblock in case we
> >>> do not have the mtd-tools in our environments; we do provide a simpler
> >>> way to write the NAND through mtdblock.
> >>>
> >> Uh? simpler? Writing through mtdchar is as simple as it gets:
> >>
> >>   $ cat some_file.img > /dev/mtd0
> >>
> >> Sorry, but I'm still confused at what are you trying to accomplish.
> > I think that what Hans wants to do is:
> >  $ cat some_file.img > /dev/mtd0
> > And that doesn't fail on a bad block but jumps over it.
> > ... Which is a bad idea.
> > But, like you, I haven't figured out why mtdblock instead of mtdchar.
> >
> >
> 
> I'm sorry, it's my mistake. I thought the NAND needs to be erased explicitly
> in userspace before being written through the mtdchar device. That's why I
> used mtdblock instead of mtdchar.
> 

Your understanding is correct: NAND *must* be erased explicitly in userspace
before writing. However, keep in mind the following additional constraints:

* Writing should always be performed using 'nandwrite',
  not tools such as 'cat' or 'dd'.

* mtdblock shouldn't be used to access the NAND directly from
  userspace. AFAICS, the primary use of mtdblock is to be able to
  mount JFFS2.
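
To illustrate why a bad-block-aware writer matters, here is a rough sketch in
C of the skipping idea. It is not nandwrite's code: the in-memory bad-block
table, the function name map_skip_bad() and the block numbering are all
assumptions made up for the example. It only shows how a writer can map the
erase blocks of an image onto the good blocks of a partition, which plain
'cat' or 'dd' will not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Map up to `count` image blocks onto physical erase blocks, scanning
 * forward from `start` and skipping every block marked bad.
 *
 * is_bad  - hypothetical bad-block table, one flag per erase block
 * nblocks - number of erase blocks in the partition
 * phys    - output: physical block chosen for each image block
 *
 * Returns how many image blocks could be placed; this is less than
 * `count` when the partition runs out of good blocks.
 */
static size_t map_skip_bad(const bool *is_bad, size_t nblocks, size_t start,
			   size_t *phys, size_t count)
{
	size_t mapped = 0;

	assert(is_bad && phys);

	for (size_t blk = start; blk < nblocks && mapped < count; blk++) {
		if (is_bad[blk])
			continue;	/* never write to a bad block */
		phys[mapped++] = blk;
	}
	return mapped;
}
```

A caller would erase and write only the blocks listed in phys[], and treat a
short return value as "image does not fit".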

Out of curiosity, what's your NAND layout? What FS are you using?
Unless you have some special requirement, you should be using UBI to
access the device (and not MTD).

Just a suggestion...
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com


Re: [PATCH 1/2] dt-bindings: add hym8563 binding

2013-11-25 Thread Mark Rutland
On Fri, Nov 22, 2013 at 09:55:03PM +, Heiko Stübner wrote:
> Add binding documentation for the hym8563 rtc chip.
> 
> Signed-off-by: Heiko Stuebner 
> ---
>  .../devicetree/bindings/rtc/haoyu,hym8563.txt  |   29 
> 
>  1 file changed, 29 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt
> 
> diff --git a/Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt b/Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt
> new file mode 100644
> index 000..2743416
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/rtc/haoyu,hym8563.txt
> @@ -0,0 +1,29 @@
> +Haoyu Microelectronics HYM8563 Real Time Clock
> +
> +The HYM8563 provides basic rtc and alarm functionality
> +as well as a clock output of up to 32kHz.
> +
> +Required properties:
> +- compatible: should be: "haoyu,hym8563"

The "haoyu" vendor prefix will need to be documented (I couldn't spot it
in mainline's vendor-prefixes.txt).

> +- reg: i2c address
> +- gpios: interrupt gpio

What's this used for exactly?

Thanks,
Mark.


Re: [PATCH net] macvtap: fix tx_dropped counting error

2013-11-25 Thread Michael S. Tsirkin
On Mon, Nov 25, 2013 at 05:19:04PM +0800, Jason Wang wrote:
> After commit 8ffab51b3dfc54876f145f15b351c41f3f703195
> (macvlan: lockless tx path), tx stat counters were converted to a percpu
> stat structure. So we need to use this also for tx_dropped in macvtap.
> Otherwise, management won't notice dropped packets in the macvtap tx path.
> 
> Cc: Michael S. Tsirkin 
> Cc: Vlad Yasevich 
> Cc: Eric Dumazet 
> Signed-off-by: Jason Wang 


Acked-by: Michael S. Tsirkin 

> ---
>  drivers/net/macvtap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index dc76670..0605da8 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -744,7 +744,7 @@ err:
>   rcu_read_lock();
>   vlan = rcu_dereference(q->vlan);
>   if (vlan)
> - vlan->dev->stats.tx_dropped++;
> + this_cpu_inc(vlan->pcpu_stats->tx_dropped);
>   rcu_read_unlock();
>  
>   return err;
> -- 
> 1.8.3.2
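
For readers following along, the reasoning in the commit message can be
sketched in plain C. This is a userspace illustration and not the macvlan
code: NR_CPUS, struct dev_stats and both helper names are hypothetical, and
it assumes the code that reports statistics sums only the per-CPU slots. A
drop counted anywhere else would then never show up in the reported total.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for the sketch */

/* One slot per CPU, so the tx path can count without locking. */
struct pcpu_stats {
	uint64_t tx_packets;
	uint64_t tx_dropped;
};

struct dev_stats {
	struct pcpu_stats pcpu[NR_CPUS];
};

/* Count a dropped packet in the slot of the CPU that dropped it. */
static void stats_inc_tx_dropped(struct dev_stats *s, unsigned int cpu)
{
	assert(s && cpu < NR_CPUS);
	s->pcpu[cpu].tx_dropped++;
}

/* Reader side: the reported value is the sum over all per-CPU slots. */
static uint64_t stats_sum_tx_dropped(const struct dev_stats *s)
{
	uint64_t total = 0;

	assert(s);
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		total += s->pcpu[cpu].tx_dropped;
	return total;
}
```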

