Re: [PATCH V2] usbnet: fix race condition caused spinlock bad magic issue

2013-11-12 Thread Oliver Neukum
On Wed, 2013-11-13 at 09:11 +0800, wangbiao wrote:

Hi,

> 2, for the same reason, dev->wait should be checked again before it is used,
> because between the check (if (dev->wait)) and the use (wake_up(dev->wait)),
> dev->wait may be set to NULL by another CPU.
> 
> for issue 1, declare unlink_wakeup at file scope instead of on the stack.
> for issue 2, keep the value of dev->wait in a temporary local variable
> on the stack and check it before use.
> 
> Signed-off-by: wang, biao 
> Acked-by: Ingo Molnar 
> Acked-by: Oliver Neukum 

Well, I didn't exactly ack this version. I requested a change
to the last version, which you made, but you also did other things.

> Acked-by: Zhang, Di 
> ---
>  drivers/net/usb/usbnet.c |6 --
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
> index 90a429b..0603ef6 100644
> --- a/drivers/net/usb/usbnet.c
> +++ b/drivers/net/usb/usbnet.c
> @@ -86,6 +86,7 @@ static const char driver_name [] = "usbnet";
>  
>  /* use ethtool to change the level for any given device */
>  static int msg_level = -1;
> +static DECLARE_WAIT_QUEUE_HEAD(unlink_wakeup);
>  module_param (msg_level, int, 0);
>  MODULE_PARM_DESC (msg_level, "Override default message level");
>  
> @@ -761,7 +762,6 @@ EXPORT_SYMBOL_GPL(usbnet_unlink_rx_urbs);
>  // precondition: never called in_interrupt
>  static void usbnet_terminate_urbs(struct usbnet *dev)
>  {
> - DECLARE_WAIT_QUEUE_HEAD_ONSTACK(unlink_wakeup);
>   DECLARE_WAITQUEUE(wait, current);
>   int temp;

That solves the first problem.
 
> @@ -1449,7 +1449,9 @@ static void usbnet_bh (unsigned long param)
>   // waiting for all pending urbs to complete?
>   if (dev->wait) {
>   if ((dev->txq.qlen + dev->rxq.qlen + dev->done.qlen) == 0) {
> - wake_up (dev->wait);
> + wait_queue_head_t *wait_d = dev->wait;
> + if (wait_d)

Here's the window. Either it can be freed or not. Moving the check
won't help.

> + wake_up(wait_d);
>   }
>  
>   // or are we maybe short a few urbs?

Regards
Oliver
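
To make the window concrete: a check followed by a use is only safe if
nothing can clear or free the object in between. What follows is a minimal
userspace sketch, not the usbnet fix. struct dev, struct waiter,
dev_set_waiter() and dev_wake() are made-up names, and a pthread mutex
stands in for whatever lock a driver would use. The check and the use sit
in one critical section, so the pointer cannot change between them.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the object that dev->wait points to. */
struct waiter {
    int woken;
};

/* Hypothetical stand-in for struct usbnet; only the relevant fields. */
struct dev {
    pthread_mutex_t lock;
    struct waiter *wait;    /* NULL when nobody is waiting */
};

/* Publish or clear the waiter. The owner clears it before the waiter
 * object goes out of scope. */
static void dev_set_waiter(struct dev *d, struct waiter *w)
{
    assert(d != NULL);
    pthread_mutex_lock(&d->lock);
    d->wait = w;
    pthread_mutex_unlock(&d->lock);
}

/* Check and use under the same lock. Returns 1 if a waiter was woken. */
static int dev_wake(struct dev *d)
{
    int woke = 0;

    assert(d != NULL);
    pthread_mutex_lock(&d->lock);
    if (d->wait) {
        d->wait->woken++;
        woke = 1;
    }
    pthread_mutex_unlock(&d->lock);
    return woke;
}
```

Copying the pointer into a local variable without such a lock only narrows
the window; it does not close it.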



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 09/17] [m68k] IRQ: add handle_polled_irq() for timer based soft interrupts

2013-11-12 Thread Michael Schmitz

Thomas,

I suppose setting the flag can be done in the corresponding irq startup
function, instead of when setting up the irq controller?


irq_startup() is called with irq_desc->lock held and
irq_set_status_flags() wants desc->lock as well. Deadlock


Thanks, point taken.


And no, you don't want to fiddle manually in the irq descriptor data
fields. See commit a6967caf00eb :)


I'm not that crazy :-)



Geert - I will send the patch to ataints.c implementing this as soon as
Thomas' fix is merged.


I'll expedite it for 3.13.


Thanks, that'll help.

Cheers,

Michael




Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-m68k" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




Re: [PATCH v4 2/2] e2fsck: Correct ext4 dates generated by old kernels.

2013-11-12 Thread Andreas Dilger
On Nov 13, 2013, at 12:00 AM, David Turner  wrote:
> This patch is against e2fsprogs.
> 
> ---
> Older kernels on 64-bit machines would incorrectly encode pre-1970
> ext4 dates as post-2311 dates.  Detect and correct this (assuming the
> current date is before 2311).
> 
> Signed-off-by: David Turner 
> ---
> e2fsck/pass1.c   | 37 +
> e2fsck/problem.c |  7 +++
> e2fsck/problem.h |  6 ++
> 3 files changed, 50 insertions(+)
> 
> diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
> index ab23e42..cb72964 100644
> --- a/e2fsck/pass1.c
> +++ b/e2fsck/pass1.c
> @@ -348,6 +348,23 @@ fix:
>   EXT2_INODE_SIZE(sb), "pass1");
> }
> 
> +#define EXT4_EPOCH_BITS 2
> +#define EXT4_EPOCH_MASK ((1 << EXT4_EPOCH_BITS) - 1)
> +
> +static int large_inode_extra(__u32 xtime, __u32 extra) {

“large_inode_extra()” doesn’t really describe the purpose of this function very
well, which is checking the timestamps to have large future (or past) dates.
How about a more descriptive name “check_old_ext4_negative_epoch()” or
maybe “check_inode_extra_negative_epoch()” or something like that?

> + return (xtime & (1 << 31)) != 0 &&
> + (extra & EXT4_EPOCH_MASK) == EXT4_EPOCH_MASK;
> +}
> +
> +#define LARGE_INODE_EXTRA(inode, xtime) \

Ditto.

> + large_inode_extra(inode->i_##xtime, \
> +   inode->i_##xtime##_extra)
> +
> +/* When the date is earlier than 2311, we assume that atimes, ctimes,
> + * and mtimes greater than 2311 are actually pre-1970 dates mis-encoded.

I like the idea of checking the current date, so that there isn’t a need to
revert this code at some point in the future.  I’m wondering if there should
be a margin, like “When the current date is earlier than 2240 we assume ...
times greater than 2311 are actually ...”?   I’d hope that old versions of
pre-3.14 kernels are not still running by then.

> +#define EXT4_EXTRA_NEGATIVE_DATE_CUTOFF 6 * (1UL << 32)

That would make this (5 * (1ULL << 32)).  I think this should be ULL so that
if there are 64-bit timestamps on 32-bit systems it will still work correctly.
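
As a small standalone sketch of the two points above (the rename and the
ULL constant), assuming only the definitions quoted from the patch: the
name NEGATIVE_DATE_CUTOFF and the helper cutoff_applies() are made up for
illustration. On a system where long is 32 bits, (1UL << 32) shifts by the
full width of the type, which is why ULL is the safer spelling.

```c
#include <assert.h>
#include <stdint.h>

#define EXT4_EPOCH_BITS 2
#define EXT4_EPOCH_MASK ((1U << EXT4_EPOCH_BITS) - 1)

/* Cutoff in the form suggested above: parenthesized, and ULL so the
 * shift is done in 64 bits even where long is only 32 bits wide. */
#define NEGATIVE_DATE_CUTOFF (5 * (1ULL << 32))

static_assert(sizeof(NEGATIVE_DATE_CUTOFF) == 8, "cutoff must be 64-bit");

/* Same test as the patch's large_inode_extra(), under a more
 * descriptive name: sign bit set in the 32-bit seconds field and
 * both epoch bits set in the extra field. */
static int check_inode_extra_negative_epoch(uint32_t xtime, uint32_t extra)
{
    return (xtime & (1U << 31)) != 0 &&
           (extra & EXT4_EPOCH_MASK) == EXT4_EPOCH_MASK;
}

/* Hypothetical helper: is "now" still before the cutoff? */
static int cutoff_applies(uint64_t now)
{
    return now < NEGATIVE_DATE_CUTOFF;
}
```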

> static void check_inode_extra_space(e2fsck_t ctx, struct problem_context *pctx)
> {
>   struct ext2_super_block *sb = ctx->fs->super;
> @@ -388,6 +405,26 @@ static void check_inode_extra_space(e2fsck_t ctx, struct problem_context *pctx)
>   /* it seems inode has an extended attribute(s) in body */
>   check_ea_in_inode(ctx, pctx);
>   }
> +
> + /*
> +  * If the inode's extended atime (ctime, mtime) is stored in
> +  * the old, invalid format, the inode is corrupt.
> +  */
> + if (sizeof(time_t) > 4 && ctx->now < EXT4_EXTRA_NEGATIVE_DATE_CUTOFF &&
> + LARGE_INODE_EXTRA(inode, atime) ||
> + LARGE_INODE_EXTRA(inode, ctime) ||
> + LARGE_INODE_EXTRA(inode, mtime)) {

(style) please align continued line after ‘(‘ of previous line, otherwise
it isn’t easy to see if these are continuations of the condition or if
they are part of the body of the condition like the lines below.
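
Alignment aside, it may be worth double-checking how the quoted condition
groups, since && binds tighter than ||. This is only a reading of the hunk
as quoted, not something established elsewhere in this thread. In the
sketch below, guard stands for the sizeof/ctx->now test and a, c, m for the
three LARGE_INODE_EXTRA() results; both function names are made up.

```c
#include <assert.h>

/* How C groups `guard && a || c || m`: the guard applies to a only. */
static int as_written(int guard, int a, int c, int m)
{
    return (guard && a) || c || m;
}

/* Grouping in which the guard covers all three timestamp checks. */
static int guard_all(int guard, int a, int c, int m)
{
    return guard && (a || c || m);
}
```

With the guard false and only the ctime check true, as_written() still
returns 1 while guard_all() returns 0.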

> + if (!fix_problem(ctx, PR_1_EA_TIME_OUT_OF_RANGE, pctx))
> + return;
> +
> + inode->i_atime_extra &= ~EXT4_EPOCH_MASK;
> + inode->i_ctime_extra &= ~EXT4_EPOCH_MASK;
> + inode->i_mtime_extra &= ~EXT4_EPOCH_MASK;
> + e2fsck_write_inode_full(ctx, pctx->ino, pctx->inode,
> + EXT2_INODE_SIZE(sb), "pass1");
> + }
> +
> }
> 
> /*
> diff --git a/e2fsck/problem.c b/e2fsck/problem.c
> index 897693a..51fa7c3 100644
> --- a/e2fsck/problem.c
> +++ b/e2fsck/problem.c
> @@ -1018,6 +1018,13 @@ static struct e2fsck_problem problem_table[] = {
> N_("@i %i, end of extent exceeds allowed value\n\t(logical @b %c, physical @b %b, len %N)\n"),
> PROMPT_CLEAR, 0 },
> 
> +/* The extended a, c, or mtime on this inode is in the far future,
> +   indicating that it was written with an older, buggy version of the
> +   kernel on a 64-bit machine */

Please make the comment match the expanded text as closely as possible.
Otherwise, it is hard to track down some problem that prints one message,
but uses the crazy @foo encodings and it isn’t clear what part of e2fsck
generated it.  It is fine if it contains more text.

> + { PR_1_EA_TIME_OUT_OF_RANGE,
> +   N_("Extended time on @i %i is in the far future.\n"
> + "Assume that it is in fact a pre-1970 date written by an older, buggy version of Linux?\n"), 

(style) please align after ‘(‘ from previous line and under 80 columns.

That said, I think this message is a bit harsh, since those older, buggy
versions of Linux include all versions running today.  I’d probably make
a more succinct message like:

   N_(“Timestamp(s) on @i %i beyond 2033 are likely pre-1970 dates.\n”)

> + PROMPT_FIX, 0 },

I’d probably also make this error code “PROMPT_FIX | PREEN_OK 

[patch] aio: checking for NULL instead of IS_ERR

2013-11-12 Thread Dan Carpenter
alloc_anon_inode() returns an ERR_PTR(), it doesn't return NULL.

Fixes: 71ad7490c1f3 ('rework aio migrate pages to use aio fs')
Signed-off-by: Dan Carpenter 

diff --git a/fs/aio.c b/fs/aio.c
index bf8d080..699f53e 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -164,8 +164,8 @@ static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
struct file *file;
struct path path;
struct inode *inode = alloc_anon_inode(aio_mnt->mnt_sb);
-   if (!inode)
-   return ERR_PTR(-ENOMEM);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
 
inode->i_mapping->a_ops = &aio_ctx_aops;
inode->i_mapping->private_data = ctx;
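
For readers unfamiliar with the convention, here is a minimal userspace
re-creation of the error-pointer idea, showing why a NULL check misses the
failure. The lowercase helpers err_ptr(), ptr_err() and is_err() only mimic
the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); demo_alloc() is a made-up
stand-in for a function that reports failure the way alloc_anon_inode()
does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

static int demo_object;

/* Hypothetical allocator: never returns NULL, fails with an error pointer. */
static void *demo_alloc(int fail)
{
    return fail ? err_ptr(-ENOMEM) : (void *)&demo_object;
}
```

A failed demo_alloc() returns a non-NULL pointer, so `if (!p)` lets the
error through; only the is_err() test catches it.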


[patch] dmaengine: edma: double free on error in edma_prep_slave_sg()

2013-11-12 Thread Dan Carpenter
We accidentally applied two correct but duplicative fixes for a memory
leak here:
4b6271a64463 ('dma: edma: Fix memory leak')
2f6d8fad0a16 ('dma: edma.c: remove edma_desc leakage')

Signed-off-by: Dan Carpenter 

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index ea4abaa..9c8103d 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -420,7 +420,6 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
edma_alloc_slot(EDMA_CTLR(echan->ch_num),
EDMA_SLOT_ANY);
if (echan->slot[i] < 0) {
-   kfree(edesc);
dev_err(dev, "Failed to allocate slot\n");
kfree(edesc);
return NULL;
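
One common way to make this kind of duplicate harder to introduce is to
route every failure through a single cleanup label, so there is exactly one
place where the descriptor is freed. The sketch below is userspace C with
made-up names (prep_desc(), counted_free(), free_calls); it is not the edma
code.

```c
#include <stdlib.h>

static int free_calls;    /* instrumentation for this example only */

static void counted_free(void *p)
{
    free_calls++;
    free(p);
}

/* Hypothetical prepare function: one allocation, one error path. */
static int *prep_desc(int slot_ok)
{
    int *desc = malloc(sizeof(*desc));

    if (!desc)
        return NULL;
    if (!slot_ok)
        goto err_free;

    *desc = 42;
    return desc;

err_free:
    counted_free(desc);    /* the only free on the failure path */
    return NULL;
}
```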


[PATCH 1/4] mfd: max14577: Add max14577 MFD driver core

2013-11-12 Thread Krzysztof Kozlowski
From: Chanwoo Choi 

This patch adds the max14577 core/irq driver to support the MUIC (Micro USB
IC) device and the charger device, using an irq domain to control the
internal interrupts of the max14577. The patch also supports DT binding
through max14577_i2c_parse_dt().

The MAXIM 14577 chip contains a Micro-USB Interface Circuit and a Li+ Battery
Charger. It contains accessory and USB charger detection logic. It supports
USB 2.0 Hi-Speed, UART and stereo audio signals over the Micro-USB connector.

The battery charger is compliant with the USB Battery Charging Specification
Revision 1.1. It also has an SFOUT LDO output for powering USB devices.

Signed-off-by: Chanwoo Choi 
Signed-off-by: Krzysztof Kozlowski 
Signed-off-by: Kyungmin Park 
---
 drivers/mfd/Kconfig  |   13 ++
 drivers/mfd/Makefile |1 +
 drivers/mfd/max14577-irq.c   |  283 +
 drivers/mfd/max14577.c   |  268 +++
 include/linux/mfd/max14577-private.h |  291 ++
 include/linux/mfd/max14577.h |   76 +
 6 files changed, 932 insertions(+)
 create mode 100644 drivers/mfd/max14577-irq.c
 create mode 100644 drivers/mfd/max14577.c
 create mode 100644 include/linux/mfd/max14577-private.h
 create mode 100644 include/linux/mfd/max14577.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 914c3d1..f2a1a76 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -309,6 +309,19 @@ config MFD_88PM860X
  select individual components like voltage regulators, RTC and
  battery-charger under the corresponding menus.
 
+config MFD_MAX14577
+   bool "Maxim Semiconductor MAX14577 MUIC + Charger Support"
+   depends on I2C=y
+   select MFD_CORE
+   select REGMAP_I2C
+   select IRQ_DOMAIN
+   help
+ Say yes here to support for Maxim Semiconductor MAX14577.
+ This is a Micro-USB IC with Charger controls on chip.
+ This driver provides common support for accessing the device;
+ additional drivers must be enabled in order to use the functionality
+ of the device.
+
 config MFD_MAX77686
bool "Maxim Semiconductor MAX77686 PMIC Support"
depends on I2C=y
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 15b905c..548c87d 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -110,6 +110,7 @@ obj-$(CONFIG_MFD_DA9055)+= da9055.o
 da9063-objs:= da9063-core.o da9063-irq.o da9063-i2c.o
 obj-$(CONFIG_MFD_DA9063)   += da9063.o
 
+obj-$(CONFIG_MFD_MAX14577) += max14577.o max14577-irq.o
 obj-$(CONFIG_MFD_MAX77686) += max77686.o max77686-irq.o
 obj-$(CONFIG_MFD_MAX77693) += max77693.o max77693-irq.o
 obj-$(CONFIG_MFD_MAX8907)  += max8907.o
diff --git a/drivers/mfd/max14577-irq.c b/drivers/mfd/max14577-irq.c
new file mode 100644
index 000..9cd8012
--- /dev/null
+++ b/drivers/mfd/max14577-irq.c
@@ -0,0 +1,283 @@
+/*
+ * max14577-irq.c - MFD Interrupt controller support for MAX14577
+ *
+ * Copyright (C) 2013 Samsung Electronics Co.Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ * This driver is based on max8997-irq.c
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/**
+ * After resuming from suspend it may happen that IRQ is signalled but
+ * IRQ GPIO is not high. Also the interrupt registers won't have any data
+ * (all of them equal to 0x00).
+ *
+ * In such case retry few times reading the interrupt registers.
+ */
+#define IRQ_READ_REG_RETRY_CNT 5
+
+static const u8 max14577_mask_reg[] = {
+   [MAX14577_IRQ_INT1] = MAX14577_REG_INTMASK1,
+   [MAX14577_IRQ_INT2] = MAX14577_REG_INTMASK2,
+   [MAX14577_IRQ_INT3] = MAX14577_REG_INTMASK3,
+};
+
+struct max14577_irq_data {
+   int mask;
+   enum max14577_irq_source group;
+};
+
+#define DECLARE_IRQ(idx, _group, _mask)\
+   [(idx)] = { .group = (_group), .mask = (_mask) }
+static const struct max14577_irq_data max14577_irqs[] = {
+   DECLARE_IRQ(MAX14577_IRQ_INT1_ADC,  MAX14577_IRQ_INT1, 1 << 0),
+   DECLARE_IRQ(MAX14577_IRQ_INT1_ADCLOW,   MAX14577_IRQ_INT1, 1 << 1),
+   DECLARE_IRQ(MAX14577_IRQ_INT1_ADCERR,   
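
(The quoted patch is cut off at this point.) The retry described in the
IRQ_READ_REG_RETRY_CNT comment above can be sketched in plain C as follows.
This is an illustration under assumptions, not the driver's code: the
callback type read_irq_fn, read_irq_regs_retry() and the test double
fake_read() are all made-up names.

```c
#include <stddef.h>

#define IRQ_READ_REG_RETRY_CNT 5

/* Hypothetical reader: fills the three interrupt registers, returns <0 on error. */
typedef int (*read_irq_fn)(void *ctx, unsigned char regs[3]);

/*
 * Read the interrupt registers, retrying while they all read 0x00.
 * Returns the number of attempts used (1..IRQ_READ_REG_RETRY_CNT),
 * 0 if every attempt read all zeroes, or a negative error code.
 */
static int read_irq_regs_retry(read_irq_fn read, void *ctx,
                               unsigned char regs[3])
{
    for (int i = 0; i < IRQ_READ_REG_RETRY_CNT; i++) {
        int ret = read(ctx, regs);

        if (ret < 0)
            return ret;
        if (regs[0] | regs[1] | regs[2])
            return i + 1;
    }
    return 0;
}

/* Test double: returns all zeroes for the first *ctx calls, then data. */
static int fake_read(void *ctx, unsigned char regs[3])
{
    int *zeros_left = ctx;

    regs[0] = regs[1] = regs[2] = 0;
    if (*zeros_left > 0) {
        (*zeros_left)--;
        return 0;
    }
    regs[0] = 0x01;
    return 0;
}
```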

[PATCH 2/4] extcon: max77693: Add extcon-max14577 driver to support MUIC device

2013-11-12 Thread Krzysztof Kozlowski
From: Chanwoo Choi 

This patch supports the Maxim MAX14577 MUIC (Micro USB Interface Controller)
device by using the EXTCON subsystem to handle various external connectors.
The max14577 device uses regmap for i2c communication and
supports an irq domain.

Signed-off-by: Chanwoo Choi 
Signed-off-by: Krzysztof Kozlowski 
Signed-off-by: Kyungmin Park 
---
 drivers/extcon/Kconfig   |   10 +
 drivers/extcon/Makefile  |1 +
 drivers/extcon/extcon-max14577.c |  806 ++
 3 files changed, 817 insertions(+)
 create mode 100644 drivers/extcon/extcon-max14577.c

diff --git a/drivers/extcon/Kconfig b/drivers/extcon/Kconfig
index f1d54a3..bdb5a00 100644
--- a/drivers/extcon/Kconfig
+++ b/drivers/extcon/Kconfig
@@ -31,6 +31,16 @@ config EXTCON_ADC_JACK
help
  Say Y here to enable extcon device driver based on ADC values.
 
+config EXTCON_MAX14577
+   tristate "MAX14577 EXTCON Support"
+   depends on MFD_MAX14577
+   select IRQ_DOMAIN
+   select REGMAP_I2C
+   help
+ If you say yes here you get support for the MUIC device of
+ Maxim MAX14577 PMIC. The MAX14577 MUIC is a USB port accessory
+ detector and switch.
+
 config EXTCON_MAX77693
tristate "MAX77693 EXTCON Support"
depends on MFD_MAX77693 && INPUT
diff --git a/drivers/extcon/Makefile b/drivers/extcon/Makefile
index 759fdae..43eccc0 100644
--- a/drivers/extcon/Makefile
+++ b/drivers/extcon/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_OF_EXTCON) += of_extcon.o
 obj-$(CONFIG_EXTCON)   += extcon-class.o
 obj-$(CONFIG_EXTCON_GPIO)  += extcon-gpio.o
 obj-$(CONFIG_EXTCON_ADC_JACK)  += extcon-adc-jack.o
+obj-$(CONFIG_EXTCON_MAX14577)  += extcon-max14577.o
 obj-$(CONFIG_EXTCON_MAX77693)  += extcon-max77693.o
 obj-$(CONFIG_EXTCON_MAX8997)   += extcon-max8997.o
 obj-$(CONFIG_EXTCON_ARIZONA)   += extcon-arizona.o
diff --git a/drivers/extcon/extcon-max14577.c b/drivers/extcon/extcon-max14577.c
new file mode 100644
index 000..e629d81
--- /dev/null
+++ b/drivers/extcon/extcon-max14577.c
@@ -0,0 +1,806 @@
+/*
+ * extcon-max14577.c - MAX14577 extcon driver to support MAX14577 MUIC
+ *
+ * Copyright (C) 2013 Samsung Electronics
+ * Chanwoo Choi 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DEV_NAME            "max14577-muic"
+#define DELAY_MS_DEFAULT    17000   /* unit: millisecond */
+
+enum max14577_muic_adc_debounce_time {
+   ADC_DEBOUNCE_TIME_5MS = 0,
+   ADC_DEBOUNCE_TIME_10MS,
+   ADC_DEBOUNCE_TIME_25MS,
+   ADC_DEBOUNCE_TIME_38_62MS,
+};
+
+enum max14577_muic_status {
+   MAX14577_MUIC_STATUS1 = 0,
+   MAX14577_MUIC_STATUS2 = 1,
+   MAX14577_MUIC_STATUS_END,
+};
+
+struct max14577_muic_info {
+   struct device *dev;
+   struct max14577 *max14577;
+   struct extcon_dev *edev;
+   int prev_cable_type;
+   int prev_chg_type;
+   u8 status[MAX14577_MUIC_STATUS_END];
+
+   bool irq_adc;
+   bool irq_chg;
+   struct work_struct irq_work;
+   struct mutex mutex;
+
+   /*
+* Use delayed workqueue to detect cable state and then
+* notify cable state to notifiee/platform through uevent.
+* After completing the booting of platform, the extcon provider
+* driver should notify cable state to upper layer.
+*/
+   struct delayed_work wq_detcable;
+
+   /*
+* Default usb/uart path whether UART/USB or AUX_UART/AUX_USB
+* h/w path of COMP2/COMN1 on CONTROL1 register.
+*/
+   int path_usb;
+   int path_uart;
+};
+
+enum max14577_muic_cable_group {
+   MAX14577_CABLE_GROUP_ADC = 0,
+   MAX14577_CABLE_GROUP_CHG,
+   MAX14577_CABLE_GROUP_VBVOLT,
+};
+
+/**
+ * struct max14577_muic_irq
+ * @irq: the index of irq list of MUIC device.
+ * @name: the name of irq.
+ * @virq: the virtual irq to use irq domain
+ */
+struct max14577_muic_irq {
+   unsigned int irq;
+   const char *name;
+   unsigned int virq;
+};
+
+static struct max14577_muic_irq muic_irqs[] = {
+   { MAX14577_IRQ_INT1_ADC,"muic-ADC" },
+   { MAX14577_IRQ_INT1_ADCLOW, "muic-ADCLOW" },
+   { MAX14577_IRQ_INT1_ADCERR, "muic-ADCError" },
+   { MAX14577_IRQ_INT2_CHGTYP, "muic-CHGTYP" },
+   { MAX14577_IRQ_INT2_CHGDETRUN,  "muic-CHGDETRUN" },
+

[PATCH 3/4] charger: max14577: Add charger support for Maxim 14577

2013-11-12 Thread Krzysztof Kozlowski
MAX14577 chip is a multi-function device which includes MUIC, charger
and voltage regulator. The driver is located in drivers/mfd.

This patch supports battery charging control of MAX14577 chip and
provides power supply class information to userspace.

Signed-off-by: Krzysztof Kozlowski 
Signed-off-by: Kyungmin Park 
---
 drivers/power/Kconfig|7 +
 drivers/power/Makefile   |1 +
 drivers/power/max14577_charger.c |  327 ++
 3 files changed, 335 insertions(+)
 create mode 100644 drivers/power/max14577_charger.c

diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig
index e6f92b45..fb789df 100644
--- a/drivers/power/Kconfig
+++ b/drivers/power/Kconfig
@@ -316,6 +316,13 @@ config CHARGER_MANAGER
   runtime and in suspend-to-RAM by waking up the system periodically
   with help of suspend_again support.
 
+config CHARGER_MAX14577
+   tristate "Maxim MAX14577 MUIC battery charger driver"
+   depends on MFD_MAX14577
+   help
+ Say Y to enable support for the battery charger control sysfs and
+ platform data of MAX14577 MUICs.
+
 config CHARGER_MAX8997
tristate "Maxim MAX8997/MAX8966 PMIC battery charger driver"
depends on MFD_MAX8997 && REGULATOR_MAX8997
diff --git a/drivers/power/Makefile b/drivers/power/Makefile
index a4b7417..aa30084 100644
--- a/drivers/power/Makefile
+++ b/drivers/power/Makefile
@@ -48,6 +48,7 @@ obj-$(CONFIG_CHARGER_LP8727)  += lp8727_charger.o
 obj-$(CONFIG_CHARGER_LP8788)   += lp8788-charger.o
 obj-$(CONFIG_CHARGER_GPIO) += gpio-charger.o
 obj-$(CONFIG_CHARGER_MANAGER)  += charger-manager.o
+obj-$(CONFIG_CHARGER_MAX14577) += max14577_charger.o
 obj-$(CONFIG_CHARGER_MAX8997)  += max8997_charger.o
 obj-$(CONFIG_CHARGER_MAX8998)  += max8998_charger.o
 obj-$(CONFIG_CHARGER_BQ2415X)  += bq2415x_charger.o
diff --git a/drivers/power/max14577_charger.c b/drivers/power/max14577_charger.c
new file mode 100644
index 000..614be6d
--- /dev/null
+++ b/drivers/power/max14577_charger.c
@@ -0,0 +1,327 @@
+/*
+ * max14577_charger.c - Battery charger driver for the Maxim 14577
+ *
+ * Copyright (C) 2013 Samsung Electronics
+ * Krzysztof Kozlowski 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+struct max14577_charger {
+   struct device *dev;
+   struct max14577 *max14577;
+   struct power_supply charger;
+
+   unsigned int charging_state;
+   unsigned int battery_state;
+};
+
+static int max14577_get_charger_state(struct max14577_charger *chg)
+{
+   struct regmap *rmap = chg->max14577->regmap;
+   int state = POWER_SUPPLY_STATUS_DISCHARGING;
+   u8 reg_data;
+
+   /*
+* Charging occurs only if:
+*  - CHGCTRL2/MBCHOSTEN == 1
+*  - STATUS2/CGMBC == 1
+*
+* TODO:
+*  - handle FULL after Top-off timer (EOC register may be off
+*and the charger won't be charging although MBCHOSTEN is on)
+*  - handle properly dead-battery charging (respect timer)
+*  - handle timers (fast-charge and prequal) /MBCCHGERR/
+*/
+   max14577_read_reg(rmap, MAX14577_CHG_REG_CHG_CTRL2, &reg_data);
+   if ((reg_data & CHGCTRL2_MBCHOSTEN_MASK) == 0)
+   goto state_set;
+
+   max14577_read_reg(rmap, MAX14577_CHG_REG_STATUS3, &reg_data);
+   if (reg_data & STATUS3_CGMBC_MASK) {
+   /* Charger or USB-cable is connected */
+   if (reg_data & STATUS3_EOC_MASK)
+   state = POWER_SUPPLY_STATUS_FULL;
+   else
+   state = POWER_SUPPLY_STATUS_CHARGING;
+   goto state_set;
+   }
+
+state_set:
+   chg->charging_state = state;
+   return state;
+}
+
+/*
+ * Supported charge types:
+ *  - POWER_SUPPLY_CHARGE_TYPE_NONE
+ *  - POWER_SUPPLY_CHARGE_TYPE_FAST
+ */
+static int max14577_get_charge_type(struct max14577_charger *chg)
+{
+   /*
+* TODO: CHARGE_TYPE_TRICKLE (VCHGR_RC or EOC)?
+* As spec says:
+* [after reaching EOC interrupt]
+* "When the battery is fully charged, the 30-minute (typ)
+*  top-off timer starts. The device continues to trickle
+*  charge the battery until the top-off timer 
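
(The quoted patch is cut off at this point.) The decision made in
max14577_get_charger_state() above can be written as a pure function of the
two register values, which makes it easy to check by hand. The sketch below
is an illustration only: the DEMO_* bit positions are invented, the real
masks are defined in max14577-private.h, and demo_charger_state() is a
made-up name.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_MBCHOSTEN_MASK 0x10    /* in CHG_CTRL2 */
#define DEMO_CGMBC_MASK     0x40    /* in STATUS3 */
#define DEMO_EOC_MASK       0x20    /* in STATUS3 */

enum demo_state { DEMO_DISCHARGING, DEMO_CHARGING, DEMO_FULL };

/* Same order of checks as the quoted function: charger enabled,
 * then charger present, then end-of-charge. */
static enum demo_state demo_charger_state(uint8_t chg_ctrl2, uint8_t status3)
{
    if ((chg_ctrl2 & DEMO_MBCHOSTEN_MASK) == 0)
        return DEMO_DISCHARGING;
    if ((status3 & DEMO_CGMBC_MASK) == 0)
        return DEMO_DISCHARGING;
    return (status3 & DEMO_EOC_MASK) ? DEMO_FULL : DEMO_CHARGING;
}
```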

[PATCH 4/4] regulator: max14577: Add regulator driver for Maxim 14577

2013-11-12 Thread Krzysztof Kozlowski
MAX14577 chip is a multi-function device which includes MUIC,
charger and voltage regulator. The driver is located in drivers/mfd.

This patch adds regulator driver for MAX14577 chip. There are two
regulators in this chip:
1. Safeout LDO with constant voltage output of 4.9V. It can be only
   enabled or disabled.
2. Current regulator for the charger. It provides current from 90mA up
   to 950mA.
Driver supports Device Tree.

Signed-off-by: Krzysztof Kozlowski 
Signed-off-by: Kyungmin Park 
---
 drivers/regulator/Kconfig|7 +
 drivers/regulator/Makefile   |1 +
 drivers/regulator/max14577.c |  365 ++
 3 files changed, 373 insertions(+)
 create mode 100644 drivers/regulator/max14577.c

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index ce785f4..11ee053 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -249,6 +249,13 @@ config REGULATOR_LP8788
help
  This driver supports LP8788 voltage regulator chip.
 
+config REGULATOR_MAX14577
+   tristate "Maxim 14577 regulator"
+   depends on MFD_MAX14577
+   help
+ This driver controls a Maxim 14577 regulator via I2C bus.
+ The regulators include safeout LDO and current regulator 'CHARGER'.
+
 config REGULATOR_MAX1586
tristate "Maxim 1586/1587 voltage regulator"
depends on I2C
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index 01c597e..654bd43 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_REGULATOR_LP872X) += lp872x.o
 obj-$(CONFIG_REGULATOR_LP8788) += lp8788-buck.o
 obj-$(CONFIG_REGULATOR_LP8788) += lp8788-ldo.o
 obj-$(CONFIG_REGULATOR_LP8755) += lp8755.o
+obj-$(CONFIG_REGULATOR_MAX14577) += max14577.o
 obj-$(CONFIG_REGULATOR_MAX1586) += max1586.o
 obj-$(CONFIG_REGULATOR_MAX8649)+= max8649.o
 obj-$(CONFIG_REGULATOR_MAX8660) += max8660.o
diff --git a/drivers/regulator/max14577.c b/drivers/regulator/max14577.c
new file mode 100644
index 000..e05c445
--- /dev/null
+++ b/drivers/regulator/max14577.c
@@ -0,0 +1,365 @@
+/*
+ * max14577.c - Regulator driver for the Maxim 14577
+ *
+ * Copyright (C) 2013 Samsung Electronics
+ * Krzysztof Kozlowski 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct max14577_regulator {
+   struct device *dev;
+   struct max14577 *max14577;
+   int num_regulators;
+   struct regulator_dev **regulators;
+};
+
+static int max14577_reg_calc_charge_current(struct regmap *rmap)
+{
+   u8 reg_data;
+   max14577_read_reg(rmap, MAX14577_CHG_REG_CHG_CTRL4, &reg_data);
+
+   if ((reg_data & CHGCTRL4_MBCICHWRCL_MASK) == 0)
+   return MAX14577_REGULATOR_CURRENT_LIMIT_MIN;
+
+   reg_data = ((reg_data & CHGCTRL4_MBCICHWRCH_MASK) >>
+   CHGCTRL4_MBCICHWRCH_SHIFT);
+   return MAX14577_REGULATOR_CURRENT_LIMIT_HIGH_START +
+   reg_data * MAX14577_REGULATOR_CURRENT_LIMIT_HIGH_STEP;
+}
+
+static inline int max14577_reg_set_safeout(struct regmap *rmap, int on)
+{
+   u8 reg_data = (on ? 0x1 << CTRL2_SFOUTORD_SHIFT : 0x0);
+
+   return max14577_update_reg(rmap, MAX14577_REG_CONTROL2,
+   CTRL2_SFOUTORD_MASK, reg_data);
+}
+
+static inline int max14577_reg_set_charging(struct regmap *rmap, int on)
+{
+   u8 reg_data = (on ? 0x1 << CHGCTRL2_MBCHOSTEN_SHIFT : 0x0);
+
+   return max14577_update_reg(rmap, MAX14577_CHG_REG_CHG_CTRL2,
+   CHGCTRL2_MBCHOSTEN_MASK, reg_data);
+}
+
+static int max14577_reg_is_enabled(struct regulator_dev *rdev)
+{
+   int rid = rdev_get_id(rdev);
+   struct regmap *rmap = rdev->regmap;
+   u8 reg_data;
+
+   switch (rid) {
+   case MAX14577_CHARGER:
+   max14577_read_reg(rmap, MAX14577_CHG_REG_CHG_CTRL2, &reg_data);
+   if ((reg_data & CHGCTRL2_MBCHOSTEN_MASK) == 0)
+   return 0;
+   max14577_read_reg(rmap, MAX14577_CHG_REG_STATUS3, &reg_data);
+   if ((reg_data & STATUS3_CGMBC_MASK) == 0)
+   return 0;
+   /* MBCHOSTEN and CGMBC are on */
+   return 1;
+   case MAX14577_SAFEOUT:
+  
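
(The quoted patch is cut off at this point.) The arithmetic in
max14577_reg_calc_charge_current() above is a base value plus a register
field times a step. The sketch below works it through with assumed numbers:
every DEMO_* value is a guess, chosen only so that the result spans the
90mA to 950mA range given in the description, and the units are microamps.
The real masks and limits come from the max14577 headers.

```c
#include <stdint.h>

/* Assumed register layout, for illustration only. */
#define DEMO_MBCICHWRCL_MASK   0x10    /* 0: low range, fixed minimum */
#define DEMO_MBCICHWRCH_MASK   0x0f    /* 4-bit step selector */
#define DEMO_MBCICHWRCH_SHIFT  0

/* Assumed limits in microamps. */
#define DEMO_CURRENT_LIMIT_MIN         90000
#define DEMO_CURRENT_LIMIT_HIGH_START 200000
#define DEMO_CURRENT_LIMIT_HIGH_STEP   50000

/* Decode a CHG_CTRL4-style register value into a current limit. */
static int demo_charge_current_ua(uint8_t chg_ctrl4)
{
    unsigned int steps;

    if ((chg_ctrl4 & DEMO_MBCICHWRCL_MASK) == 0)
        return DEMO_CURRENT_LIMIT_MIN;

    steps = (chg_ctrl4 & DEMO_MBCICHWRCH_MASK) >> DEMO_MBCICHWRCH_SHIFT;
    return DEMO_CURRENT_LIMIT_HIGH_START +
           (int)steps * DEMO_CURRENT_LIMIT_HIGH_STEP;
}
```

With these assumptions the highest selector gives 200000 + 15 * 50000 =
950000 uA, matching the upper bound in the description.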

[PATCH 0/4] mfd: max14577: Add max14577 MFD drivers

2013-11-12 Thread Krzysztof Kozlowski
Hi,

This patchset adds drivers for the MAXIM 14577 chip. The chip contains a
Micro-USB Interface Circuit and a Li+ Battery Charger. It contains accessory
and USB charger detection logic. It supports USB 2.0 Hi-Speed, UART and
stereo audio signals over the Micro-USB connector.

The battery charger is compliant with the USB Battery Charging Specification
Revision 1.1. It also has an SFOUT LDO output for powering USB devices.

The patchset consists of following drivers:
1. MFD core driver.
2. Extcon driver for the MUIC (Micro USB Interface Controller).
3. Charger driver using power supply class.
4. Regulator driver for SFOUT and charger.

The patchset is rebased on the latest Linus' tree (v3.12-4849-g10d0c97);
however, testing was mostly done on 3.10. Apart from a minor change in the
extcon_dev_register() function, the patchset applies cleanly to 3.10 and 3.12.


Best regards,
Krzysztof Kozlowski


Chanwoo Choi (2):
  mfd: max14577: Add max14577 MFD driver core
  extcon: max77693: Add extcon-max14577 driver to support MUIC device

Krzysztof Kozlowski (2):
  charger: max14577: Add charger support for Maxim 14577
  regulator: max14577: Add regulator driver for Maxim 14577

 drivers/extcon/Kconfig   |   10 +
 drivers/extcon/Makefile  |1 +
 drivers/extcon/extcon-max14577.c |  806 ++
 drivers/mfd/Kconfig  |   13 +
 drivers/mfd/Makefile |1 +
 drivers/mfd/max14577-irq.c   |  283 
 drivers/mfd/max14577.c   |  268 +++
 drivers/power/Kconfig|7 +
 drivers/power/Makefile   |1 +
 drivers/power/max14577_charger.c |  327 ++
 drivers/regulator/Kconfig|8 +
 drivers/regulator/Makefile   |1 +
 drivers/regulator/max14577.c |  365 +++
 include/linux/mfd/max14577-private.h |  291 
 include/linux/mfd/max14577.h |   76 
 15 files changed, 2458 insertions(+)
 create mode 100644 drivers/extcon/extcon-max14577.c
 create mode 100644 drivers/mfd/max14577-irq.c
 create mode 100644 drivers/mfd/max14577.c
 create mode 100644 drivers/power/max14577_charger.c
 create mode 100644 drivers/regulator/max14577.c
 create mode 100644 include/linux/mfd/max14577-private.h
 create mode 100644 include/linux/mfd/max14577.h

-- 
1.7.9.5



Re: Possible regression with cgroups in 3.11

2013-11-12 Thread Tejun Heo
Hey, guys.

cc'ing people from "workqueue, pci: INFO: possible recursive locking
detected" thread.

  http://thread.gmane.org/gmane.linux.kernel/1525779

So, to resolve that issue, we ripped out lockdep annotation from
work_on_cpu() and cgroup is now experiencing deadlock involving
work_on_cpu().  It *could* be that workqueue is actually broken or
memcg is looping but it doesn't seem like a very good idea to not have
lockdep annotation around work_on_cpu().

IIRC, there was one pci code path which called work_on_cpu()
recursively.  Would it be possible for that path to use something like
work_on_cpu_nested(XXX, depth) so that we can retain lockdep
annotation on work_on_cpu()?

Thanks.

-- 
tejun


[PATCH] Ceph: Avoid data inconsistency due to d-cache aliasing in readpage()

2013-11-12 Thread Li Wang
If the length of data to be read in readpage() is exactly
PAGE_CACHE_SIZE, the original code does not flush the d-cache
after the read finishes, so the data may be inconsistent. This patch
fixes that.

Signed-off-by: Li Wang 
---
 fs/ceph/addr.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6df8bd4..7ba 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -210,9 +210,13 @@ static int readpage_nounlock(struct file *filp, struct page *page)
if (err < 0) {
SetPageError(page);
goto out;
-   } else if (err < PAGE_CACHE_SIZE) {
+   } else {
+   if (err < PAGE_CACHE_SIZE) {
/* zero fill remainder of page */
-   zero_user_segment(page, err, PAGE_CACHE_SIZE);
+   zero_user_segment(page, err, PAGE_CACHE_SIZE);
+   } else {
+   flush_dcache_page(page);
+   }
}
SetPageUptodate(page);
 
-- 
1.7.9.5



[PATCH] userns: allow privileged user to operate locked mount

2013-11-12 Thread Gao feng
A privileged user should be allowed to mount, umount, and move
even locked mounts.

Signed-off-by: Gao feng 
---
 fs/namespace.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index da5c494..7097fc7 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1297,6 +1297,11 @@ static inline bool may_mount(void)
return ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN);
 }
 
+static inline bool may_mount_lock(struct mount *mnt)
+{
+   return !(mnt->mnt.mnt_flags & MNT_LOCKED) || capable(CAP_SYS_ADMIN);
+}
+
 /*
  * Now umount can handle mount points as well as block devices.
  * This is important for filesystems which use unnamed block devices.
@@ -1330,7 +1335,7 @@ SYSCALL_DEFINE2(umount, char __user *, name, int, flags)
goto dput_and_out;
if (!check_mnt(mnt))
goto dput_and_out;
-   if (mnt->mnt.mnt_flags & MNT_LOCKED)
+   if (!may_mount_lock(mnt))
goto dput_and_out;
 
retval = do_umount(mnt, flags);
@@ -1768,7 +1773,8 @@ static int do_loopback(struct path *path, const char *old_name,
if (!check_mnt(parent) || !check_mnt(old))
goto out2;
 
-   if (!recurse && has_locked_children(old, old_path.dentry))
+   if (!recurse && has_locked_children(old, old_path.dentry) &&
+   !capable(CAP_SYS_ADMIN))
goto out2;
 
if (recurse)
@@ -1895,7 +1901,7 @@ static int do_move_mount(struct path *path, const char *old_name)
if (!check_mnt(p) || !check_mnt(old))
goto out1;
 
-   if (old->mnt.mnt_flags & MNT_LOCKED)
+   if (!may_mount_lock(old))
goto out1;
 
err = -EINVAL;
@@ -2679,7 +2685,7 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root,
goto out4;
if (!check_mnt(root_mnt) || !check_mnt(new_mnt))
goto out4;
-   if (new_mnt->mnt.mnt_flags & MNT_LOCKED)
+   if (!may_mount_lock(new_mnt))
goto out4;
error = -ENOENT;
if (d_unlinked(new.dentry))
-- 
1.8.3.1



Re: [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted

2013-11-12 Thread Gao feng
On 11/09/2013 01:42 PM, Eric W. Biederman wrote:
> Gao feng  writes:
> 
>> On 11/02/2013 02:06 PM, Gao feng wrote:
>>> Hi Eric,
>>>
>>> On 08/28/2013 05:44 AM, Eric W. Biederman wrote:

 Rely on the fact that another flavor of the filesystem is already
 mounted and do not rely on state in the user namespace.

 Verify that the mounted filesystem is not covered in any significant
 way.  I would love to verify that the previously mounted filesystem
 has no mounts on top but there are at least the directories
 /proc/sys/fs/binfmt_misc and /sys/fs/cgroup/ that exist explicitly
 for other filesystems to mount on top of.

 Refactor the test into a function named fs_fully_visible and call that
 function from the mount routines of proc and sysfs.  This makes this
 test local to the filesystems involved and the results current of when
 the mounts take place, removing a weird threading of the user
 namespace, the mount namespace and the filesystems themselves.

 Signed-off-by: "Eric W. Biederman" 
 ---
  fs/namespace.c |   37 
 +
  fs/proc/root.c |7 +--
  fs/sysfs/mount.c   |3 ++-
  include/linux/fs.h |1 +
  include/linux/user_namespace.h |4 
  kernel/user.c  |2 --
  kernel/user_namespace.c|2 --
  7 files changed, 33 insertions(+), 23 deletions(-)

 diff --git a/fs/namespace.c b/fs/namespace.c
 index 64627f8..877e427 100644
 --- a/fs/namespace.c
 +++ b/fs/namespace.c
 @@ -2867,25 +2867,38 @@ bool current_chrooted(void)
return chrooted;
  }
  
 -void update_mnt_policy(struct user_namespace *userns)
 +bool fs_fully_visible(struct file_system_type *type)
  {
struct mnt_namespace *ns = current->nsproxy->mnt_ns;
struct mount *mnt;
 +  bool visible = false;
  
 -  down_read(&namespace_sem);
 +  if (unlikely(!ns))
 +  return false;
 +
 +  namespace_lock();
list_for_each_entry(mnt, &ns->list, mnt_list) {
 -  switch (mnt->mnt.mnt_sb->s_magic) {
 -  case SYSFS_MAGIC:
 -  userns->may_mount_sysfs = true;
 -  break;
 -  case PROC_SUPER_MAGIC:
 -  userns->may_mount_proc = true;
 -  break;
 +  struct mount *child;
 +  if (mnt->mnt.mnt_sb->s_type != type)
 +  continue;
 +
 +  /* This mount is not fully visible if there are any child mounts
 +   * that cover anything except for empty directories.
 +   */
 +  list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
 +  struct inode *inode = child->mnt_mountpoint->d_inode;
 +  if (!S_ISDIR(inode->i_mode))
 +  goto next;
 +  if (inode->i_nlink != 2)
 +  goto next;
>>>
>>>
>>> I hit a problem where the proc filesystem failed to mount in a user
>>> namespace. The problem is that the i_nlink of sysctl entries under the proc
>>> filesystem is not 2; it is always 1, even for a directory (see
>>> proc_sys_make_inode). And for btrfs, the i_nlink for an empty dir is 2 too.
>>> It seems to depend on the filesystem itself, not on the VFS. On my system
>>> binfmt_misc is mounted on /proc/sys/fs/binfmt_misc, and the i_nlink of this
>>> directory's inode is 1.
> 
> Yes. 1 is what filesystems that are too lazy to count the number of
> links to a directory return, and /proc/sys is currently such a
> filesystem.
> 
> Ordinarily nlink == 2 means a directory does not have any subdirectories.
> 
>>> btw, I don't quite understand what inode->i_nlink != 2 means here.
>>> Does it test whether the directory is empty? As far as I know, when we
>>> create a file (not a dir) under a dir, the i_nlink of that dir does not increase.
>>>
>>> And another question: it looks like if we don't have proc/sys fs mounted,
>>> then mounting proc/sys will fail?
>>>
>>
>> Any ideas? Or do we need to revert this patch?
> 
> The patch is mostly doing what it is supposed to be doing.
> 
> Now the code is slightly buggy.  inode->i_nlink will test to see if a
> directory has subdirectories but it won't test to see if a directory is
> empty.  Where did my brain go when I was writing that test?
> 
> Right now I would rather not have the empty directory exception than
> remove this code.
> 
> The test is a little trickier to write than it might otherwise be
> because /proc and /sys tend to be slightly imperfect filesystems.
> 
> I think the only way to really test that is to call readdir on the
> directory itself :(  I don't like that thought.
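
The gap between the two tests can be seen from userspace. The following is a minimal sketch, not kernel code: the helper names dir_nlink_says_no_subdirs() and dir_is_empty() are made up for illustration, and it uses stat(2) and readdir(3) in place of the in-kernel inode and VFS interfaces that fs_fully_visible() would have to use.

```c
#include <dirent.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: the link-count heuristic from the patch.
 * Returns 1 if st_nlink == 2, 0 if not, -1 on error.  On filesystems
 * that count directory links, 2 means "no subdirectories", not
 * "empty"; filesystems such as /proc/sys always report 1. */
int dir_nlink_says_no_subdirs(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
		return -1;
	return st.st_nlink == 2;
}

/* Hypothetical helper: the readdir-based test.  Returns 1 if the
 * directory holds nothing but "." and "..", 0 if any other entry
 * exists, -1 if the directory cannot be opened. */
int dir_is_empty(const char *path)
{
	DIR *d = opendir(path);
	struct dirent *de;
	int empty = 1;

	if (!d)
		return -1;
	while ((de = readdir(d)) != NULL) {
		if (strcmp(de->d_name, ".") && strcmp(de->d_name, "..")) {
			empty = 0;
			break;
		}
	}
	closedir(d);
	return empty;
}
```

A directory that contains only regular files passes the link-count check on filesystems that report nlink == 2 for it, while dir_is_empty() correctly reports it as non-empty.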
> 
> I don't know what I was thinking when I wrote that test but I definitely
> goofed up.  Grr!
> 
> I can certainly filter out any directory with 

[PATCH v5 17/17] ARM: at91: add new compatible strings for pmc driver

2013-11-12 Thread Boris BREZILLON
This patch adds new compatible strings for the PMC node to prepare the
transition to common clk.

These compatible strings come from the pmc driver in the clk subsystem and are
needed to keep new device trees compatible with the old at91 clks
(device trees using common clks will use the new compatible strings).

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 arch/arm/mach-at91/clock.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/mach-at91/clock.c b/arch/arm/mach-at91/clock.c
index 5f02aea..72b2579 100644
--- a/arch/arm/mach-at91/clock.c
+++ b/arch/arm/mach-at91/clock.c
@@ -884,6 +884,11 @@ static int __init at91_pmc_init(unsigned long main_clock)
 #if defined(CONFIG_OF)
 static struct of_device_id pmc_ids[] = {
{ .compatible = "atmel,at91rm9200-pmc" },
+   { .compatible = "atmel,at91sam9260-pmc" },
+   { .compatible = "atmel,at91sam9g45-pmc" },
+   { .compatible = "atmel,at91sam9n12-pmc" },
+   { .compatible = "atmel,at91sam9x5-pmc" },
+   { .compatible = "atmel,sama5d3-pmc" },
{ /*sentinel*/ }
 };
 
-- 
1.7.9.5



[PATCH v5 16/17] ARM: at91: move pit timer to common clk framework

2013-11-12 Thread Boris BREZILLON
Use device tree to get the source clock of the PIT (Periodic Interval Timer).
If the clock is not found in device tree (or dt is not enabled) we'll try to
get it using clk_lookup definitions.

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 arch/arm/mach-at91/at91sam926x_time.c |   14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-at91/at91sam926x_time.c b/arch/arm/mach-at91/at91sam926x_time.c
index bb39232..0f04ffe 100644
--- a/arch/arm/mach-at91/at91sam926x_time.c
+++ b/arch/arm/mach-at91/at91sam926x_time.c
@@ -39,6 +39,7 @@
 static u32 pit_cycle;  /* write-once */
 static u32 pit_cnt;    /* access only w/system irq blocked */
 static void __iomem *pit_base_addr __read_mostly;
+static struct clk *mck;
 
 static inline unsigned int pit_read(unsigned int reg_offset)
 {
@@ -195,10 +196,14 @@ static int __init of_at91sam926x_pit_init(void)
if (!pit_base_addr)
goto node_err;
 
+   mck = of_clk_get(np, 0);
+
/* Get the interrupts property */
ret = irq_of_parse_and_map(np, 0);
if (!ret) {
pr_crit("AT91: PIT: Unable to get IRQ from DT\n");
+   if (!IS_ERR(mck))
+   clk_put(mck);
goto ioremap_err;
}
at91sam926x_pit_irq.irq = ret;
@@ -230,6 +235,8 @@ void __init at91sam926x_pit_init(void)
unsigned bits;
int ret;
 
+   mck = ERR_PTR(-ENOENT);
+
/* For device tree enabled device: initialize here */
of_at91sam926x_pit_init();
 
@@ -237,7 +244,12 @@ void __init at91sam926x_pit_init(void)
 * Use our actual MCK to figure out how many MCK/16 ticks per
 * 1/HZ period (instead of a compile-time constant LATCH).
 */
-   pit_rate = clk_get_rate(clk_get(NULL, "mck")) / 16;
+   if (IS_ERR(mck))
+   mck = clk_get(NULL, "mck");
+
+   if (IS_ERR(mck))
+   panic("AT91: PIT: Unable to get mck clk\n");
+   pit_rate = clk_get_rate(mck) / 16;
pit_cycle = (pit_rate + HZ/2) / HZ;
WARN_ON(((pit_cycle - 1) & ~AT91_PIT_PIV) != 0);
 
-- 
1.7.9.5



[PATCH v5 15/17] dt: binding: add at91 clks dt bindings documentation

2013-11-12 Thread Boris BREZILLON
This patch adds new at91 clks dt bindings documentation.

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 .../devicetree/bindings/clock/at91-clock.txt   |  339 
 1 file changed, 339 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/clock/at91-clock.txt

diff --git a/Documentation/devicetree/bindings/clock/at91-clock.txt b/Documentation/devicetree/bindings/clock/at91-clock.txt
new file mode 100644
index 0000000..cd5e239
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/at91-clock.txt
@@ -0,0 +1,339 @@
+Device Tree Clock bindings for arch-at91
+
+This binding uses the common clock binding[1].
+
+[1] Documentation/devicetree/bindings/clock/clock-bindings.txt
+
+Required properties:
+- compatible : shall be one of the following:
+   "atmel,at91rm9200-pmc" or
+   "atmel,at91sam9g45-pmc" or
+   "atmel,at91sam9n12-pmc" or
+   "atmel,at91sam9x5-pmc" or
+   "atmel,sama5d3-pmc":
+   at91 PMC (Power Management Controller)
+   All at91 specific clocks (clocks defined below) must be child
+   node of the PMC node.
+
+   "atmel,at91rm9200-clk-main":
+   at91 main oscillator
+
+   "atmel,at91rm9200-clk-master" or
+   "atmel,at91sam9x5-clk-master":
+   at91 master clock
+
+   "atmel,at91sam9x5-clk-peripheral" or
+   "atmel,at91rm9200-clk-peripheral":
+   at91 peripheral clocks
+
+   "atmel,at91rm9200-clk-pll" or
+   "atmel,at91sam9g45-clk-pll" or
+   "atmel,at91sam9g20-clk-pllb" or
+   "atmel,sama5d3-clk-pll":
+   at91 pll clocks
+
+   "atmel,at91sam9x5-clk-plldiv":
+   at91 plla divisor
+
+   "atmel,at91rm9200-clk-programmable" or
+   "atmel,at91sam9g45-clk-programmable" or
+   "atmel,at91sam9x5-clk-programmable":
+   at91 programmable clocks
+
+   "atmel,at91sam9x5-clk-smd":
+   at91 SMD (Soft Modem) clock
+
+   "atmel,at91rm9200-clk-system":
+   at91 system clocks
+
+   "atmel,at91rm9200-clk-usb" or
+   "atmel,at91sam9x5-clk-usb" or
+   "atmel,at91sam9n12-clk-usb":
+   at91 usb clock
+
+   "atmel,at91sam9x5-clk-utmi":
+   at91 utmi clock
+
+Required properties for PMC node:
+- reg : defines the IO memory reserved for the PMC.
+- #size-cells : shall be 0 (reg is used to encode clk id).
+- #address-cells : shall be 1 (reg is used to encode clk id).
+- interrupts : shall be set to PMC interrupt line.
+- interrupt-controller : indicates that the PMC is an interrupt controller.
+- #interrupt-cells : must be set to 1. The first cell encodes the interrupt id,
+   and reflects the bit position in the PMC_ER/DR/SR registers.
+   You can use the dt macros defined in dt-bindings/clk/at91.h.
+   0 (AT91_PMC_MOSCS) -> main oscillator ready
+   1 (AT91_PMC_LOCKA) -> PLL A ready
+   2 (AT91_PMC_LOCKB) -> PLL B ready
+   3 (AT91_PMC_MCKRDY) -> master clock ready
+   6 (AT91_PMC_LOCKU) -> UTMI PLL clock ready
+   8 .. 15 (AT91_PMC_PCKRDY(id)) -> programmable clock ready
+   16 (AT91_PMC_MOSCSELS) -> main oscillator selected
+   17 (AT91_PMC_MOSCRCS) -> RC main oscillator stabilized
+   18 (AT91_PMC_CFDEV) -> clock failure detected
+
+For example:
+   pmc: pmc@fc00 {
+   compatible = "atmel,sama5d3-pmc";
+   interrupts = <1 4 7>;
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   #size-cells = <0>;
+   #address-cells = <1>;
+
+   /* put at91 clocks here */
+   };
+
+Required properties for main clock:
+- interrupt-parent : must reference the PMC node.
+- interrupts : shall be set to "<0>".
+- #clock-cells : from common clock binding; shall be set to 0.
+- clocks (optional if clock-frequency is provided) : shall be the slow clock
+   phandle. This clock is used to calculate the main clock rate if
+   "clock-frequency" is not provided.
+- clock-frequency : the main oscillator frequency. Prefer the use of
+   "clock-frequency" over automatic clock rate calculation.
+
+For example:
+   main: mainck {
+   compatible = "atmel,at91rm9200-clk-main";
+   interrupt-parent = <>;
+   interrupts = <0>;
+   #clock-cells = <0>;
+   clocks = <>;
+   clock-frequency = <18432000>;
+   };
+
+Required properties for master clock:
+- interrupt-parent : must reference the PMC node.
+- interrupts : shall be set to "<3>".
+- #clock-cells : from common clock binding; shall be set to 0.
+- clocks : shall be the master clock sources (see atmel datasheet) phandles.
+   e.g. "<>, <>, <>, <>".
+- atmel,clk-output-range : minimum and maximum clock frequency (two u32
+  fields).
+  e.g. output = <0 13300>; <=> 0 to 133MHz.
+- atmel,clk-divisors : master clock divisors table (four u32 

Re: [PATCH] mm: revert mremap pud_free anti-fix

2013-11-12 Thread Chen Gang
On 10/15/2013 07:46 PM, Chen Gang wrote:
> On 10/15/2013 06:34 PM, Hugh Dickins wrote:
>> > Revert 1ecfd533f4c5 ("mm/mremap.c: call pud_free() after fail calling
>> > pmd_alloc()").  The original code was correct: pud_alloc(), pmd_alloc(),
>> > pte_alloc_map() ensure that the pud, pmd, pt is already allocated, and
>> > seldom do they need to allocate; on failure, upper levels are freed if
>> > appropriate by the subsequent do_munmap().  Whereas 1ecfd533f4c5 did an
>> > unconditional pud_free() of a most-likely still-in-use pud: saved only
>> > by the near-impossibility of pmd_alloc() failing.
>> > 
> What you said above sounds reasonable to me,  but better to provide the
> information below:
> 
>  - pud_free() for pgd_alloc() in "arch/arm/mm/pgd.c".
> 

That one is correct: it is for 'new_pgd', which does not come from 'mm'.

>  - pud_free() for init_stub_pte() in "arch/um/kernel/skas/mmu.c".
> 

I think it needs improvement; I have sent a related patch for it.

>  - more details about do_munmap(), (e.g. do it need mm->page_table_lock)
>or more details about the demo "most-likely still-in-use pud ...".
> 

According to "Documentation/vm/locking", 'mm->page_table_lock' protects the
in-use vma list, so it is not needed once the related vmas have been detached
from that list.

The related work flow:

  do_munmap()->
detach_vmas_to_be_unmapped(); /* so not need mm->page_table_lock */
unmap_region() ->
  free_pgtables() ->
free_pgd_range() ->
  free_pud_range() ->
free_pmd_range() ->
  free_pte_range() ->
pmd_clear();
pte_free_tlb();
  pud_clear();
  pmd_free_tlb();
pgd_clear();
pud_free_tlb();


Thanks.

> 
> Hmm... I am not quite sure about the 3 things, and I will/should
> continue analysing/learning about them, but better to get your reply. :-)


-- 
Chen Gang


Re: Re: Re: [PATCH RFC 2/6] arm64: Kprobes with single stepping support

2013-11-12 Thread Sandeepa Prabhu
On 13 November 2013 12:25, Sandeepa Prabhu  wrote:
 I'm unsure about arm64's debug feature behavior, what does happen when
 it performs a single-step on sw-breakpoint?

> Sandeepa: I think you need to retry Masami's test on the arm64 model, since
> I'm fairly sure it won't work as expected without some additional code.

 OK, anyway, for testing same one, we need to port ftrace first. So the next
>>
>> Sorry for confusion, s/next/fallback is what I meant. Making a kprobe module
>> can be done without ftrace port.
>>
 plan is to make a kprobe module to put a probe (which just printk 
 something)
 on a specific function (e.g. vfs_symlink), and run perf record with
 hw-breakpoint as below

 $ perf record -e "mem:0xXX:k" ln -s /dev/null /tmp/foo

 Note that 0xXX is the address of vfs_symlink.

 After that, you can see the message in dmesg and also check the perf result
 with "sudo perf script --dump" (you can find a PERF_RECORD_SAMPLE entry if
 it works)
Hi Will, Masami,

I am not sure about 'perf' right now (my minimal rootfs doesn't have it), but
I tried to test hardware breakpoints using the sample modules in
"samples/hw_breakpoint/" on the arm64 upstream branch. I believe this uses
the same kernel API as perf.

1.  Placing a watchpoint (attr.bp_type = HW_BREAKPOINT_W |
HW_BREAKPOINT_R) on the vfs_symlink symbol: the watchpoint does not seem
to trigger at all.
2.  Placing a text breakpoint (modified sample module with attr.bp_type
= HW_BREAKPOINT_X) on vfs_symlink and running "ln -s /dev/null
/tmp/foo": this time the breakpoint hits, but the exception recurses
infinitely!

I have attached the kernel logs for reference. So I wanted to check whether
hw breakpoints/watchpoints are working on the upstream branch. Have they
been tested recently with the sample modules or perf/ptrace?

Thanks,
Sandeepa
Initializing cgroup subsys cpu
Linux version 3.12.0-rc4+ (sandeepa@linaro-workstation) (gcc version 4.7.3 20130328 (prerelease) (crosstool-NG linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) ) #24 SMP PREEMPT Wed Nov 13 12:04:03 IST 2013
CPU: AArch64 Processor [410fd0f0] revision 0
Machine: RTSM_VE_AEMv8A
bootconsole [earlycon0] enabled
PERCPU: Embedded 10 pages/cpu @ffc87ffa8000 s11776 r8192 d20992 u40960
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 1034240
Kernel command line: console=ttyAMA0 root=/dev/mmcblk0p2 earlyprintk=pl011,0x1c09 consolelog=9 rw
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
software IO TLB [mem 0x8f800-0x8fc00] (64MB) mapped at [ffc87800-ffc87bff]
Memory: 4058384K/4194304K available (3347K kernel code, 211K rwdata, 1164K rodata, 171K init, 154K bss, 135920K reserved)
Virtual kernel memory layout:
vmalloc : 0xff80 - 0xffbb   (245759 MB)
vmemmap : 0xffbc01c0 - 0xffbc1f80   (   476 MB)
modules : 0xffbffc00 - 0xffc0   (64 MB)
memory  : 0xffc0 - 0xffc88000   ( 34816 MB)
  .init : 0xffc0004e9000 - 0xffc000513e00   (   172 kB)
  .text : 0xffc8 - 0xffc0004e8cf4   (  4516 kB)
  .data : 0xffc000514000 - 0xffc000548e80   (   212 kB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Preemptible hierarchical RCU implementation.
	RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
NR_IRQS:64 nr_irqs:64 0
Architected cp15 timer(s) running at 100.00MHz (phys).
Console: colour dummy device 80x25
Calibrating delay loop (skipped), value calculated using timer frequency.. 200.00 BogoMIPS (lpj=100)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 256
hw perfevents: enabled with arm/armv8-pmuv3 PMU driver, 9 counters available
CPU1: Booted secondary processor
CPU2: Booted secondary processor
CPU3: Booted secondary processor
Brought up 4 CPUs
SMP: Total of 4 processors activated.
devtmpfs: initialized
atomic64 test passed
NET: Registered protocol family 16
of_amba_device_create(): amba_device_add() failed (-19) for /smb/motherboard/iofpga@3,/sysctl@02
vdso: 2 pages (1 code, 1 data) at base ffc000519000
hw-breakpoint: found 16 breakpoint and 16 watchpoint registers.
Serial: AMBA PL011 UART driver
1c09.uart: ttyAMA0 at MMIO 0x1c09 (irq = 37, base_baud = 0) is a PL011 rev2
console [ttyAMA0] enabled, bootconsole disabled
console [ttyAMA0] enabled, bootconsole disabled
1c0a.uart: ttyAMA1 at MMIO 0x1c0a (irq = 38, base_baud = 0) is a PL011 rev2
1c0b.uart: ttyAMA2 at MMIO 0x1c0b (irq = 39, base_baud = 0) is a PL011 rev2
1c0c.uart: ttyAMA3 at MMIO 0x1c0c (irq = 40, base_baud = 0) is a PL011 rev2
bio: create slab  at 0
SCSI subsystem initialized
Switched to clocksource arch_sys_counter
NET: Registered protocol family 2
TCP established hash table 

[PATCH v4 2/2] e2fsck: Correct ext4 dates generated by old kernels.

2013-11-12 Thread David Turner
This patch is against e2fsprogs.

---
Older kernels on 64-bit machines would incorrectly encode pre-1970
ext4 dates as post-2311 dates.  Detect and correct this (assuming the
current date is before 2311).

Signed-off-by: David Turner 
---
 e2fsck/pass1.c   | 37 +
 e2fsck/problem.c |  7 +++
 e2fsck/problem.h |  6 ++
 3 files changed, 50 insertions(+)

diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index ab23e42..cb72964 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -348,6 +348,23 @@ fix:
EXT2_INODE_SIZE(sb), "pass1");
 }
 
+#define EXT4_EPOCH_BITS 2
+#define EXT4_EPOCH_MASK ((1 << EXT4_EPOCH_BITS) - 1)
+
+static int large_inode_extra(__u32 xtime, __u32 extra) {
+   return (xtime & (1 << 31)) != 0 &&
+   (extra & EXT4_EPOCH_MASK) == EXT4_EPOCH_MASK;
+}
+
+#define LARGE_INODE_EXTRA(inode, xtime) \
+   large_inode_extra(inode->i_##xtime, \
+ inode->i_##xtime##_extra)
+
+/* When the date is earlier than 2311, we assume that atimes, ctimes,
+ * and mtimes greater than 2311 are actually pre-1970 dates mis-encoded.
+ */
+#define EXT4_EXTRA_NEGATIVE_DATE_CUTOFF 6 * (1UL << 32)
+
 static void check_inode_extra_space(e2fsck_t ctx, struct problem_context *pctx)
 {
struct ext2_super_block *sb = ctx->fs->super;
@@ -388,6 +405,26 @@ static void check_inode_extra_space(e2fsck_t ctx, struct problem_context *pctx)
/* it seems inode has an extended attribute(s) in body */
check_ea_in_inode(ctx, pctx);
}
+
+   /*
+* If the inode's extended atime (ctime, mtime) is stored in
+* the old, invalid format, the inode is corrupt.
+*/
+   if (sizeof(time_t) > 4 && ctx->now < EXT4_EXTRA_NEGATIVE_DATE_CUTOFF &&
+   (LARGE_INODE_EXTRA(inode, atime) ||
+    LARGE_INODE_EXTRA(inode, ctime) ||
+    LARGE_INODE_EXTRA(inode, mtime))) {
+
+   if (!fix_problem(ctx, PR_1_EA_TIME_OUT_OF_RANGE, pctx))
+   return;
+
+   inode->i_atime_extra &= ~EXT4_EPOCH_MASK;
+   inode->i_ctime_extra &= ~EXT4_EPOCH_MASK;
+   inode->i_mtime_extra &= ~EXT4_EPOCH_MASK;
+   e2fsck_write_inode_full(ctx, pctx->ino, pctx->inode,
+   EXT2_INODE_SIZE(sb), "pass1");
+   }
+
 }
 
 /*
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 897693a..51fa7c3 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -1018,6 +1018,13 @@ static struct e2fsck_problem problem_table[] = {
  N_("@i %i, end of extent exceeds allowed value\n\t(logical @b %c, physical @b %b, len %N)\n"),
  PROMPT_CLEAR, 0 },
 
+/* The extended a, c, or mtime on this inode is in the far future,
+   indicating that it was written with an older, buggy version of the
+   kernel on a 64-bit machine */
+   { PR_1_EA_TIME_OUT_OF_RANGE,
+ N_("Extended time on @i %i is in the far future.\n"
+   "Assume that it is in fact a pre-1970 date written by an older, buggy version of Linux?\n"),
+   PROMPT_FIX, 0 },
 
/* Pass 1b errors */
 
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index ae1ed26..a44f6dd 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -593,6 +593,12 @@ struct problem_context {
 #define PR_1_EXTENT_INDEX_START_INVALID0x01006D
 
 #define PR_1_EXTENT_END_OUT_OF_BOUNDS  0x01006E
+
+/* The extended a, c, or mtime on this inode is in the far future,
+   indicating that it was written with an older, buggy version of
+the kernel on a 64-bit machine */
+#define PR_1_EA_TIME_OUT_OF_RANGE  0x01006F
+
 /*
  * Pass 1b errors
  */
-- 
1.8.1.2





[PATCH v4 1/2] ext4: Fix handling of extended tv_sec (bug 23732)

2013-11-12 Thread David Turner
In ext4, the bottom two bits of {a,c,m}time_extra are used to extend
the {a,c,m}time fields, deferring the year 2038 problem to the year
2446.

When decoding these extended fields, for times whose bottom 32 bits
would represent a negative number, sign extension causes the 64-bit
extended timestamp to be negative as well, which is not what's
intended.  This patch corrects that issue, so that the only negative
{a,c,m}times are those between 1901 and 1970 (as per 32-bit signed
timestamps).

Some older kernels might have written pre-1970 dates with 1,1 in the
extra bits.  This patch treats those incorrectly-encoded dates as
pre-1970, instead of post-2311, until kernel 4.20 is released.
Hopefully by then e2fsck will have fixed up the bad data.
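
The corrected encoding can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the patch itself: the names encode_epoch() and decode_seconds() are hypothetical, the nanosecond bits are left out, and the legacy workaround for epoch bits 1,1 is not modelled.

```c
#include <stdint.h>

#define EPOCH_BITS 2
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)

/* Hypothetical helper: compute the two extra epoch bits, i.e. how many
 * multiples of 2^32 must be added to the sign-extended low 32 bits of
 * tv_sec to get the full 64-bit value back. */
uint32_t encode_epoch(int64_t tv_sec)
{
	int64_t low = (int32_t)(uint32_t)tv_sec;	/* sign-extended low 32 bits */

	return (uint32_t)((tv_sec - low) >> 32) & EPOCH_MASK;
}

/* Hypothetical helper: the inverse.  Sign-extend the 32-bit on-disk
 * seconds field first, then add the epoch, so that pre-1970 times
 * (epoch bits 0,0 with the msb set) stay negative. */
int64_t decode_seconds(uint32_t disk_sec, uint32_t epoch)
{
	return (int64_t)(int32_t)disk_sec +
	       ((int64_t)(epoch & EPOCH_MASK) << 32);
}
```

With this scheme a time of -1 (1969-12-31) is stored with epoch bits 0,0, while 0x80000000 (2038-01-19) is stored with epoch bits 0,1, because its low 32 bits sign-extend to a negative number and one multiple of 2^32 has to be added back.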

Signed-off-by: David Turner 
Reported-by: Mark Harris 
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23732
---
 fs/ext4/ext4.h | 61 +-
 1 file changed, 39 insertions(+), 22 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 18aa56b..7d5e019 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -713,38 +714,54 @@ struct move_extent {
  sizeof((ext4_inode)->field))  \
<= (EXT4_GOOD_OLD_INODE_SIZE +  \
(einode)->i_extra_isize))   \
+
 /*
- * We use the bottom 34 bits of the signed 64-bit time value, with
- * the top two of these bits in the bottom of extra.  This leads
- * to a slightly odd encoding, which works like this:
+ * We need an encoding that preserves the times for extra epoch "00":
  *
- * extra  msb of
- * epoch  32-bit
- * bits   time    decoded 64-bit tv_sec   valid time range
- * 0 0    0    0x000000000..0x07fffffff  1970-01-01..2038-01-19
- * 0 0    1    0x080000000..0x0ffffffff  2038-01-19..2106-02-07
- * 0 1    0    0x100000000..0x17fffffff  2106-02-07..2174-02-25
- * 0 1    1    0x180000000..0x1ffffffff  2174-02-25..2242-03-16
- * 1 0    0    0x200000000..0x27fffffff  2242-03-16..2310-04-04
- * 1 0    1    0x280000000..0x2ffffffff  2310-04-04..2378-04-22
- * 1 1    0    0x300000000..0x37fffffff  2378-04-22..2446-05-10
-
 * 1 1    1    -0x80000000..-0x00000001  1901-12-13..1969-12-31
+ * extra  msb of                         adjust for signed
+ * epoch  32-bit                         32-bit tv_sec to
+ * bits   time    decoded 64-bit tv_sec  64-bit tv_sec      valid time range
+ * 0 0    1    -0x80000000..-0x00000001  0x000000000 1901-12-13..1969-12-31
+ * 0 0    0    0x000000000..0x07fffffff  0x000000000 1970-01-01..2038-01-19
+ * 0 1    1    0x080000000..0x0ffffffff  0x100000000 2038-01-19..2106-02-07
+ * 0 1    0    0x100000000..0x17fffffff  0x100000000 2106-02-07..2174-02-25
+ * 1 0    1    0x180000000..0x1ffffffff  0x200000000 2174-02-25..2242-03-16
+ * 1 0    0    0x200000000..0x27fffffff  0x200000000 2242-03-16..2310-04-04
+ * 1 1    1    0x280000000..0x2ffffffff  0x300000000 2310-04-04..2378-04-22
+ * 1 1    0    0x300000000..0x37fffffff  0x300000000 2378-04-22..2446-05-10
+ *
+ * Note that previous versions of the kernel on 64-bit systems would
+ * incorrectly use extra epoch bits 1,1 for dates between 1901 and
+ * 1970.  e2fsck will correct this, assuming that it is run on the
+ * affected filesystem before 2311.
  */
 
 static inline __le32 ext4_encode_extra_time(struct timespec *time)
 {
-   return cpu_to_le32((sizeof(time->tv_sec) > 4 ?
-  (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
-  ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK));
+   u32 extra = sizeof(time->tv_sec) > 4 ?
+   ((time->tv_sec - (s32)time->tv_sec) >> 32) & EXT4_EPOCH_MASK : 0;
+   return cpu_to_le32(extra | (time->tv_nsec << EXT4_EPOCH_BITS));
 }
 
 static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
 {
-   if (sizeof(time->tv_sec) > 4)
-  time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK)
-  << 32;
-   time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
+   if (unlikely(sizeof(time->tv_sec) > 4 &&
+   (extra & cpu_to_le32(EXT4_EPOCH_MASK)))) {
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,20,0)
+   /* Handle legacy encoding of pre-1970 dates with epoch
+* bits 1,1.  We assume that by kernel version 4.20,
+* everyone will have run fsck over the affected
+* filesystems to correct the problem.
+*/
+   u64 extra_bits = le32_to_cpu(extra) & EXT4_EPOCH_MASK;
+   if (extra_bits == 3)
+   extra_bits = 0;
+   time->tv_sec += extra_bits << 32;
+#else
+   time->tv_sec += (u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;
+#endif
+   }
+   

Re: [PATCH] perf trace: Simplify '--summary' output

2013-11-12 Thread Pekka Enberg

On 11/12/13 11:40 PM, Ingo Molnar wrote:

So if you prefer unit-less lines that's defensible, perhaps output the
unit somewhere else:

 syscall        calls      min      avg      max   stddev
                        (msec)   (msec)   (msec)      (%)
 ------------ ------- -------- -------- -------- --------
 sendmsg            2    0.002    0.005    0.008    55.00
 recvmsg            2    0.002    0.003    0.005    44.00
 epoll_wait         1    0.000    0.000    0.000     0.00

or so?


Looks good.  I'll make a patch later today unless someone else beats me 
to it.


Pekka


Re: [PATCH] ACPI / driver core: Store a device pointer in struct acpi_dev_node

2013-11-12 Thread Aaron Lu
On 11/11/2013 09:45 PM, Rafael J. Wysocki wrote:
> On Monday, November 11, 2013 09:21:40 AM Lan Tianyu wrote:
>> On 2013年11月10日 08:58, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki 
>>>
>>> Modify struct acpi_dev_node to contain a pointer to struct device
>>> embedded in the struct acpi_device associated with the given device
>>> object (that is, its ACPI companion device) instead of an ACPI handle
>>> corresponding to that struct acpi_device.  Introduce two new macros
>>> for manipulating that pointer in a CONFIG_ACPI-safe way,
>>> ACPI_COMPANION() and ACPI_COMPANION_SET(), and rework the
>>> ACPI_HANDLE() macro to take the above changes into account.
>>> Drop the ACPI_HANDLE_SET() macro entirely and rework its users to
>>> use ACPI_COMPANION_SET() instead.  For some of them who used to
>>> pass the result of acpi_get_child() directly to ACPI_HANDLE_SET()
>>> introduce a helper routine acpi_preset_companion() doing an
>>> equivalent thing.
>>>
>>> The rationale for using a struct device pointer instead of a
>>> struct acpi_device one as the member of struct acpi_dev_node is
>>> that it allows device.h to avoid including linux/acpi.h which would
>>> introduce quite a bit of compilation overhead for stuff that doesn't
>>> care about ACPI.
>>> In turn, moving the macros to linux/acpi.h forces
>>> the stuff that does care about ACPI to include that file as
>>> appropriate anyway.
>>
>> How about declaring "struct acpi_device" in the device.h? This can help
>> to use struct acpi_device without including linux/acpi.h.
>>
>> struct iommu_ops and struct iommu_group have been used by the same way
>> in the device.h.
> 
> Yes, they are.  Well, that appears to work too.
> 
> Updated patch is appended.  It also contains some fixes for problems reported
> by the auto build system and it's been tested on x86-64 now, so it should be
> reasonably close to final.
> 
> Thanks,
> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki 
> Subject: ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node
> 
> Modify struct acpi_dev_node to contain a pointer to struct acpi_device
> associated with the given device object (that is, its ACPI companion
> device) instead of an ACPI handle corresponding to it.  Introduce two
> new macros for manipulating that pointer in a CONFIG_ACPI-safe way,
> ACPI_COMPANION() and ACPI_COMPANION_SET(), and rework the
> ACPI_HANDLE() macro to take the above changes into account.
> Drop the ACPI_HANDLE_SET() macro entirely and rework its users to
> use ACPI_COMPANION_SET() instead.  For some of them who used to
> pass the result of acpi_get_child() directly to ACPI_HANDLE_SET()
> introduce a helper routine acpi_preset_companion() doing an
> equivalent thing.
> 
> The main motivation for doing this is that there are things
> represented by struct acpi_device objects that don't have valid
> ACPI handles (so called fixed ACPI hardware features, such as
> power and sleep buttons) and we would like to create platform
> device objects for them and "glue" them to their ACPI companions
> in the usual way (which currently is impossible due to the
> lack of valid ACPI handles).  However, there are more reasons
> why it may be useful.
> 
> First, struct acpi_device pointers allow for much better type checking
> than void pointers which are ACPI handles, so it should be more
> difficult to write buggy code using modified struct acpi_dev_node
> and the new macros.  Second, the change should help to reduce (over
> time) the number of places in which the result of ACPI_HANDLE() is
> passed to acpi_bus_get_device() in order to obtain a pointer to the
> struct acpi_device associated with the given "physical" device,
> because now that pointer is returned by ACPI_COMPANION() directly.
> Finally, the change should make it easier to write generic code that
> will build both for CONFIG_ACPI set and unset without adding explicit
> compiler directives to it.
> 
> Signed-off-by: Rafael J. Wysocki 

Reviewed-by: Aaron Lu  for ATA and SDIO part.

Thanks,
Aaron
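
To illustrate the idea in the changelog above, here is a small userspace mock of a device carrying a typed companion pointer instead of an opaque handle. It is a sketch under assumptions, not the patch itself: the struct layouts are simplified stand-ins, MOCK_CONFIG_ACPI is a hypothetical switch modelling CONFIG_ACPI, and the macro bodies are only one plausible way to get the "CONFIG_ACPI-safe" behaviour the description mentions.

```c
#include <stddef.h>

#define MOCK_CONFIG_ACPI 1	/* hypothetical: set to 0 to model CONFIG_ACPI unset */

/* Simplified stand-ins for the kernel structures named in the changelog. */
struct acpi_device {
	void *handle;		/* may be NULL for fixed hardware features */
};

struct acpi_dev_node {
	struct acpi_device *companion;	/* kept in both configs for simplicity */
};

struct device {
	struct acpi_dev_node acpi_node;
};

#if MOCK_CONFIG_ACPI
#define ACPI_COMPANION(dev)		((dev)->acpi_node.companion)
#define ACPI_COMPANION_SET(dev, adev)	(ACPI_COMPANION(dev) = (adev))
#else
#define ACPI_COMPANION(dev)		((struct acpi_device *)NULL)
#define ACPI_COMPANION_SET(dev, adev)	do { } while (0)
#endif

/* The handle is derived from the companion, so a missing companion
 * simply yields a NULL handle. */
static void *acpi_device_handle(struct acpi_device *adev)
{
	return adev ? adev->handle : NULL;
}

#define ACPI_HANDLE(dev)	acpi_device_handle(ACPI_COMPANION(dev))
```

With a typed pointer, passing the wrong kind of object to ACPI_COMPANION_SET() draws a compiler diagnostic, which a void-pointer handle would not.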


[PATCH 2/2] regulator: lp3972: Convert to devm_regulator_register

2013-11-12 Thread Axel Lin
Both num_regulators and **rdev are no longer required after this conversion,
thus remove them from struct lp3972.
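
As background on why the bookkeeping can go away: with device-managed registration the device itself remembers what to undo. The userspace mock below sketches that pattern only; it is not the kernel devres implementation, and every name in it (mock_devm_add, mock_devm_release_all, note_release) is hypothetical.

```c
#include <stdlib.h>

/* One managed resource: a release callback plus its argument. */
struct mock_devres {
	void (*release)(void *data);
	void *data;
	struct mock_devres *next;
};

/* Stand-in for struct device: it owns the list of managed resources. */
struct mock_device {
	struct mock_devres *resources;
};

/* Attach a resource to the device; the caller keeps no pointer to it. */
static int mock_devm_add(struct mock_device *dev,
			 void (*release)(void *data), void *data)
{
	struct mock_devres *dr = malloc(sizeof(*dr));

	if (!dr)
		return -1;
	dr->release = release;
	dr->data = data;
	dr->next = dev->resources;	/* push: newest resource first */
	dev->resources = dr;
	return 0;
}

/* Called once on probe failure or device removal: releases everything
 * in reverse order of registration. */
static void mock_devm_release_all(struct mock_device *dev)
{
	while (dev->resources) {
		struct mock_devres *dr = dev->resources;

		dev->resources = dr->next;
		dr->release(dr->data);
		free(dr);
	}
}

/* Demo release callback: records the order in which ids are released. */
static int released[8];
static int n_released;

static void note_release(void *data)
{
	released[n_released++] = *(int *)data;
}
```

Because teardown is driven from the device, the driver needs neither an rdev array nor an error-unwind loop nor a remove callback, which is what the diffstat reflects.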

Signed-off-by: Axel Lin 
---
 drivers/regulator/lp3972.c | 41 ++---
 1 file changed, 6 insertions(+), 35 deletions(-)

diff --git a/drivers/regulator/lp3972.c b/drivers/regulator/lp3972.c
index 093e6f4..aea485a 100644
--- a/drivers/regulator/lp3972.c
+++ b/drivers/regulator/lp3972.c
@@ -22,8 +22,6 @@ struct lp3972 {
struct device *dev;
struct mutex io_lock;
struct i2c_client *i2c;
-   int num_regulators;
-   struct regulator_dev **rdev;
 };
 
 /* LP3972 Control Registers */
@@ -478,41 +476,27 @@ static int setup_regulators(struct lp3972 *lp3972,
 {
int i, err;
 
-   lp3972->num_regulators = pdata->num_regulators;
-   lp3972->rdev = kcalloc(pdata->num_regulators,
-   sizeof(struct regulator_dev *), GFP_KERNEL);
-   if (!lp3972->rdev) {
-   err = -ENOMEM;
-   goto err_nomem;
-   }
-
/* Instantiate the regulators */
for (i = 0; i < pdata->num_regulators; i++) {
struct lp3972_regulator_subdev *reg = >regulators[i];
struct regulator_config config = { };
+   struct regulator_dev *rdev;
 
config.dev = lp3972->dev;
config.init_data = reg->initdata;
config.driver_data = lp3972;
 
-   lp3972->rdev[i] = regulator_register([reg->id],
-);
-   if (IS_ERR(lp3972->rdev[i])) {
-   err = PTR_ERR(lp3972->rdev[i]);
+   rdev = devm_regulator_register(lp3972->dev,
+  [reg->id], );
+   if (IS_ERR(rdev)) {
+   err = PTR_ERR(rdev);
dev_err(lp3972->dev, "regulator init failed: %d\n",
err);
-   goto error;
+   return err;
}
}
 
return 0;
-error:
-   while (--i >= 0)
-   regulator_unregister(lp3972->rdev[i]);
-   kfree(lp3972->rdev);
-   lp3972->rdev = NULL;
-err_nomem:
-   return err;
 }
 
 static int lp3972_i2c_probe(struct i2c_client *i2c,
@@ -557,18 +541,6 @@ static int lp3972_i2c_probe(struct i2c_client *i2c,
return 0;
 }
 
-static int lp3972_i2c_remove(struct i2c_client *i2c)
-{
-   struct lp3972 *lp3972 = i2c_get_clientdata(i2c);
-   int i;
-
-   for (i = 0; i < lp3972->num_regulators; i++)
-   regulator_unregister(lp3972->rdev[i]);
-   kfree(lp3972->rdev);
-
-   return 0;
-}
-
 static const struct i2c_device_id lp3972_i2c_id[] = {
{ "lp3972", 0 },
{ }
@@ -581,7 +553,6 @@ static struct i2c_driver lp3972_i2c_driver = {
.owner = THIS_MODULE,
},
.probe= lp3972_i2c_probe,
-   .remove   = lp3972_i2c_remove,
.id_table = lp3972_i2c_id,
 };
 
-- 
1.8.1.2





[PATCH 1/2] regulator: lp3971: Convert to devm_regulator_register

2013-11-12 Thread Axel Lin
Both num_regulators and **rdev are no longer required after this conversion,
thus remove them from struct lp3971.

Signed-off-by: Axel Lin 
---
 drivers/regulator/lp3971.c | 43 ++-
 1 file changed, 6 insertions(+), 37 deletions(-)

diff --git a/drivers/regulator/lp3971.c b/drivers/regulator/lp3971.c
index 947c05f..3b1102b 100644
--- a/drivers/regulator/lp3971.c
+++ b/drivers/regulator/lp3971.c
@@ -25,8 +25,6 @@ struct lp3971 {
struct device *dev;
struct mutex io_lock;
struct i2c_client *i2c;
-   int num_regulators;
-   struct regulator_dev **rdev;
 };
 
 static u8 lp3971_reg_read(struct lp3971 *lp3971, u8 reg);
@@ -383,42 +381,27 @@ static int setup_regulators(struct lp3971 *lp3971,
 {
int i, err;
 
-   lp3971->num_regulators = pdata->num_regulators;
-   lp3971->rdev = kcalloc(pdata->num_regulators,
-   sizeof(struct regulator_dev *), GFP_KERNEL);
-   if (!lp3971->rdev) {
-   err = -ENOMEM;
-   goto err_nomem;
-   }
-
/* Instantiate the regulators */
for (i = 0; i < pdata->num_regulators; i++) {
struct regulator_config config = { };
struct lp3971_regulator_subdev *reg = >regulators[i];
+   struct regulator_dev *rdev;
 
config.dev = lp3971->dev;
config.init_data = reg->initdata;
config.driver_data = lp3971;
 
-   lp3971->rdev[i] = regulator_register([reg->id],
-);
-   if (IS_ERR(lp3971->rdev[i])) {
-   err = PTR_ERR(lp3971->rdev[i]);
+   rdev = devm_regulator_register(lp3971->dev,
+  [reg->id], );
+   if (IS_ERR(rdev)) {
+   err = PTR_ERR(rdev);
dev_err(lp3971->dev, "regulator init failed: %d\n",
err);
-   goto error;
+   return err;
}
}
 
return 0;
-
-error:
-   while (--i >= 0)
-   regulator_unregister(lp3971->rdev[i]);
-   kfree(lp3971->rdev);
-   lp3971->rdev = NULL;
-err_nomem:
-   return err;
 }
 
 static int lp3971_i2c_probe(struct i2c_client *i2c,
@@ -460,19 +443,6 @@ static int lp3971_i2c_probe(struct i2c_client *i2c,
return 0;
 }
 
-static int lp3971_i2c_remove(struct i2c_client *i2c)
-{
-   struct lp3971 *lp3971 = i2c_get_clientdata(i2c);
-   int i;
-
-   for (i = 0; i < lp3971->num_regulators; i++)
-   regulator_unregister(lp3971->rdev[i]);
-
-   kfree(lp3971->rdev);
-
-   return 0;
-}
-
 static const struct i2c_device_id lp3971_i2c_id[] = {
{ "lp3971", 0 },
{ }
@@ -485,7 +455,6 @@ static struct i2c_driver lp3971_i2c_driver = {
.owner = THIS_MODULE,
},
.probe= lp3971_i2c_probe,
-   .remove   = lp3971_i2c_remove,
.id_table = lp3971_i2c_id,
 };
 
-- 
1.8.1.2





Re: [PATCH] scsi: avoid use of reclaimed reference

2013-11-12 Thread James Bottomley
On Tue, 2013-11-12 at 18:09 -0800, David Decotigny wrote:
> I was considering the following scenario wherein the "if
> (scsi_device_created(sdev))" test at the end would test garbage at
> best (or unmapped data):

Well, no, the counting isn't right:

>if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {  // not found
>sdev = scsi_alloc_sdev(starget, 0, NULL);// -> ref cnt = 2

1

>if (scsi_device_get(sdev)) {  // -> ref cnt = 3

2

>}
>...
>}
>   ...
>res = scsi_probe_and_add_lun(starget,// -> ref cnt = 1

No idea what you think here, where were the other puts?  If starget,lun
is sdev, then the count goes to 3 here otherwise it stays at 2 if it
isn't reported in the scan.
  ...
>scsi_device_put(sdev);  // -> reclaimed

No, it goes to either 2 or 1 here.  If it goes to 1 it's because the
sdev was never probed and thus it remains in the created state.

>if (scsi_device_created(sdev))  // test on garbage or unmapped data (#PF)

Which means this test passes and it gets garbage collected by
__scsi_remove_device().  Otherwise we exit with refcount 2.

James




Re: [patch] pm2301-charger: remove unneeded NULL checks

2013-11-12 Thread Anton Vorontsov
On Thu, Nov 07, 2013 at 11:06:17AM +0300, Dan Carpenter wrote:
> If "pm2" were NULL we would oops printing the error message.
> Fortunately, that's not possible so I have removed the NULL checks.
> 
> Signed-off-by: Dan Carpenter 

Applied, thanks!


[PATCH v9] PPC: POWERNV: move iommu_add_device earlier

2013-11-12 Thread Alexey Kardashevskiy
The current implementation of IOMMU on sPAPR does not use iommu_ops
and therefore does not call IOMMU API's bus_set_iommu() which
1) sets iommu_ops for a bus
2) registers a bus notifier
Instead, PCI devices are added to IOMMU groups from
subsys_initcall_sync(tce_iommu_init) which does basically the same
thing without using iommu_ops callbacks.

However, the Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
implements iommu_ops and when tce_iommu_init is called, every PCI device
is already added to some group so there is a conflict.

This patch does 2 things:
1. removes the loop in which PCI devices were added to groups and
adds explicit iommu_add_device() calls to add devices as soon as they get
the iommu_table pointer assigned to them.
2. moves a bus notifier to powernv code in order to avoid conflict with
the notifier from Freescale driver.

iommu_add_device() and iommu_del_device() are public now.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v9:
* removed "KVM" from the subject as it is not really a KVM patch so
PPC maintainer (hi Ben!) can review/include it into his tree

v8:
* added the check for iommu_group!=NULL before removing device from a group
as suggested by Wei Yang 

v2:
* added a helper - set_iommu_table_base_and_group - which does
set_iommu_table_base() and iommu_add_device()
---
 arch/powerpc/include/asm/iommu.h|  9 +++
 arch/powerpc/kernel/iommu.c | 41 +++--
 arch/powerpc/platforms/powernv/pci-ioda.c   |  8 +++---
 arch/powerpc/platforms/powernv/pci-p5ioc2.c |  2 +-
 arch/powerpc/platforms/powernv/pci.c| 33 ++-
 arch/powerpc/platforms/pseries/iommu.c  |  8 +++---
 6 files changed, 55 insertions(+), 46 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index c34656a..19ad77f 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -103,6 +103,15 @@ extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
int nid);
 extern void iommu_register_group(struct iommu_table *tbl,
 int pci_domain_number, unsigned long pe_num);
+extern int iommu_add_device(struct device *dev);
+extern void iommu_del_device(struct device *dev);
+
+static inline void set_iommu_table_base_and_group(struct device *dev,
+ void *base)
+{
+   set_iommu_table_base(dev, base);
+   iommu_add_device(dev);
+}
 
 extern int iommu_map_sg(struct device *dev, struct iommu_table *tbl,
struct scatterlist *sglist, int nelems,
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 572bb5b..ecbf468 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1105,7 +1105,7 @@ void iommu_release_ownership(struct iommu_table *tbl)
 }
 EXPORT_SYMBOL_GPL(iommu_release_ownership);
 
-static int iommu_add_device(struct device *dev)
+int iommu_add_device(struct device *dev)
 {
struct iommu_table *tbl;
int ret = 0;
@@ -1134,46 +1134,13 @@ static int iommu_add_device(struct device *dev)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(iommu_add_device);
 
-static void iommu_del_device(struct device *dev)
+void iommu_del_device(struct device *dev)
 {
iommu_group_remove_device(dev);
 }
-
-static int iommu_bus_notifier(struct notifier_block *nb,
- unsigned long action, void *data)
-{
-   struct device *dev = data;
-
-   switch (action) {
-   case BUS_NOTIFY_ADD_DEVICE:
-   return iommu_add_device(dev);
-   case BUS_NOTIFY_DEL_DEVICE:
-   iommu_del_device(dev);
-   return 0;
-   default:
-   return 0;
-   }
-}
-
-static struct notifier_block tce_iommu_bus_nb = {
-   .notifier_call = iommu_bus_notifier,
-};
-
-static int __init tce_iommu_init(void)
-{
-   struct pci_dev *pdev = NULL;
-
-   BUILD_BUG_ON(PAGE_SIZE < IOMMU_PAGE_SIZE);
-
-   for_each_pci_dev(pdev)
-   iommu_add_device(&pdev->dev);
-
-   bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
-   return 0;
-}
-
-subsys_initcall_sync(tce_iommu_init);
+EXPORT_SYMBOL_GPL(iommu_del_device);
 
 #else
 
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 084cdfa..614356c 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -460,7 +460,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
return;
 
pe = >ioda.pe_array[pdn->pe_number];
-   set_iommu_table_base(>dev, >tce32_table);
+   set_iommu_table_base_and_group(>dev, >tce32_table);
 }
 
 static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
@@ -468,7 +468,7 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, 
struct 

[PATCH 1/2] dmaengine: ipu: fix warnings from 64-bit dma_addr_t printouts

2013-11-12 Thread Olof Johansson
This resolves a number of warnings such as the below when building with
64-bit dma_addr_t on arm:

drivers/dma/ipu/ipu_idmac.c:1235:2: warning: format '%x' expects argument
  of type 'unsigned int', but argument 5 has type 'dma_addr_t' [-Wformat=]

..by upcasting to u64 and using %llx.
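
As a standalone illustration of the approach (not taken from the driver), the snippet below widens an address-sized value before formatting it, so a single format string is correct whether the type is 32 or 64 bits wide. The dma_addr_t typedef is a userspace stand-in assumed to be 64-bit here, and format_dma() is a hypothetical helper.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's dma_addr_t, which is 32 or 64 bits wide
 * depending on configuration; assumed 64-bit for this sketch. */
typedef uint64_t dma_addr_t;

/* Hypothetical helper: always cast to the widest type the format
 * expects, so the argument matches %llx on every configuration.
 * Note that the '#' flag only adds the 0x prefix to nonzero values. */
static int format_dma(char *buf, size_t len, dma_addr_t addr)
{
	return snprintf(buf, len, "dma %#llx", (unsigned long long)addr);
}
```

Passing the value uncast with a plain %x is what triggers the -Wformat warning quoted above once the type grows to 64 bits.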

Signed-off-by: Olof Johansson 
---
 drivers/dma/ipu/ipu_idmac.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/ipu/ipu_idmac.c b/drivers/dma/ipu/ipu_idmac.c
index cb9c0bc..128ca14 100644
--- a/drivers/dma/ipu/ipu_idmac.c
+++ b/drivers/dma/ipu/ipu_idmac.c
@@ -1232,8 +1232,10 @@ static irqreturn_t idmac_interrupt(int irq, void *dev_id)
desc = list_entry(ichan->queue.next, struct idmac_tx_desc, list);
descnew = desc;
 
-   dev_dbg(dev, "IDMAC irq %d, dma 0x%08x, next dma 0x%08x, current %d, curbuf 0x%08x\n",
-   irq, sg_dma_address(*sg), sgnext ? sg_dma_address(sgnext) : 0, ichan->active_buffer, curbuf);
+   dev_dbg(dev, "IDMAC irq %d, dma %#llx, next dma %#llx, current %d, curbuf %#x\n",
+   irq, (u64)sg_dma_address(*sg),
+   sgnext ? (u64)sg_dma_address(sgnext) : 0,
+   ichan->active_buffer, curbuf);
 
/* Find the descriptor of sgnext */
sgnew = idmac_sg_next(ichan, , *sg);
-- 
1.7.10.4



[PATCH 2/2] dma: imx-sdma: Fix warnings for LPAE builds

2013-11-12 Thread Olof Johansson
This resolves a number of warnings such as the below when building with
64-bit dma_addr_t on arm:

drivers/dma/imx-sdma.c:1092:3: warning: format '%x' expects argument of
  type 'unsigned int', but argument 6 has type 'dma_addr_t' [-Wformat=]

..by upcasting to u64 and using %llx.

Signed-off-by: Olof Johansson 
---
 drivers/dma/imx-sdma.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index e43c040..4088ff8 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -1089,8 +1089,8 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
param &= ~BD_CONT;
}
 
-   dev_dbg(sdma->dev, "entry %d: count: %d dma: 0x%08x %s%s\n",
-   i, count, sg->dma_address,
+   dev_dbg(sdma->dev, "entry %d: count: %d dma: %#llx %s%s\n",
+   i, count, (u64)sg->dma_address,
param & BD_WRAP ? "wrap" : "",
param & BD_INTR ? " intr" : "");
 
@@ -1163,8 +1163,8 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
if (i + 1 == num_periods)
param |= BD_WRAP;
 
-   dev_dbg(sdma->dev, "entry %d: count: %d dma: 0x%08x %s%s\n",
-   i, period_len, dma_addr,
+   dev_dbg(sdma->dev, "entry %d: count: %d dma: %#llx %s%s\n",
+   i, period_len, (u64)dma_addr,
param & BD_WRAP ? "wrap" : "",
param & BD_INTR ? " intr" : "");
 
-- 
1.7.10.4



Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread Greg Price
On Wed, Nov 13, 2013 at 01:08:07AM -0500, Theodore Ts'o wrote:
> On Tue, Nov 12, 2013 at 11:23:03PM -0500, Greg Price wrote:
> > That's a good idea.  I've worried about the same thing, but hadn't
> > thought of that solution.
> 
> I think the key is that we set a default of requiring 128 bits, or 5
> minutes, with boot-line options to change the defaults.  BTW, with the
> changes that are scheduled for 3.13, this shouldn't be a problem on
> most desktops.  From my T430s laptop: [...]
> 
> So even without adding device attach times (which is on the todo list)
> the /dev/urandom pool is getting an estimated 128 bits of entropy
> almost two seconds *before* the root file system is remounted
> read/write.

Great!


> This is why I've been working on improving the random driver's efficiency
> in getting the urandom pool as soon as possible, as higher priority
> than adding blocking-on-boot for /dev/urandom.

Makes sense.  Blocking on boot is only sustainable anyway if it rarely
lasts past early boot.

Greg


[PATCH v5 14/17] clk: at91: add PMC smd clock

2013-11-12 Thread Boris BREZILLON
This patch adds at91 smd (Soft Modem) clock implementation using common clk
framework.

Not used by any driver right now.
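
For readers who want to try the divider rounding from the round_rate callback in this patch with concrete numbers, here is a userspace restatement. It is a sketch only: the function name is hypothetical, the clk framework plumbing is dropped, MAX_DIV mirrors SMD_MAX_DIV from the patch, and a nonzero requested rate is assumed.

```c
/* Mirrors SMD_MAX_DIV from the patch: the divider field holds 0..15,
 * i.e. the parent rate is divided by 1..16. */
#define MAX_DIV 0xf

/* Hypothetical standalone version of the rounding logic.
 * Assumes rate > 0; the caller must guarantee that. */
static unsigned long smd_round_rate_sketch(unsigned long rate,
					   unsigned long parent_rate)
{
	unsigned long div, bestrate, tmp;

	/* Cannot go faster than the parent. */
	if (rate >= parent_rate)
		return parent_rate;

	/* Too slow for the divider range: clamp to the largest divisor. */
	div = parent_rate / rate;
	if (div > MAX_DIV)
		return parent_rate / (MAX_DIV + 1);

	/* Pick whichever neighbouring divisor lands closer to the request. */
	bestrate = parent_rate / div;
	tmp = parent_rate / (div + 1);
	if (bestrate - rate > rate - tmp)
		bestrate = tmp;

	return bestrate;
}
```

For example, with a 480 MHz parent a 100 MHz request falls between 120 MHz (divide by 4) and 96 MHz (divide by 5), and the closer one, 96 MHz, is returned.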

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 arch/arm/mach-at91/Kconfig |5 ++
 drivers/clk/at91/Makefile  |1 +
 drivers/clk/at91/clk-smd.c |  173 
 drivers/clk/at91/pmc.c |7 ++
 drivers/clk/at91/pmc.h |5 ++
 5 files changed, 191 insertions(+)
 create mode 100644 drivers/clk/at91/clk-smd.c

diff --git a/arch/arm/mach-at91/Kconfig b/arch/arm/mach-at91/Kconfig
index b76dc4c..97033f7 100644
--- a/arch/arm/mach-at91/Kconfig
+++ b/arch/arm/mach-at91/Kconfig
@@ -39,6 +39,9 @@ config AT91_SAM9G45_RESET
 config AT91_SAM9_TIME
bool
 
+config HAVE_AT91_SMD
+   bool
+
 config SOC_AT91SAM9
bool
select AT91_SAM9_TIME
@@ -85,6 +88,7 @@ config SOC_SAMA5D3
select HAVE_AT91_DBGU1
select AT91_USE_OLD_CLK
select HAVE_AT91_UTMI
+   select HAVE_AT91_SMD
select HAVE_AT91_USB_CLK
help
  Select this if you are using one of Atmel's SAMA5D3 family SoC.
@@ -157,6 +161,7 @@ config SOC_AT91SAM9X5
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
select HAVE_AT91_UTMI
+   select HAVE_AT91_SMD
select HAVE_AT91_USB_CLK
help
  Select this if you are using one of Atmel's AT91SAM9x5 family SoC.
diff --git a/drivers/clk/at91/Makefile b/drivers/clk/at91/Makefile
index 61db058..0e92b71 100644
--- a/drivers/clk/at91/Makefile
+++ b/drivers/clk/at91/Makefile
@@ -9,3 +9,4 @@ obj-y += clk-system.o clk-peripheral.o
 obj-$(CONFIG_AT91_PROGRAMMABLE_CLOCKS) += clk-programmable.o
 obj-$(CONFIG_HAVE_AT91_UTMI)   += clk-utmi.o
 obj-$(CONFIG_HAVE_AT91_USB_CLK)+= clk-usb.o
+obj-$(CONFIG_HAVE_AT91_SMD)+= clk-smd.o
diff --git a/drivers/clk/at91/clk-smd.c b/drivers/clk/at91/clk-smd.c
new file mode 100644
index 000..9f3fa39
--- /dev/null
+++ b/drivers/clk/at91/clk-smd.c
@@ -0,0 +1,173 @@
+/*
+ * drivers/clk/at91/clk-smd.c
+ *
+ *  Copyright (C) 2013 Boris BREZILLON 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pmc.h"
+
+#define SMD_SOURCE_MAX 2
+
+#define SMD_DIV_SHIFT  8
+#define SMD_MAX_DIV0xf
+
+struct at91sam9x5_clk_smd {
+   struct clk_hw hw;
+   struct at91_pmc *pmc;
+};
+
+#define to_at91sam9x5_clk_smd(hw) \
+   container_of(hw, struct at91sam9x5_clk_smd, hw)
+
+static unsigned long at91sam9x5_clk_smd_recalc_rate(struct clk_hw *hw,
+   unsigned long parent_rate)
+{
+   u32 tmp;
+   u8 smddiv;
+   struct at91sam9x5_clk_smd *smd = to_at91sam9x5_clk_smd(hw);
+   struct at91_pmc *pmc = smd->pmc;
+
+   tmp = pmc_read(pmc, AT91_PMC_SMD);
+   smddiv = (tmp & AT91_PMC_SMD_DIV) >> SMD_DIV_SHIFT;
+   return parent_rate / (smddiv + 1);
+}
+
+static long at91sam9x5_clk_smd_round_rate(struct clk_hw *hw, unsigned long rate,
+ unsigned long *parent_rate)
+{
+   unsigned long div;
+   unsigned long bestrate;
+   unsigned long tmp;
+
+   if (rate >= *parent_rate)
+   return *parent_rate;
+
+   div = *parent_rate / rate;
+   if (div > SMD_MAX_DIV)
+   return *parent_rate / (SMD_MAX_DIV + 1);
+
+   bestrate = *parent_rate / div;
+   tmp = *parent_rate / (div + 1);
+   if (bestrate - rate > rate - tmp)
+   bestrate = tmp;
+
+   return bestrate;
+}
+
+static int at91sam9x5_clk_smd_set_parent(struct clk_hw *hw, u8 index)
+{
+   u32 tmp;
+   struct at91sam9x5_clk_smd *smd = to_at91sam9x5_clk_smd(hw);
+   struct at91_pmc *pmc = smd->pmc;
+
+   if (index > 1)
+   return -EINVAL;
+   tmp = pmc_read(pmc, AT91_PMC_SMD) & ~AT91_PMC_SMDS;
+   if (index)
+   tmp |= AT91_PMC_SMDS;
+   pmc_write(pmc, AT91_PMC_SMD, tmp);
+   return 0;
+}
+
+static u8 at91sam9x5_clk_smd_get_parent(struct clk_hw *hw)
+{
+   struct at91sam9x5_clk_smd *smd = to_at91sam9x5_clk_smd(hw);
+   struct at91_pmc *pmc = smd->pmc;
+
+   return pmc_read(pmc, AT91_PMC_SMD) & AT91_PMC_SMDS;
+}
+
+static int at91sam9x5_clk_smd_set_rate(struct clk_hw *hw, unsigned long rate,
+  unsigned long parent_rate)
+{
+   u32 tmp;
+   struct at91sam9x5_clk_smd *smd = to_at91sam9x5_clk_smd(hw);
+   struct at91_pmc *pmc = smd->pmc;
+   unsigned long div = parent_rate / rate;
+
+   if (parent_rate % rate || div < 1 || div > (SMD_MAX_DIV + 1))
+   return -EINVAL;
+   tmp = pmc_read(pmc, AT91_PMC_SMD) & 

[PATCH] tools, perf: remove trivial extra semicolon

2013-11-12 Thread Davidlohr Bueso
Accidentally ran into these, get rid of them.

Signed-off-by: Davidlohr Bueso 
---
 tools/perf/ui/browser.c  | 2 +-
 tools/perf/util/evlist.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index bbc782e..3648d4e 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -680,7 +680,7 @@ static void __ui_browser__line_arrow_down(struct ui_browser *browser,
if (end >= browser->top_idx + browser->height)
end_row = browser->height - 1;
else
-   end_row = end - browser->top_idx;;
+   end_row = end - browser->top_idx;
 
ui_browser__gotorc(browser, row, column);
SLsmg_draw_vline(end_row - row + 1);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b939221..1038f4a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1135,7 +1135,7 @@ size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
   perf_evsel__name(evsel));
}
 
-   return printed + fprintf(fp, "\n");;
+   return printed + fprintf(fp, "\n");
 }
 
 int perf_evlist__strerror_tp(struct perf_evlist *evlist __maybe_unused,
-- 
1.8.1.4





Re: [PATCH] staging: zsmalloc: Ensure handle is never 0 on success

2013-11-12 Thread Nitin Gupta

On 11/12/13, 6:42 PM, Greg KH wrote:

On Wed, Nov 13, 2013 at 12:41:38AM +0900, Minchan Kim wrote:

We have spent a lot of time unable to enhance zram because it has been in
staging, and Greg never wants improvements without promotion.


It's not "improve", it's "Greg does not want you adding new features and
functionality while the code is in staging."  I want you to spend your
time on getting it out of staging first.

Now if something needs to be done based on review and comments to the
code, then that's fine to do and I'll accept that, but I've been seeing
new functionality be added to the code, which I will not accept because
it seems that you all have given up on getting it merged, which isn't
ok.



It's not that people have given up on getting it merged but every time 
patches are posted, there is really no response from maintainers perhaps 
due to their lack of interest in embedded, or perhaps they believe 
embedded folks are making a wrong choice by using zram. Either way, a 
final word, instead of just silence would be more helpful.


Thanks,
Nitin



[PATCH v5 13/17] clk: at91: add PMC usb clock

2013-11-12 Thread Boris BREZILLON
This patch adds new at91 usb clock implementation using common clk framework.
This clock is used to clock usb ports (ohci, ehci and udc).

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 arch/arm/mach-at91/Kconfig |   11 ++
 drivers/clk/at91/Makefile  |1 +
 drivers/clk/at91/clk-usb.c |  400 
 drivers/clk/at91/pmc.c |   15 ++
 drivers/clk/at91/pmc.h |9 +
 5 files changed, 436 insertions(+)
 create mode 100644 drivers/clk/at91/clk-usb.c

diff --git a/arch/arm/mach-at91/Kconfig b/arch/arm/mach-at91/Kconfig
index 6ad37da..b76dc4c 100644
--- a/arch/arm/mach-at91/Kconfig
+++ b/arch/arm/mach-at91/Kconfig
@@ -3,6 +3,9 @@ if ARCH_AT91
 config HAVE_AT91_UTMI
bool
 
+config HAVE_AT91_USB_CLK
+   bool
+
 config HAVE_AT91_DBGU0
bool
 
@@ -82,6 +85,7 @@ config SOC_SAMA5D3
select HAVE_AT91_DBGU1
select AT91_USE_OLD_CLK
select HAVE_AT91_UTMI
+   select HAVE_AT91_USB_CLK
help
  Select this if you are using one of Atmel's SAMA5D3 family SoC.
  This support covers SAMA5D31, SAMA5D33, SAMA5D34, SAMA5D35.
@@ -96,12 +100,14 @@ config SOC_AT91RM9200
select MULTI_IRQ_HANDLER
select SPARSE_IRQ
select AT91_USE_OLD_CLK
+   select HAVE_AT91_USB_CLK
 
 config SOC_AT91SAM9260
bool "AT91SAM9260, AT91SAM9XE or AT91SAM9G20"
select HAVE_AT91_DBGU0
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
+   select HAVE_AT91_USB_CLK
help
  Select this if you are using one of Atmel's AT91SAM9260, AT91SAM9XE
  or AT91SAM9G20 SoC.
@@ -112,6 +118,7 @@ config SOC_AT91SAM9261
select HAVE_FB_ATMEL
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
+   select HAVE_AT91_USB_CLK
help
  Select this if you are using one of Atmel's AT91SAM9261 or AT91SAM9G10 SoC.
 
@@ -121,6 +128,7 @@ config SOC_AT91SAM9263
select HAVE_FB_ATMEL
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
+   select HAVE_AT91_USB_CLK
 
 config SOC_AT91SAM9RL
bool "AT91SAM9RL"
@@ -137,6 +145,7 @@ config SOC_AT91SAM9G45
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
select HAVE_AT91_UTMI
+   select HAVE_AT91_USB_CLK
help
  Select this if you are using one of Atmel's AT91SAM9G45 family SoC.
  This support covers AT91SAM9G45, AT91SAM9G46, AT91SAM9M10 and AT91SAM9M11.
@@ -148,6 +157,7 @@ config SOC_AT91SAM9X5
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
select HAVE_AT91_UTMI
+   select HAVE_AT91_USB_CLK
help
  Select this if you are using one of Atmel's AT91SAM9x5 family SoC.
  This means that your SAM9 name finishes with a '5' (except if it is
@@ -161,6 +171,7 @@ config SOC_AT91SAM9N12
select HAVE_FB_ATMEL
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
+   select HAVE_AT91_USB_CLK
help
  Select this if you are using Atmel's AT91SAM9N12 SoC.
 
diff --git a/drivers/clk/at91/Makefile b/drivers/clk/at91/Makefile
index a824883..61db058 100644
--- a/drivers/clk/at91/Makefile
+++ b/drivers/clk/at91/Makefile
@@ -8,3 +8,4 @@ obj-y += clk-system.o clk-peripheral.o
 
 obj-$(CONFIG_AT91_PROGRAMMABLE_CLOCKS) += clk-programmable.o
 obj-$(CONFIG_HAVE_AT91_UTMI)   += clk-utmi.o
+obj-$(CONFIG_HAVE_AT91_USB_CLK)+= clk-usb.o
diff --git a/drivers/clk/at91/clk-usb.c b/drivers/clk/at91/clk-usb.c
new file mode 100644
index 000..0454555
--- /dev/null
+++ b/drivers/clk/at91/clk-usb.c
@@ -0,0 +1,400 @@
+/*
+ * drivers/clk/at91/clk-usb.c
+ *
+ *  Copyright (C) 2013 Boris BREZILLON 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pmc.h"
+
+#define USB_SOURCE_MAX 2
+
+#define SAM9X5_USB_DIV_SHIFT   8
+#define SAM9X5_USB_MAX_DIV 0xf
+
+#define RM9200_USB_DIV_SHIFT   28
+#define RM9200_USB_DIV_TAB_SIZE4
+
+struct at91sam9x5_clk_usb {
+   struct clk_hw hw;
+   struct at91_pmc *pmc;
+};
+
+#define to_at91sam9x5_clk_usb(hw) \
+   container_of(hw, struct at91sam9x5_clk_usb, hw)
+
+struct at91rm9200_clk_usb {
+   struct clk_hw hw;
+   struct at91_pmc *pmc;
+   u32 divisors[4];
+};
+
+#define to_at91rm9200_clk_usb(hw) \
+   container_of(hw, struct at91rm9200_clk_usb, hw)
+
+static unsigned long at91sam9x5_clk_usb_recalc_rate(struct clk_hw *hw,
+   unsigned long parent_rate)
+{
+   u32 tmp;
+   u8 usbdiv;
+   struct at91sam9x5_clk_usb *usb = to_at91sam9x5_clk_usb(hw);
+   struct at91_pmc *pmc = usb->pmc;
+
+   tmp = pmc_read(pmc, AT91_PMC_USB);
+   usbdiv = 

[PATCH v5 12/17] clk: at91: add PMC utmi clock

2013-11-12 Thread Boris BREZILLON
This adds new at91 utmi clock implementation using common clk framework.

This clock is a pll with a fixed factor (x40).
It is used as a source for usb clock.
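
As a rough illustration of the fixed-factor relationship described above, the sketch below recomputes the UTMI rate in plain C, outside the kernel. Only the x40 factor comes from the patch; the function name `utmi_rate` and the 12 MHz parent used in the note are assumptions made for the example.

```c
/*
 * Userspace sketch of a fixed-factor PLL rate calculation.
 * UTMI_FIXED_MUL mirrors the constant in the patch; utmi_rate()
 * is a hypothetical helper, not a function from the driver.
 */
#define UTMI_FIXED_MUL 40

/* Return the UTMI PLL output rate in Hz for a given parent rate in Hz. */
unsigned long utmi_rate(unsigned long parent_rate)
{
	return parent_rate * UTMI_FIXED_MUL;
}
```

With an assumed 12 MHz parent, utmi_rate(12000000UL) gives 480000000, i.e. 480 MHz.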

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 arch/arm/mach-at91/Kconfig  |7 ++
 drivers/clk/at91/Makefile   |1 +
 drivers/clk/at91/clk-utmi.c |  162 +++
 drivers/clk/at91/pmc.c  |7 ++
 drivers/clk/at91/pmc.h  |5 ++
 5 files changed, 182 insertions(+)
 create mode 100644 drivers/clk/at91/clk-utmi.c

diff --git a/arch/arm/mach-at91/Kconfig b/arch/arm/mach-at91/Kconfig
index 85b53a4..6ad37da 100644
--- a/arch/arm/mach-at91/Kconfig
+++ b/arch/arm/mach-at91/Kconfig
@@ -1,5 +1,8 @@
 if ARCH_AT91
 
+config HAVE_AT91_UTMI
+   bool
+
 config HAVE_AT91_DBGU0
bool
 
@@ -78,6 +81,7 @@ config SOC_SAMA5D3
select HAVE_FB_ATMEL
select HAVE_AT91_DBGU1
select AT91_USE_OLD_CLK
+   select HAVE_AT91_UTMI
help
  Select this if you are using one of Atmel's SAMA5D3 family SoC.
  This support covers SAMA5D31, SAMA5D33, SAMA5D34, SAMA5D35.
@@ -124,6 +128,7 @@ config SOC_AT91SAM9RL
select HAVE_FB_ATMEL
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
+   select HAVE_AT91_UTMI
 
 config SOC_AT91SAM9G45
bool "AT91SAM9G45 or AT91SAM9M10 families"
@@ -131,6 +136,7 @@ config SOC_AT91SAM9G45
select HAVE_FB_ATMEL
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
+   select HAVE_AT91_UTMI
help
  Select this if you are using one of Atmel's AT91SAM9G45 family SoC.
  This support covers AT91SAM9G45, AT91SAM9G46, AT91SAM9M10 and AT91SAM9M11.
@@ -141,6 +147,7 @@ config SOC_AT91SAM9X5
select HAVE_FB_ATMEL
select SOC_AT91SAM9
select AT91_USE_OLD_CLK
+   select HAVE_AT91_UTMI
help
  Select this if you are using one of Atmel's AT91SAM9x5 family SoC.
  This means that your SAM9 name finishes with a '5' (except if it is
diff --git a/drivers/clk/at91/Makefile b/drivers/clk/at91/Makefile
index 3873b62..a824883 100644
--- a/drivers/clk/at91/Makefile
+++ b/drivers/clk/at91/Makefile
@@ -7,3 +7,4 @@ obj-y += clk-main.o clk-pll.o clk-plldiv.o clk-master.o
 obj-y += clk-system.o clk-peripheral.o
 
 obj-$(CONFIG_AT91_PROGRAMMABLE_CLOCKS) += clk-programmable.o
+obj-$(CONFIG_HAVE_AT91_UTMI)   += clk-utmi.o
diff --git a/drivers/clk/at91/clk-utmi.c b/drivers/clk/at91/clk-utmi.c
new file mode 100644
index 000..9a133df
--- /dev/null
+++ b/drivers/clk/at91/clk-utmi.c
@@ -0,0 +1,162 @@
+/*
+ * drivers/clk/at91/clk-utmi.c
+ *
+ *  Copyright (C) 2013 Boris BREZILLON 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pmc.h"
+
+#define UTMI_FIXED_MUL 40
+
+struct clk_utmi {
+   struct clk_hw hw;
+   struct at91_pmc *pmc;
+   unsigned int irq;
+   wait_queue_head_t wait;
+};
+
+#define to_clk_utmi(hw) container_of(hw, struct clk_utmi, hw)
+
+static irqreturn_t clk_utmi_irq_handler(int irq, void *dev_id)
+{
+   struct clk_utmi *utmi = (struct clk_utmi *)dev_id;
+
+   wake_up(&utmi->wait);
+   disable_irq_nosync(utmi->irq);
+
+   return IRQ_HANDLED;
+}
+
+static int clk_utmi_prepare(struct clk_hw *hw)
+{
+   struct clk_utmi *utmi = to_clk_utmi(hw);
+   struct at91_pmc *pmc = utmi->pmc;
+   u32 tmp = at91_pmc_read(AT91_CKGR_UCKR) | AT91_PMC_UPLLEN |
+ AT91_PMC_UPLLCOUNT | AT91_PMC_BIASEN;
+
+   pmc_write(pmc, AT91_CKGR_UCKR, tmp);
+
+   while (!(pmc_read(pmc, AT91_PMC_SR) & AT91_PMC_LOCKU)) {
+   enable_irq(utmi->irq);
+   wait_event(utmi->wait,
+  pmc_read(pmc, AT91_PMC_SR) & AT91_PMC_LOCKU);
+   }
+
+   return 0;
+}
+
+static int clk_utmi_is_ready(struct clk_hw *hw)
+{
+   struct clk_utmi *utmi = to_clk_utmi(hw);
+   struct at91_pmc *pmc = utmi->pmc;
+
+   return !!(pmc_read(pmc, AT91_PMC_SR) & AT91_PMC_LOCKU);
+}
+
+static void clk_utmi_disable(struct clk_hw *hw)
+{
+   struct clk_utmi *utmi = to_clk_utmi(hw);
+   struct at91_pmc *pmc = utmi->pmc;
+   u32 tmp = at91_pmc_read(AT91_CKGR_UCKR) & ~AT91_PMC_UPLLEN;
+
+   pmc_write(pmc, AT91_CKGR_UCKR, tmp);
+}
+
+static unsigned long clk_utmi_recalc_rate(struct clk_hw *hw,
+ unsigned long parent_rate)
+{
+   /* UTMI clk is a fixed clk multiplier */
+   return parent_rate * UTMI_FIXED_MUL;
+}
+
+static const struct clk_ops utmi_ops = {
+   .prepare = clk_utmi_prepare,
+   .is_prepared = clk_utmi_is_ready,
+   .disable = 

Re: [PATCH] gpio: Renesas RZ GPIO driver

2013-11-12 Thread Magnus Damm
Hi Linus,

On Wed, Nov 13, 2013 at 4:59 AM, Linus Walleij  wrote:
> On Thu, Nov 7, 2013 at 12:47 AM, Magnus Damm  wrote:
>
>> From: Magnus Damm 
>>
>> This patch adds a GPIO driver for the RZ series of SoCs from
>> Renesas. The driver can be used as platform device with dynamic
>> or static GPIO assignment or via DT using dynamic GPIOs.
>
> So given that this is for a new system which should only ever
> be booted using device tree, why are we bothering with supporting
> platform data passing at all?

Mainly to support the same interfaces as our other GPIO drivers. But I
can easily remove the platform data init method if that is the
preferred way, no problem.

> Is it so that arch/sh is more soft on this for example...?
> Can some arch maintainer like SH/Paul ACK this approach?
>
> Read: SH is not moving to device tree...?

From what I can tell this GPIO block is not used with SH, so I don't
think SH is related, but regarding DT on SH, do you know when it was
decided that other architectures were also supposed to move to DT?

> (...)
>> Tested with yet-to-be-posted platform device and DT devices on
>> r7s72100 and Genmai using LEDs, DIP switches and I2C bitbang.
>
> Do you think the maintainers will merge the platform
> device approach?

I would not assume so. But the goal with these patches is not
upstream; instead they basically serve as a stop-gap solution between
now and when I get an OK that the DT bits in this GPIO driver look fine.
Whether they are going to be merged or not is a different question IMO.

>> --- /dev/null
>> +++ work/include/linux/platform_data/gpio-rz.h  2013-11-06 14:18:46.0 +0900
>> @@ -0,0 +1,13 @@
>> +#ifndef __GPIO_RZ_H__
>> +#define __GPIO_RZ_H__
>> +
>> +struct gpio_rz_config {
>> +   int gpio_base;
>
> Passing these static base offsets around is not good for the
> kernel and we're trying to get rid of it :-(

Sure.

>> +   const char *pctl_name;
>
> Ho hum... This needs some kerneldoc describing that this is
> used to map the GPIO range to the right pin controller.

Ok, but it will just go away when the platform data init method is removed.

>> +#define RZ_GPIOS_PER_PORT 16
>
> This is only used in the driver so move it into the driver.

It is also used by the macro below. =)

>> +#define RZ_PORT_PIN(bank, pin) (((bank) * RZ_GPIOS_PER_PORT) + (pin))
>
> This is not used anywhere so delete it.
>
> If it is to be kept I'd like "pin" replaced with "line" to avoid
> confusion with the pin control business.

The idea with that macro is to allow board code to select which pin
from which bank, but I realize it may clash with pinctrl terminology.
Also, since the only reason for this header file is to provide a
platform data init interface for the driver all these things will go
away if we go DT-only.
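
For reference, a minimal sketch of how the macro quoted above maps a bank and a pin within that bank to a flat GPIO number. The two macros are copied from the quoted header; the inverse helpers `rz_line_to_bank` and `rz_line_to_pin` are hypothetical and only added to show the mapping in both directions.

```c
/* Copied from the quoted platform data header. */
#define RZ_GPIOS_PER_PORT 16
#define RZ_PORT_PIN(bank, pin) (((bank) * RZ_GPIOS_PER_PORT) + (pin))

/* Hypothetical inverse helpers, not part of the posted patch. */
unsigned int rz_line_to_bank(unsigned int line)
{
	return line / RZ_GPIOS_PER_PORT;
}

unsigned int rz_line_to_pin(unsigned int line)
{
	return line % RZ_GPIOS_PER_PORT;
}
```

For example, RZ_PORT_PIN(2, 5) evaluates to 37, and the helpers map 37 back to bank 2, pin 5.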

I'll ditch the platform data interface and post a V2.

Thanks for your feedback!

Cheers,

/ magnus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v5 11/17] clk: at91: add PMC programmable clocks

2013-11-12 Thread Boris BREZILLON
This patch adds new at91 programmable clocks implementation using common clk
framework.
A programmable clock is a clock which can be exported on a given pin to clock
external devices.
Each programmable clock is given an id (from 0 to 8).
The number of available programmable clocks depends on the SoC you're using.
Programmable clock driver only implements the clock setting (clock rate and
parent setting). It must be chained to a system clock in order to
enable/disable the generated clock.
The PCKX pins used to output the clock signals must be assigned to the
appropriate peripheral (see atmel's datasheets).
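
The rate setting mentioned above amounts to picking a power-of-two prescaler. The sketch below is one possible way to do that in plain C. It is not the round_rate code from this patch (which is cut off in this archive); `PROG_MAX_SHIFT` and `prog_best_shift` are hypothetical names, and the bound of 6 is an assumption taken from the visible loop condition (shift < PROG_PRES_MASK).

```c
/* Assumed largest usable prescaler shift (divide by 64). */
#define PROG_MAX_SHIFT 6

/*
 * Return the shift in [0, PROG_MAX_SHIFT] for which
 * parent_rate >> shift is closest to the requested rate.
 * On a tie the smaller shift (higher rate) wins.
 */
unsigned int prog_best_shift(unsigned long parent_rate, unsigned long rate)
{
	unsigned int best_shift = 0;
	unsigned long best_diff = (unsigned long)-1;
	unsigned int shift;

	for (shift = 0; shift <= PROG_MAX_SHIFT; shift++) {
		unsigned long cur = parent_rate >> shift;
		unsigned long diff = cur > rate ? cur - rate : rate - cur;

		if (diff < best_diff) {
			best_diff = diff;
			best_shift = shift;
		}
	}

	return best_shift;
}
```

With a 12 MHz parent and a 3 MHz request this returns 2, since 12000000 >> 2 matches the request exactly.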

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 drivers/clk/at91/Makefile   |2 +
 drivers/clk/at91/clk-programmable.c |  368 +++
 drivers/clk/at91/pmc.c  |   15 ++
 drivers/clk/at91/pmc.h  |9 +
 4 files changed, 394 insertions(+)
 create mode 100644 drivers/clk/at91/clk-programmable.c

diff --git a/drivers/clk/at91/Makefile b/drivers/clk/at91/Makefile
index 04deba3..3873b62 100644
--- a/drivers/clk/at91/Makefile
+++ b/drivers/clk/at91/Makefile
@@ -5,3 +5,5 @@
 obj-y += pmc.o
 obj-y += clk-main.o clk-pll.o clk-plldiv.o clk-master.o
 obj-y += clk-system.o clk-peripheral.o
+
+obj-$(CONFIG_AT91_PROGRAMMABLE_CLOCKS) += clk-programmable.o
diff --git a/drivers/clk/at91/clk-programmable.c b/drivers/clk/at91/clk-programmable.c
new file mode 100644
index 000..1daa05f
--- /dev/null
+++ b/drivers/clk/at91/clk-programmable.c
@@ -0,0 +1,368 @@
+/*
+ * drivers/clk/at91/clk-programmable.c
+ *
+ *  Copyright (C) 2013 Boris BREZILLON 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pmc.h"
+
+#define PROG_SOURCE_MAX5
+#define PROG_ID_MAX7
+
+#define PROG_STATUS_MASK(id)   (1 << ((id) + 8))
+#define PROG_PRES_MASK 0x7
+#define PROG_MAX_RM9200_CSS3
+
+struct clk_programmable_layout {
+   u8 pres_shift;
+   u8 css_mask;
+   u8 have_slck_mck;
+};
+
+struct clk_programmable {
+   struct clk_hw hw;
+   struct at91_pmc *pmc;
+   unsigned int irq;
+   wait_queue_head_t wait;
+   u8 id;
+   u8 css;
+   u8 pres;
+   u8 slckmck;
+   const struct clk_programmable_layout *layout;
+};
+
+#define to_clk_programmable(hw) container_of(hw, struct clk_programmable, hw)
+
+
+static irqreturn_t clk_programmable_irq_handler(int irq, void *dev_id)
+{
+   struct clk_programmable *prog = (struct clk_programmable *)dev_id;
+
+   wake_up(&prog->wait);
+
+   return IRQ_HANDLED;
+}
+
+static int clk_programmable_prepare(struct clk_hw *hw)
+{
+   u32 tmp;
+   struct clk_programmable *prog = to_clk_programmable(hw);
+   struct at91_pmc *pmc = prog->pmc;
+   const struct clk_programmable_layout *layout = prog->layout;
+   u8 id = prog->id;
+   u32 mask = PROG_STATUS_MASK(id);
+
+   tmp = prog->css | (prog->pres << layout->pres_shift);
+   if (layout->have_slck_mck && prog->slckmck)
+   tmp |= AT91_PMC_CSSMCK_MCK;
+
+   pmc_write(pmc, AT91_PMC_PCKR(id), tmp);
+
+   while (!(pmc_read(pmc, AT91_PMC_SR) & mask))
+   wait_event(prog->wait, pmc_read(pmc, AT91_PMC_SR) & mask);
+
+   return 0;
+}
+
+static int clk_programmable_is_ready(struct clk_hw *hw)
+{
+   struct clk_programmable *prog = to_clk_programmable(hw);
+   struct at91_pmc *pmc = prog->pmc;
+
+   return !!(pmc_read(pmc, AT91_PMC_SR) & AT91_PMC_PCKR(prog->id));
+}
+
+static unsigned long clk_programmable_recalc_rate(struct clk_hw *hw,
+ unsigned long parent_rate)
+{
+   u32 tmp;
+   struct clk_programmable *prog = to_clk_programmable(hw);
+   struct at91_pmc *pmc = prog->pmc;
+   const struct clk_programmable_layout *layout = prog->layout;
+
+   tmp = pmc_read(pmc, AT91_PMC_PCKR(prog->id));
+   prog->pres = (tmp >> layout->pres_shift) & PROG_PRES_MASK;
+
+   return parent_rate >> prog->pres;
+}
+
+static long clk_programmable_round_rate(struct clk_hw *hw, unsigned long rate,
+   unsigned long *parent_rate)
+{
+   unsigned long best_rate = *parent_rate;
+   unsigned long best_diff;
+   unsigned long new_diff;
+   unsigned long cur_rate;
+   int shift = shift;
+
+   if (rate > *parent_rate)
+   return *parent_rate;
+   else
+   best_diff = *parent_rate - rate;
+
+   if (!best_diff)
+   return best_rate;
+
+   for (shift = 1; shift < PROG_PRES_MASK; shift++) {
+   cur_rate = *parent_rate >> shift;
+

[PATCH v5 10/17] clk: at91: add peripheral clk macros for peripheral clk dt bindings

2013-11-12 Thread Boris BREZILLON
This patch adds the peripheral divisors macros (for sam9x5 compatible IPs)
which will be used by peripheral clk dt definitions.
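
A small sketch of how these divisor macros could be consumed. It assumes, based only on the macro names and values in the diff below (DIV1 = 0 up to DIV8 = 3), that the value is the power-of-two exponent of the divisor; `periph_rate` is a hypothetical helper, not part of this patch.

```c
/* Values as defined in the dt-bindings header added by this patch. */
#define AT91SAM9X5_PERIPH_CLK_DIV1 0
#define AT91SAM9X5_PERIPH_CLK_DIV2 1
#define AT91SAM9X5_PERIPH_CLK_DIV4 2
#define AT91SAM9X5_PERIPH_CLK_DIV8 3

/* Assumption: div_sel is a right-shift count applied to the parent rate. */
unsigned long periph_rate(unsigned long parent_rate, unsigned int div_sel)
{
	return parent_rate >> div_sel;
}
```

Under that assumption, a 133 MHz parent with AT91SAM9X5_PERIPH_CLK_DIV4 gives 33250000 Hz.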

Signed-off-by: Boris BREZILLON 
Acked-by: Nicolas Ferre 
---
 include/dt-bindings/clk/at91.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/dt-bindings/clk/at91.h b/include/dt-bindings/clk/at91.h
index 0b4cb99..a3b07ca 100644
--- a/include/dt-bindings/clk/at91.h
+++ b/include/dt-bindings/clk/at91.h
@@ -19,4 +19,10 @@
 #define AT91_PMC_MOSCRCS   17  /* Main On-Chip RC */
 #define AT91_PMC_CFDEV 18  /* Clock Failure Detector Event */
 
+/* sam9x5 peripheral divisors */
+#define AT91SAM9X5_PERIPH_CLK_DIV1 0
+#define AT91SAM9X5_PERIPH_CLK_DIV2 1
+#define AT91SAM9X5_PERIPH_CLK_DIV4 2
+#define AT91SAM9X5_PERIPH_CLK_DIV8 3
+
 #endif
-- 
1.7.9.5



Re: [PATCH] rds: Error on offset mismatch if not loopback

2013-11-12 Thread David Miller
From: Josh Hunt 
Date: Tue, 12 Nov 2013 22:22:11 -0600

> David - I can resubmit the patch with the proper signed-off-by and
> formatting if you are willing to apply it unless John wants to try again. I
> think it's time this got upstream.

Nothing is going to happen until the patch is submitted properly, so
just do, don't ask.


Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread Theodore Ts'o
On Tue, Nov 12, 2013 at 11:23:03PM -0500, Greg Price wrote:
> > The basic idea is that we don't want to break systems, but we do want
> > to gently coerce people to do the right thing.  Otherwise, I'm worried
> > that distros, or embedded/mobile/consumer electronics engineers would
> > just patch out the check.
> 
> That's a good idea.  I've worried about the same thing, but hadn't
> thought of that solution.

I think the key is that we set a default of requiring 128 bits, or 5
minutes, with boot-line options to change the defaults.  BTW, with the
changes that are scheduled for 3.13, this shouldn't be a problem on
most desktops.  From my T430s laptop:

...
[4.446047] random: nonblocking pool is initialized
[4.542119] usb 3-1.6: New USB device found, idVendor=04f2, idProduct=b2da
[4.542124] usb 3-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[4.542128] usb 3-1.6: Product: Integrated Camera
[4.542131] usb 3-1.6: Manufacturer: Chicony Electronics Co., Ltd.
[4.575753] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[4.653338] udevd[462]: starting version 175
...
[6.253131] EXT4-fs (sdc3): re-mounted. Opts: (null)

So even without adding device attach times (which is on the todo list)
the /dev/urandom pool is getting an estimated 128 bits of entropy
almost two seconds *before* the root file system is remounted
read/write.

(And this is also before fixing the rc80211 minstrel code to stop
wasting about two dozen bits of entropy at startup --- it's using
get_random_bytes even though it doesn't actually need
cryptographically secure random numbers.)

This is why I've been working on improving the random driver's efficiency
in getting the urandom pool initialized as soon as possible, as a higher
priority than adding blocking-on-boot for /dev/urandom.
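
To make the "128 bits, or 5 minutes" default discussed above concrete, here is a minimal userspace sketch of such a gate. It is not kernel code and nothing like it appears in this thread; `struct urandom_gate` and `urandom_may_read` are hypothetical names, and reading the proposal as "whichever condition is met first" is an assumption.

```c
#include <stdbool.h>

/* Hypothetical policy knobs; the defaults would be overridable at boot. */
struct urandom_gate {
	unsigned int min_entropy_bits;	/* e.g. 128 */
	unsigned int max_wait_secs;	/* e.g. 300, i.e. 5 minutes */
};

/*
 * Allow reads once the entropy estimate reaches the threshold, or once
 * the system has been up long enough that blocking further would only
 * break things.
 */
bool urandom_may_read(const struct urandom_gate *gate,
		      unsigned int est_entropy_bits,
		      unsigned int uptime_secs)
{
	return est_entropy_bits >= gate->min_entropy_bits ||
	       uptime_secs >= gate->max_wait_secs;
}
```

With the defaults above, a reader at 4 seconds of uptime and an estimate of 64 bits would still wait, while one at 128 bits would not.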

> And, pray tell, how will you know that you have done that?
> 
> Even the best entropy estimation algorithms are nothing but estimations,
> and min-entropy is the hardest form of entropy to estimate.

Of course it's only an estimate.  Some researchers have looked into
this and their results show that at least for x86 desktop/servers, we
appear to be conservative enough in our entropy estimation.  But
ultimately, yes, that is an issue which I am concerned about.  But I
believe that's a separable problem that we can work on separately from
other /dev/random issues --- and I'm hoping we can get some students
to study this problem on a variety of different hardware platforms and
entropy sources.

- Ted




Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread Greg Price
On Tue, Nov 12, 2013 at 08:51:18PM -0800, H. Peter Anvin wrote:
> On 11/12/2013 08:37 PM, Greg Price wrote:
> > I'm thinking only of boot-time blocking.  The idea is that once
> > /dev/urandom is seeded with, say, 128 bits of min-entropy in the
> > absolute, information-theoretic sense, it can produce an infinite
> > supply (or something like 2^128 bits, which amounts to the same thing)
> > of bits that can't be distinguished from random, short of breaking or
> > brute-forcing the crypto.  So once it's seeded, it's good forever.
> 
> And, pray tell, how will you know that you have done that?
> 
> Even the best entropy estimation algorithms are nothing but estimations,

Indeed.  We do our best, but we can't be sure we have as much entropy
as we think.

The status quo here is that /dev/urandom will cheerfully answer
requests even when, by our own estimates, we only have a small amount
of entropy and anything we return will be guessable.  What Ted and I
are discussing in this thread is to have it wait until, as best we can
estimate, it has enough entropy to give an unpredictable answer.  The
status quo has the same effect as an arbitrarily too-optimistic
estimate.

The key point when it comes to the question of going *back* to
blocking is that even if the estimates are bad and in reality the
answer is guessable, it won't get any *more* guessable in the future.
If we think we have 128 bits of input min-entropy but we only have
(say) 32, meaning some states we could be in are as likely as 2^(-32),
then once an attacker sees a handful of bytes of output (*) they can
check a guess at our input and predict all our other output with as
few as 2^32 guesses, depending on the distribution.  If the attacker
sees a gigabyte or a petabyte of output, they have exactly the same
ability.  So there's no good reason to stop.

On the other hand, because our estimates may be wrong it certainly
makes sense to keep feeding new entropy into the pool.  Maybe a later
seeding will have enough real entropy to make us unpredictable from
then on.  We could also use bigger and bigger reseeds, as a hedge
against our estimates being systematically too low in some
environment.

Does that make sense?  Do you have other ideas for guarding against
the case where our estimates are low?

Greg


(*) Math note: the attacker morally needs only 32 bits.  They actually
need a little more than that, because some of the (at least) 2^32
possible states probably correspond to the same first 32 bits of
output.  By standard probability bounds, for any given set of 2^32
possible input states, if the generator is good then probably no more
than ln(2^32) = 22 or so of them correspond to the same first 32 bits.
About 37 bits of output is enough to probably make all the outputs
different, and with even 64 bits = 8 bytes of output it becomes
overwhelmingly likely that all the outputs are different.  If there
are more than 2^32 possible states because the min-entropy is 32 but
some inputs are less likely, then the attacker needs even less output
to be able to confirm the most likely guesses.
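
A simplified way to put numbers on the note above: if there are 2^s equally likely candidate states and the attacker has seen b bits of output, roughly 2^(s - b) wrong candidates are expected to match that output by chance. The sketch below computes that figure. It is a per-guess approximation assumed here for illustration, not the probability bound used in the note, and `expected_false_matches` is a hypothetical name.

```c
/*
 * Expected number of wrong candidate states whose output matches the
 * observed bits by chance, i.e. 2^(state_bits - output_bits).
 * Computed by repeated doubling or halving, so the result is exact
 * for the small exponents used here.
 */
double expected_false_matches(unsigned int state_bits,
			      unsigned int output_bits)
{
	double v = 1.0;
	unsigned int i;

	if (state_bits >= output_bits) {
		for (i = 0; i < state_bits - output_bits; i++)
			v *= 2.0;
	} else {
		for (i = 0; i < output_bits - state_bits; i++)
			v /= 2.0;
	}

	return v;
}
```

For 2^32 candidate states this gives 1 expected false match at 32 bits of output, 1/32 at 37 bits, and 2^-32 at 64 bits.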


[PATCH net V2 2/2] macvtap: limit head length of skb allocated

2013-11-12 Thread Jason Wang
We currently use hdr_len, which is advertised by the guest, as a hint for the
head length. But when the guest advertises a very big value, it can lead to a
64K+ kmalloc() allocation, which has a very high chance of failing when host
memory is fragmented or under heavy stress. A huge hdr_len also reduces the
effect of zerocopy, or even disables it, if a gso skb is linearized in the guest.

To solve these issues, this patch introduces an upper limit (PAGE_SIZE) on the
head, which guarantees an order 0 allocation each time.

Cc: Stefan Hajnoczi 
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
The patch was needed for stable.
Changes from V1:
- Check the linear size in macvtap_get_user() to avoid iov_pages() under
  estimation.
---
 drivers/net/macvtap.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 9dccb1e..dc76670 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -628,6 +628,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
const struct iovec *iv, unsigned long total_len,
size_t count, int noblock)
 {
+   int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
struct sk_buff *skb;
struct macvlan_dev *vlan;
unsigned long len = total_len;
@@ -670,6 +671,8 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 
if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
+   if (copylen > good_linear)
+   copylen = good_linear;
linear = copylen;
if (iov_pages(iv, vnet_hdr_len + copylen, count)
<= MAX_SKB_FRAGS)
@@ -678,7 +681,10 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 
if (!zerocopy) {
copylen = len;
-   linear = vnet_hdr.hdr_len;
+   if (vnet_hdr.hdr_len > good_linear)
+   linear = good_linear;
+   else
+   linear = vnet_hdr.hdr_len;
}
 
skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
-- 
1.8.3.2
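
The core of the change above is a simple clamp on the linear (head) length. The following userspace sketch shows just that step. `clamp_linear` is a hypothetical helper, and the `good_linear` argument stands in for the SKB_MAX_HEAD(...) value computed in the patch; no kernel headers or skb types are involved.

```c
#include <stddef.h>

/*
 * Cap the guest-advertised header length so the linear part of the
 * buffer never exceeds the page-sized bound passed in as good_linear.
 */
size_t clamp_linear(size_t hdr_len, size_t good_linear)
{
	return hdr_len > good_linear ? good_linear : hdr_len;
}
```

A small hdr_len passes through unchanged, while an oversized one (say 65535) is reduced to whatever bound the caller supplies.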



[PATCH net V2 1/2] tuntap: limit head length of skb allocated

2013-11-12 Thread Jason Wang
We currently use hdr_len, which is advertised by the guest, as a hint for the
head length. But when the guest advertises a very big value, it can lead to a
64K+ kmalloc() allocation, which has a very high chance of failing when host
memory is fragmented or under heavy stress. A huge hdr_len also reduces the
effect of zerocopy, or even disables it, if a gso skb is linearized in the guest.

To solve these issues, this patch introduces an upper limit (PAGE_SIZE) on the
head, which guarantees an order 0 allocation each time.

Cc: Stefan Hajnoczi 
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
The patch was needed for stable.
Changes from V1:
- check the linear size in tun_get_user() to avoid iov_pages() under estimation
---
 drivers/net/tun.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7cb105c..782e38b 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -981,6 +981,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
struct sk_buff *skb;
size_t len = total_len, align = NET_SKB_PAD, linear;
struct virtio_net_hdr gso = { 0 };
+   int good_linear;
int offset = 0;
int copylen;
bool zerocopy = false;
@@ -1021,12 +1022,16 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
return -EINVAL;
}
 
+   good_linear = SKB_MAX_HEAD(align);
+
if (msg_control) {
/* There are 256 bytes to be copied in skb, so there is
 * enough room for skb expand head in case it is used.
 * The rest of the buffer is mapped from userspace.
 */
copylen = gso.hdr_len ? gso.hdr_len : GOODCOPY_LEN;
+   if (copylen > good_linear)
+   copylen = good_linear;
linear = copylen;
if (iov_pages(iv, offset + copylen, count) <= MAX_SKB_FRAGS)
zerocopy = true;
@@ -1034,7 +1039,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 
if (!zerocopy) {
copylen = len;
-   linear = gso.hdr_len;
+   if (gso.hdr_len > good_linear)
+   linear = good_linear;
+   else
+   linear = gso.hdr_len;
}
 
skb = tun_alloc_skb(tfile, align, copylen, linear, noblock);
-- 
1.8.3.2



Re: [PATCH net 2/2] macvtap: limit head length of skb allocated

2013-11-12 Thread Jason Wang
On 11/13/2013 01:45 AM, Greg Rose wrote:
> On Tue, 12 Nov 2013 18:02:57 +0800
> Jason Wang  wrote:
>
>> We currently use hdr_len as a hint of head length which is advertised
>> by guest. But when guest advertise a very big value, it can lead to
>> an 64K+ allocating of kmalloc() which has a very high possibility of
>> failure when host memory is fragmented or under heavy stress. The
>> huge hdr_len also reduce the effect of zerocopy or even disable if a
>> gso skb is linearized in guest.
>>
>> To solves those issues, this patch introduces an upper limit
>> (PAGE_SIZE) of the head, which guarantees an order 0 allocation each
>> time.
>>
>> Cc: Stefan Hajnoczi 
>> Cc: Michael S. Tsirkin 
>> Signed-off-by: Jason Wang 
>> ---
>> The patch was needed for stable.
>> ---
>>  drivers/net/macvtap.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
>> index 9dccb1e..7ee6f9d 100644
>> --- a/drivers/net/macvtap.c
>> +++ b/drivers/net/macvtap.c
>> @@ -523,6 +523,11 @@ static inline struct sk_buff
>> *macvtap_alloc_skb(struct sock *sk, size_t prepad, int noblock, int
>> *err) {
>>  struct sk_buff *skb;
>> +int good_linear = SKB_MAX_HEAD(prepad);
>> +
>> +/* Don't use huge linear part */
>> +if (linear > good_linear)
>> +linear = good_linear;
>>  
>>  /* Under a page?  Don't bother with paged skb. */
>>  if (prepad + len < PAGE_SIZE || !linear)
> I see no problem with this or the tuntap patch except that in both
> cases kernel coding style would prefer that you align the local
> variable declarations in a reverse pyramid, longest at the beginning,
> shortest at the end.
>
> - Greg

Sure, will do it in V2.
Thanks


Re: [PATCH] dmaengine: Add support for BCM2835.

2013-11-12 Thread Vinod Koul
On Fri, Nov 08, 2013 at 06:17:43PM +, Mark Brown wrote:
> On Fri, Nov 08, 2013 at 06:22:34PM +0100, Florian Meier wrote:
> > Add support for DMA controller of BCM2835 as used in the Raspberry Pi.
> > Currently it only supports cyclic DMA for serving the I2S driver.
> 
> Adding in Martin Sperl who's been looking at DMA with regard to the SPI
> controller (which will want non-cyclic mode but I guess there's a lot of
> shared code).
Is there a plan to add a library in SPI for DMA ops, along the lines of what is
done in sound?

--
~Vinod




Re: [PATCH v2] sched: Check sched_domain before computing group power.

2013-11-12 Thread Srikar Dronamraju
* Peter Zijlstra  [2013-11-12 18:55:54]:

> On Tue, Nov 12, 2013 at 10:45:07PM +0530, Srikar Dronamraju wrote:
> > > 
> > > Hurm.. can you provide the actual topology of the machine that triggers
> > > this? My brain hurts trying to think through the weird cases of this
> > > code.
> > > 
> > 
> > Hope this helps. Please do let me know if you were looking for pdf output.
> 
> PDFs go into /dev/null..
> 
> the below misses the interesting bits; being the node distance table.
> Also a complete sched_debug domain print is useful.
> 

available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39
node 0 size: 64191 MB
node 0 free: 63194 MB
node 1 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47
node 1 size: 64481 MB
node 1 free: 63515 MB
node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55
node 2 size: 64481 MB
node 2 free: 63536 MB
node 3 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63
node 3 size: 63968 MB
node 3 free: 62981 MB
node distances:
node   0   1   2   3 
  0:  10  11  11  12 
  1:  11  10  12  11 
  2:  11  12  10  11 
  3:  12  11  11  10 
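
To relate the distance table above to the NUMA domain spans in the boot log that follows, here is a small userspace sketch that groups nodes by distance. It only approximates how the scheduler builds NUMA levels from distinct distance values; `nodes_within` is a hypothetical helper, not a kernel function.

```c
#define NR_NODES 4

/* Node distance table copied from the numactl output above. */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 11, 11, 12 },
	{ 11, 10, 12, 11 },
	{ 11, 12, 10, 11 },
	{ 12, 11, 11, 10 },
};

/* Return a bitmask of the nodes at distance <= max_dist from 'node'. */
unsigned int nodes_within(int node, int max_dist)
{
	unsigned int mask = 0;
	int i;

	for (i = 0; i < NR_NODES; i++) {
		if (node_distance[node][i] <= max_dist)
			mask |= 1u << i;
	}

	return mask;
}
```

From node 0, distance 11 covers nodes 0-2 (mask 0x7), which corresponds to the "span 0-23,32-55" NUMA domain reported for CPU0, and distance 12 covers all four nodes (mask 0xf), matching "span 0-63".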




x86: Booting SMP configuration:
 node  #0, CPUs:#1  #2  #3  #4  #5  #6  #7
 node  #1, CPUs:#8  #9 #10 #11 #12 #13 #14 #15
 node  #2, CPUs:   #16 #17 #18 #19 #20 #21 #22 #23
 node  #3, CPUs:   #24 #25 #26 #27 #28 #29 #30 #31
 node  #0, CPUs:   #32 #33 #34 #35 #36 #37 #38 #39
 node  #1, CPUs:   #40 #41 #42 #43 #44 #45 #46 #47
 node  #2, CPUs:   #48 #49 #50 #51 #52 #53 #54 #55
 node  #3, CPUs:   #56 #57 #58 #59 #60 #61 #62 #63
x86: Booted up 4 nodes, 64 CPUs
smpboot: Total of 64 processors activated (308393.92 BogoMIPS)
CPU0 attaching sched-domain:
 domain 0: span 0,32 level SIBLING
  groups: 0 (cpu_power = 588) 32 (cpu_power = 588)
  domain 1: span 0-7,32-39 level MC
   groups: 0,32 (cpu_power = 1176) 1,33 (cpu_power = 1178) 2,34 (cpu_power = 1178) 3,35 (cpu_power = 1178) 4,36 (cpu_power = 1178) 5,37 (cpu_power = 1178) 6,38 (cpu_power = 1178) 7,39 (cpu_power = 1176)
   domain 2: span 0-23,32-55 level NUMA
    groups: 0-7,32-39 (cpu_power = 9420) 8-15,40-47 (cpu_power = 9408) 16-23,48-55 (cpu_power = 9408)
    domain 3: span 0-63 level NUMA
     groups: 0-23,32-55 (cpu_power = 28236) 8-31,40-63 (cpu_power = 28224)
CPU1 attaching sched-domain:
 domain 0: span 1,33 level SIBLING
  groups: 1 (cpu_power = 589) 33 (cpu_power = 589)
  domain 1: span 0-7,32-39 level MC
   groups: 1,33 (cpu_power = 1178) 2,34 (cpu_power = 1178) 3,35 (cpu_power = 1178) 4,36 (cpu_power = 1178) 5,37 (cpu_power = 1178) 6,38 (cpu_power = 1178) 7,39 (cpu_power = 1176) 0,32 (cpu_power = 1176)
   domain 2: span 0-23,32-55 level NUMA
    groups: 0-7,32-39 (cpu_power = 9420) 8-15,40-47 (cpu_power = 9408) 16-23,48-55 (cpu_power = 9408)
    domain 3: span 0-63 level NUMA
     groups: 0-23,32-55 (cpu_power = 28236) 8-31,40-63 (cpu_power = 28224)
CPU2 attaching sched-domain:
 domain 0: span 2,34 level SIBLING
  groups: 2 (cpu_power = 589) 34 (cpu_power = 589)
  domain 1: span 0-7,32-39 level MC
   groups: 2,34 (cpu_power = 1178) 3,35 (cpu_power = 1178) 4,36 (cpu_power = 1178) 5,37 (cpu_power = 1178) 6,38 (cpu_power = 1178) 7,39 (cpu_power = 1176) 0,32 (cpu_power = 1176) 1,33 (cpu_power = 1178)
   domain 2: span 0-23,32-55 level NUMA
    groups: 0-7,32-39 (cpu_power = 9420) 8-15,40-47 (cpu_power = 9408) 16-23,48-55 (cpu_power = 9408)
    domain 3: span 0-63 level NUMA
     groups: 0-23,32-55 (cpu_power = 28236) 8-31,40-63 (cpu_power = 28224)
CPU3 attaching sched-domain:
 domain 0: span 3,35 level SIBLING
  groups: 3 (cpu_power = 589) 35 (cpu_power = 589)
  domain 1: span 0-7,32-39 level MC
   groups: 3,35 (cpu_power = 1178) 4,36 (cpu_power = 1178) 5,37 (cpu_power = 1178) 6,38 (cpu_power = 1178) 7,39 (cpu_power = 1176) 0,32 (cpu_power = 1176) 1,33 (cpu_power = 1178) 2,34 (cpu_power = 1178)
   domain 2: span 0-23,32-55 level NUMA
    groups: 0-7,32-39 (cpu_power = 9420) 8-15,40-47 (cpu_power = 9408) 16-23,48-55 (cpu_power = 9408)
    domain 3: span 0-63 level NUMA
     groups: 0-23,32-55 (cpu_power = 28236) 8-31,40-63 (cpu_power = 28224)
CPU4 attaching sched-domain:
 domain 0: span 4,36 level SIBLING
  groups: 4 (cpu_power = 589) 36 (cpu_power = 589)
  domain 1: span 0-7,32-39 level MC
   groups: 4,36 (cpu_power = 1178) 5,37 (cpu_power = 1178) 6,38 (cpu_power = 1178) 7,39 (cpu_power = 1176) 0,32 (cpu_power = 1176) 1,33 (cpu_power = 1178) 2,34 (cpu_power = 1178) 3,35 (cpu_power = 1178)
   domain 2: span 0-23,32-55 level NUMA
    groups: 0-7,32-39 (cpu_power = 9420) 8-15,40-47 (cpu_power = 9408) 16-23,48-55 (cpu_power = 9408)
    domain 3: span 0-63 level NUMA
     groups: 0-23,32-55 (cpu_power = 28236) 8-31,40-63 (cpu_power = 28224)
CPU5 attaching sched-domain:
 domain 0: span 5,37 level SIBLING
  groups: 5 (cpu_power = 589) 37 (cpu_power = 589)
  domain 1: span 0-7,32-39 level MC
   groups: 5,37 (cpu_power = 1178) 6,38 (cpu_power = 1178) 7,39 (cpu_power = 1176) 0,32 (cpu_power = 1176) 1,33 (cpu_power = 1178) 

[PATCH - v2] LEDS: tca6507 - fix up some comments.

2013-11-12 Thread NeilBrown

In particular fix the capitalisation of GPIO and LED and
correct TCA6507_MAKE_CPIO, but also rewrite the comment about
platform-data to include reference to devicetree.

Also re-wrap comments to fit 80 columns.

Reported-by: Bryan Wu 
Signed-off-by: NeilBrown 

diff --git a/drivers/leds/leds-tca6507.c b/drivers/leds/leds-tca6507.c
index 93a2b1759054..503df834c690 100644
--- a/drivers/leds/leds-tca6507.c
+++ b/drivers/leds/leds-tca6507.c
@@ -4,77 +4,87 @@
  * The TCA6507 is a programmable LED controller that can drive 7
  * separate lines either by holding them low, or by pulsing them
  * with modulated width.
- * The modulation can be varied in a simple pattern to produce a blink or
- * double-blink.
+ * The modulation can be varied in a simple pattern to produce a
+ * blink or double-blink.
  *
- * This driver can configure each line either as a 'GPIO' which is out-only
- * (no pull-up) or as an LED with variable brightness and hardware-assisted
- * blinking.
+ * This driver can configure each line either as a 'GPIO' which is
+ * out-only (pull-up resistor required) or as an LED with variable
+ * brightness and hardware-assisted blinking.
  *
- * Apart from OFF and ON there are three programmable brightness levels which
- * can be programmed from 0 to 15 and indicate how many 500usec intervals in
- * each 8msec that the led is 'on'.  The levels are named MASTER, BANK0 and
- * BANK1.
+ * Apart from OFF and ON there are three programmable brightness
+ * levels which can be programmed from 0 to 15 and indicate how many
+ * 500usec intervals in each 8msec that the led is 'on'.  The levels
+ * are named MASTER, BANK0 and BANK1.
  *
- * There are two different blink rates that can be programmed, each with
- * separate time for rise, on, fall, off and second-off.  Thus if 3 or more
- * different non-trivial rates are required, software must be used for the extra
- * rates. The two different blink rates must align with the two levels BANK0 and
- * BANK1.
- * This driver does not support double-blink so 'second-off' always matches
- * 'off'.
+ * There are two different blink rates that can be programmed, each
+ * with separate time for rise, on, fall, off and second-off.  Thus if
+ * 3 or more different non-trivial rates are required, software must
+ * be used for the extra rates. The two different blink rates must
+ * align with the two levels BANK0 and BANK1.  This driver does not
+ * support double-blink so 'second-off' always matches 'off'.
  *
- * Only 16 different times can be programmed in a roughly logarithmic scale from
- * 64ms to 16320ms.  To be precise the possible times are:
+ * Only 16 different times can be programmed in a roughly logarithmic
+ * scale from 64ms to 16320ms.  To be precise the possible times are:
  *0, 64, 128, 192, 256, 384, 512, 768,
  *1024, 1536, 2048, 3072, 4096, 5760, 8128, 16320
  *
- * Times that cannot be closely matched with these must be
- * handled in software.  This driver allows 12.5% error in matching.
+ * Times that cannot be closely matched with these must be handled in
+ * software.  This driver allows 12.5% error in matching.
  *
- * This driver does not allow rise/fall rates to be set explicitly.  When trying
- * to match a given 'on' or 'off' period, an appropriate pair of 'change' and
- * 'hold' times are chosen to get a close match.  If the target delay is even,
- * the 'change' number will be the smaller; if odd, the 'hold' number will be
- * the smaller.
-
- * Choosing pairs of delays with 12.5% errors allows us to match delays in the
- * ranges: 56-72, 112-144, 168-216, 224-27504, 28560-36720.
- * 26% of the achievable sums can be matched by multiple pairings. For example
- * 1536 == 1536+0, 1024+512, or 768+768.  This driver will always choose the
- * pairing with the least maximum - 768+768 in this case.  Other pairings are
- * not available.
+ * This driver does not allow rise/fall rates to be set explicitly.
+ * When trying to match a given 'on' or 'off' period, an appropriate
+ * pair of 'change' and 'hold' times are chosen to get a close match.
+ * If the target delay is even, the 'change' number will be the
+ * smaller; if odd, the 'hold' number will be the smaller.
+
+ * Choosing pairs of delays with 12.5% errors allows us to match
+ * delays in the ranges: 56-72, 112-144, 168-216, 224-27504,
+ * 28560-36720.
+ * 26% of the achievable sums can be matched by multiple pairings.
+ * For example 1536 == 1536+0, 1024+512, or 768+768.
+ * This driver will always choose the pairing with the least
+ * maximum - 768+768 in this case.  Other pairings are not available.
  *
- * Access to the 3 levels and 2 blinks are on a first-come, first-served basis.
- * Access can be shared by multiple leds if they have the same level and
- * either same blink rates, or some don't blink.
- * When a led changes, it relinquishes access and tries again, so it might
- * lose access to hardware blink.
- * If a blink engine cannot be allocated, software blink is 
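
The time-matching rule described in the comment above (16 programmable
times, 12.5% allowed error) can be sketched in plain user-space C. This is
only an illustration, not the driver's code: the function name
match_time_ms() and the choice to measure the error against the table
entry are assumptions, picked because they reproduce the quoted ranges
(56-72 around 64, 112-144 around 128, 168-216 around 192). Pairing of
'change' and 'hold' times is not modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* The 16 programmable times, in milliseconds, as listed in the comment. */
static const int time_table_ms[16] = {
	0, 64, 128, 192, 256, 384, 512, 768,
	1024, 1536, 2048, 3072, 4096, 5760, 8128, 16320
};

/*
 * Hypothetical helper: return the table entry closest to msec whose
 * error is at most 12.5% (1/8) of that entry, or -1 if no entry is
 * close enough and software would have to handle the delay.
 */
int match_time_ms(int msec)
{
	int best = -1;
	int best_err = 0;
	int i;

	assert(msec >= 0);

	for (i = 0; i < 16; i++) {
		int t = time_table_ms[i];
		int err = abs(msec - t);

		if (err > t / 8)
			continue;	/* outside the 12.5% window */
		if (best < 0 || err < best_err) {
			best = t;
			best_err = err;
		}
	}
	return best;
}
```

With this rule a request for 60ms maps to 64ms, while 100ms falls in the
gap between 72 and 112 and is left to software.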

Re: [PATCH] clk: add generic driver for fixed rate clock

2013-11-12 Thread Stefan Kristiansson
Ping and adding Mike Turquette to CC

On Sun, Sep 01, 2013 at 07:40:20AM +0300, Stefan Kristiansson wrote:
> This adds a simple driver with the only purpose to initialise
> the fixed rate clock.
> This is useful for systems that do not wish to use separate init
> code for the fixed rate clock init, but rather only rely on a
> device tree description of it.
> 
> Signed-off-by: Stefan Kristiansson 
> ---
>  drivers/clk/Kconfig |  8 ++
>  drivers/clk/Makefile|  1 +
>  drivers/clk/clk-generic-fixed.c | 59 +
>  3 files changed, 68 insertions(+)
>  create mode 100644 drivers/clk/clk-generic-fixed.c
> 
> diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
> index 51380d6..7c8ea78 100644
> --- a/drivers/clk/Kconfig
> +++ b/drivers/clk/Kconfig
> @@ -87,6 +87,14 @@ config CLK_PPC_CORENET
> This adds the clock driver support for Freescale PowerPC corenet
> platforms using common clock framework.
>  
> +config COMMON_CLK_GENERIC_FIXED
> + tristate "Generic fixed rate clock driver"
> + depends on OF
> + ---help---
> +   Driver for systems that do not want to register their fixed rate
> +   clocks through init code, but rather through the device tree
> +   description.
> +
>  endmenu
>  
>  source "drivers/clk/mvebu/Kconfig"
> diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile
> index 4038c2b..2d46647 100644
> --- a/drivers/clk/Makefile
> +++ b/drivers/clk/Makefile
> @@ -8,6 +8,7 @@ obj-$(CONFIG_COMMON_CLK)  += clk-fixed-rate.o
>  obj-$(CONFIG_COMMON_CLK) += clk-gate.o
>  obj-$(CONFIG_COMMON_CLK) += clk-mux.o
>  obj-$(CONFIG_COMMON_CLK) += clk-composite.o
> +obj-$(CONFIG_COMMON_CLK_GENERIC_FIXED) += clk-generic-fixed.o
>  
>  # SoCs specific
>  obj-$(CONFIG_ARCH_BCM2835)   += clk-bcm2835.o
> diff --git a/drivers/clk/clk-generic-fixed.c b/drivers/clk/clk-generic-fixed.c
> new file mode 100644
> index 000..85df8a8
> --- /dev/null
> +++ b/drivers/clk/clk-generic-fixed.c
> @@ -0,0 +1,59 @@
> +/*
> + * Copyright 2013 Stefan Kristiansson 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * Generic driver for fixed rate clock
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static const struct of_device_id generic_fixed_clk_match[] __initconst = {
> + { .compatible = "fixed-clock",},
> + {}
> +};
> +
> +static int generic_fixed_clk_probe(struct platform_device *pdev)
> +{
> + of_fixed_clk_setup(pdev->dev.of_node);
> +
> + return 0;
> +}
> +
> +static int generic_fixed_clk_remove(struct platform_device *pdev)
> +{
> + of_clk_del_provider(pdev->dev.of_node);
> +
> + return 0;
> +}
> +
> +static struct platform_driver generic_fixed_clk_driver = {
> + .driver = {
> + .name = "generic-fixed-clk",
> + .owner = THIS_MODULE,
> + .of_match_table = generic_fixed_clk_match,
> + },
> + .probe  = generic_fixed_clk_probe,
> + .remove = generic_fixed_clk_remove,
> +};
> +
> +static int __init generic_fixed_clk_init(void)
> +{
> + return platform_driver_register(&generic_fixed_clk_driver);
> +}
> +subsys_initcall(generic_fixed_clk_init);
> +
> +static void __exit generic_fixed_exit(void)
> +{
> + platform_driver_unregister(&generic_fixed_clk_driver);
> +}
> +module_exit(generic_fixed_exit);
> +
> +MODULE_AUTHOR("Stefan Kristiansson ");
> +MODULE_DESCRIPTION("Generic driver for fixed rate clock");
> +MODULE_LICENSE("GPL v2");
> -- 
> 1.8.1.2
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH] cpufreq: cpufreq-cpu0: do not allow transitions with regulators suspended

2013-11-12 Thread Viresh Kumar
On 12 November 2013 20:41, Nishanth Menon  wrote:
> On 11/12/2013 12:03 AM, Viresh Kumar wrote:

>> Yes the problem looks real but there are issues with this patch.
>> - It doesn't solve your problem completely, because you returned -EBUSY,
>> your suspend operation failed and we resumed immediately.
>
> Seems like there was an error handling miss somewhere - for some
> reason, it did suspend properly.

Yeah, it's missing in cpufreq_cpu_callback()..

>> But I think the problem can/should be solved some other way.. Looking closely,
>> we got to the problem because we called
>>
>> __cpufreq_governor(policy, CPUFREQ_GOV_START)
>>
>> at the first place. This happened because the policy structure had more than
>> one cpu to take care of and after stopping governor for CPU1 it has to start it
>> again for CPU0... But this is really not required as anyway we are going to
>> suspend.
>>
>> Can you try attached patch? I will then repost it formally...
>
> I tried a equivalent of this for v3.12 tag:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 04548f7..9ec243c 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1186,7 +1186,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
> return -EINVAL;
> }
>
> -   if (cpufreq_driver->target) {
> +   if (cpufreq_driver->target && (!frozen || policy->governor_enabled)) {
> ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> if (ret) {
> pr_err("%s: Failed to stop governor\n", __func__);
> @@ -1252,7 +1252,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>
> /* If cpu is last user of policy, free policy */
> if (cpus == 1) {
> -   if (cpufreq_driver->target) {
> +   if (cpufreq_driver->target && !frozen) {
> ret = __cpufreq_governor(policy,
> CPUFREQ_GOV_POLICY_EXIT);

This is not an equivalent of my patch :)

@@ -1282,7 +1282,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
if (!frozen)
cpufreq_policy_free(policy);
} else {
-   if (has_target()) {
+   if (has_target() && !frozen) {
if ((ret = __cpufreq_governor(policy,
CPUFREQ_GOV_START)) ||
(ret =
__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS))) {


> And I see http://pastebin.mozilla.org/3528478
>
> with a WARN patch for generating call stack.

That's why you got it.. I was really surprised to see that it just didn't
work for you, and believe me, it took me a lot of time to understand why it
wasn't working for you, because I simply trusted your equivalent version and
didn't look at it closely :)

> Finally squelched warnings with a net diff (v3.12) of
> http://pastebin.mozilla.org/3546062

we don't need that stuff in cpufreq_add_policy_cpu()

> However, ondemand is no longer functioning on resume (governor needs a
> start after being unfrozen.. and obviously by avoiding that entirely
> in frozen case.. not sure if I missed any other)..

It would be, try the right code once. :)


Re: [PATCH 14/17] dt: Consolidate __dtb_start declarations in <linux/of_fdt.h>

2013-11-12 Thread Vineet Gupta
On 11/13/2013 01:12 AM, Geert Uytterhoeven wrote:
> The different architectures used their own (and different) declarations:
> 
> extern struct boot_param_header __dtb_start;
> extern u32 __dtb_start[];
> extern char __dtb_start[];
> 
> Consolidate them using the first variant in <linux/of_fdt.h>.
> This requires adding a few "address of" operators on architectures where
> __dtb_start was an array before.
> 
> Signed-off-by: Geert Uytterhoeven 
> Cc: Vineet Gupta 
> Cc: James Hogan 
> Cc: Ralf Baechle 
> Cc: Jonas Bonn 
> CC: Chris Zankel 
> Cc: Rob Herring 
> Cc: devicet...@vger.kernel.org
> ---
>  arch/arc/include/asm/sections.h |1 -
>  arch/arc/kernel/setup.c |2 +-
>  arch/metag/kernel/setup.c   |6 +-
>  arch/mips/include/asm/mips-boards/generic.h |4 
>  arch/mips/lantiq/prom.h |2 --
>  arch/mips/netlogic/xlp/dt.c |4 ++--
>  arch/mips/ralink/of.c   |2 --
>  arch/openrisc/kernel/setup.c|2 +-
>  arch/openrisc/kernel/vmlinux.h  |2 --
>  arch/xtensa/kernel/setup.c  |3 +--
>  include/linux/of_fdt.h  |3 +++
>  11 files changed, 9 insertions(+), 22 deletions(-)


Acked-by: Vineet Gupta 

Thx,
-Vineet
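
For readers wondering why the consolidation needs extra "address of"
operators: an array name decays to a pointer by itself, while a struct
object does not. A minimal user-space sketch follows; struct fake_header
and both blob_* symbols are made-up stand-ins for illustration, not the
kernel's boot_param_header or the linker-provided __dtb_start.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a device tree blob header. */
struct fake_header {
	uint32_t magic;
	uint32_t totalsize;
};

/* Variant 1: the blob symbol declared as an array (old style on some archs). */
uint32_t blob_as_array[2] = { 0xd00dfeed, 8 };

/* Variant 2: the blob symbol declared as a struct object (consolidated style). */
struct fake_header blob_as_struct = { 0xd00dfeed, 8 };

/* With an array, the bare name already yields the start address. */
const void *addr_of_array_blob(void)
{
	return blob_as_array;
}

/* With a struct object, the caller has to take the address explicitly. */
const void *addr_of_struct_blob(void)
{
	return &blob_as_struct;
}

/* Read the first 32-bit word of a blob, whichever way it was declared. */
uint32_t read_magic(const void *blob)
{
	uint32_t magic;

	assert(blob != NULL);
	memcpy(&magic, blob, sizeof(magic));
	return magic;
}
```

Both accessors return the start of the data; only the spelling at the
call site differs, which is the kind of change the patch makes in the
architectures that used an array declaration.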


Re: [PATCH] 3.12.0-rt1: net: iwlwifi request only a threaded handler for interrupts

2013-11-12 Thread Clark Williams
On Tue, 12 Nov 2013 13:16:59 -0600
Clark Williams  wrote:

> Sebastian,
> 
> I needed this for my laptop on 3.12, so tweaked it to apply properly. 
> 
> Clark
>

So of course I sent the wrong patch:

commit 49d487614d56bc10969dd0dcbce709825fc06d0e
Author: Clark Williams 
Date:   Tue Nov 12 12:17:07 2013 -0600

net: iwlwifi: request only a threaded handler for interrupts

On RT the trans_pcie->irq_lock lock is converted into a sleeping lock
and can't be used in primary irq handler. The lock is used in multiple
places which means turning it into a raw lock could increase the
latency of the system.
For now both handlers are moved into the thread.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Clark Williams 

diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c b/drivers/net/wireless/iwlwifi/pcie/trans.c
index c3f904d..60df2c1 100644
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@ -1375,6 +1375,20 @@ static const struct iwl_trans_ops trans_ops_pcie = {
.set_bits_mask = iwl_trans_pcie_set_bits_mask,
 };
 
+#ifdef CONFIG_PREEMPT_RT_BASE
+static irqreturn_t iwl_rt_irq_handler(int irq, void *dev_id)
+{
+   irqreturn_t ret;
+
+   local_bh_disable();
+   ret = iwl_pcie_isr_ict(irq, dev_id);
+   local_bh_enable();
+   if (ret == IRQ_WAKE_THREAD)
+   ret = iwl_pcie_irq_handler(irq, dev_id);
+   return ret;
+}
+#endif
+
 struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev,
   const struct pci_device_id *ent,
   const struct iwl_cfg *cfg)
@@ -1493,9 +1507,15 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev,
if (iwl_pcie_alloc_ict(trans))
goto out_free_cmd_pool;
 
+#ifdef CONFIG_PREEMPT_RT_BASE
+   err = request_threaded_irq(pdev->irq, NULL, iwl_rt_irq_handler,
+  IRQF_SHARED | IRQF_ONESHOT, 
+  DRV_NAME, trans);
+#else
err = request_threaded_irq(pdev->irq, iwl_pcie_isr_ict,
   iwl_pcie_irq_handler,
   IRQF_SHARED, DRV_NAME, trans);
+#endif
if (err) {
IWL_ERR(trans, "Error allocating IRQ %d\n", pdev->irq);
goto out_free_ict;
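
The control flow of the combined handler in the patch above (run the
primary handler, and only if it asks for the thread, run the threaded
handler in the same context) can be modelled in user space. The sketch
below is an assumption-laden illustration: every sim_* name is
hypothetical, and it does not model local_bh_disable(), locking or real
interrupts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's irqreturn_t values. */
enum sim_irqreturn {
	SIM_IRQ_NONE,
	SIM_IRQ_HANDLED,
	SIM_IRQ_WAKE_THREAD
};

/* Hypothetical device state: events waiting and events processed. */
struct sim_dev {
	int pending;
	int handled;
};

/* Plays the role of the primary handler: only decides whether work exists. */
static enum sim_irqreturn sim_primary(struct sim_dev *dev)
{
	return dev->pending ? SIM_IRQ_WAKE_THREAD : SIM_IRQ_NONE;
}

/* Plays the role of the threaded handler: does the actual work. */
static enum sim_irqreturn sim_thread(struct sim_dev *dev)
{
	dev->handled += dev->pending;
	dev->pending = 0;
	return SIM_IRQ_HANDLED;
}

/* Combined handler: both steps run back to back in one (thread) context. */
enum sim_irqreturn sim_rt_handler(struct sim_dev *dev)
{
	enum sim_irqreturn ret;

	assert(dev != NULL);

	ret = sim_primary(dev);
	if (ret == SIM_IRQ_WAKE_THREAD)
		ret = sim_thread(dev);
	return ret;
}
```

The point of the arrangement is that nothing runs in hard interrupt
context any more, so a lock that sleeps on RT can be taken by both steps.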


signature.asc
Description: PGP signature


Re: [PATCH 00/17] <linux/of_fdt.h> related cleanups

2013-11-12 Thread Vineet Gupta
On 11/13/2013 01:12 AM, Geert Uytterhoeven wrote:
> Most of this has been compile-tested. Notable exceptions are the changes
> to arc, c6x, and score code, due to lack of cross-compilers.

Mainline buildroot will enable you to build a cross compiler for ARC. It is not
relocatable (pending issue in Buildroot itself) but should suffice for your needs.

Then a defconfig kernel build will be good enough (allmodconfig tends to select
an uncommon configuration, e.g. big endian and -Os, and since the default tools
are not multilib'ed, it will fail to link).

To avoid failures in all*config builds I probably need to invert a few config
items on ARC (CONFIG_CPU_BIG_ENDIAN => CONFIG_CPU_LITTLE_ENDIAN).


Re: [PATCH v2] uprobes: Add uprobe_task->dup_work/dup_addr

2013-11-12 Thread Srikar Dronamraju
* Oleg Nesterov  [2013-11-12 20:20:38]:

> On 11/12, Srikar Dronamraju wrote:
> >
> > Okay, moving to arch_uprobe_task is fine. I probably got confused by
> > "First of all it is not really needed,"
> 
> OK, this doesn't look good, I agree.
> 
> Please see v2 below, I tried to improve the changelog.
> 
> > > OK. How about dup_xol_work/dup_xol_vaddr ?
> >
> > Yes fine with me.
> 
> Done.
> 
> I added your ack optimistically, please let me know if you think I
> should change something else.

Looks nice to me.

-- 
Thanks and Regards
Srikar Dronamraju



Re: ARM: dts: add board dts file for EXYNOS4412 based TINY4412 board

2013-11-12 Thread kasim ling
On Wed, Nov 13, 2013 at 3:13 AM, Rob Herring  wrote:
> On 11/12/2013 09:02 AM, Alex Ling wrote:
>> Add a minimal board dts file for EXYNOS4412 based FriendlyARM's
>> TINY4412 board. This patch including adds the node to support
>> peripherals like UART, SD card on SDMMC2 port, and this patch
>> adds GPIO connected LEDS and configure its properties like
>> following:
>> LED1: use 'heartbeat' trigger, blinking while the board is running.
>> LED4: use 'mmc0' trigger, on when mmc0 is accessing.
>> LED2 and LED3 can be controlled from userspace.
>
> Please send patches with [PATCH] prefix.
Well noted. Thanks.
>
>> Signed-off-by: Alex Ling 
>> ---
>>  arch/arm/boot/dts/Makefile|1 +
>>  arch/arm/boot/dts/exynos4412-tiny4412.dts |   89 +
>>  2 files changed, 90 insertions(+)
>>  create mode 100644 arch/arm/boot/dts/exynos4412-tiny4412.dts
>>
>> diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
>> index 802720e..91671a2 100644
>> --- a/arch/arm/boot/dts/Makefile
>> +++ b/arch/arm/boot/dts/Makefile
>> @@ -59,6 +59,7 @@ dtb-$(CONFIG_ARCH_EXYNOS) += exynos4210-origen.dtb \
>>   exynos4412-odroidx.dtb \
>>   exynos4412-origen.dtb \
>>   exynos4412-smdk4412.dtb \
>> + exynos4412-tiny4412.dtb \
>>   exynos4412-trats2.dtb \
>>   exynos5250-arndale.dtb \
>>   exynos5250-smdk5250.dtb \
>> diff --git a/arch/arm/boot/dts/exynos4412-tiny4412.dts b/arch/arm/boot/dts/exynos4412-tiny4412.dts
>> new file mode 100644
>> index 000..78ace14
>> --- /dev/null
>> +++ b/arch/arm/boot/dts/exynos4412-tiny4412.dts
>> @@ -0,0 +1,89 @@
>> +/*
>> + * FriendlyARM's Exynos4412 based TINY4412 board device tree source
>> + *
>> + * Copyright (c) 2013 Alex Ling 
>> + *
>> + * Device tree source file for FriendlyARM's TINY4412 board which is based on
>> + * Samsung's Exynos4412 SoC.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> +*/
>> +
>> +/dts-v1/;
>> +#include "exynos4412.dtsi"
>> +
>> +/ {
>> + model = "FriendlyARM TINY4412 board based on Exynos4412";
>> + compatible = "friendlyarm,tiny4412", "samsung,exynos4412";
>
> The compatible string needs to be documented.
Could you please advise where this should be documented? I'm not
sure whether "Documentation/devicetree/bindings/arm/samsung-boards.txt" is
the proper place.
>
> Rob
>
>> +
>> + memory {
>> + reg = <0x4000 0x4000>;
>> + };
>> +
>> + leds {
>> + compatible = "gpio-leds";
>> + led1 {
>> + label = "led1:heart";
>> + gpios = < 0 1>;
>> + default-state = "off";
>> + linux,default-trigger = "heartbeat";
>> + };
>> + led2 {
>> + label = "led2";
>> + gpios = < 1 1>;
>> + default-state = "off";
>> + };
>> + led3 {
>> + label = "led3";
>> + gpios = < 2 1>;
>> + default-state = "off";
>> + };
>> + led4 {
>> + label = "led4:mmc0";
>> + gpios = < 3 1>;
>> + default-state = "off";
>> + linux,default-trigger = "mmc0";
>> + };
>> + };
>> +
>> + rtc@1007 {
>> + status = "okay";
>> + };
>> +
>> + sdhci@1253 {
>> + bus-width = <4>;
>> + pinctrl-0 = <_clk _cmd _cd _bus4>;
>> + pinctrl-names = "default";
>> + status = "okay";
>> + };
>> +
>> + serial@1380 {
>> + status = "okay";
>> + };
>> +
>> + serial@1381 {
>> + status = "okay";
>> + };
>> +
>> + serial@1382 {
>> + status = "okay";
>> + };
>> +
>> + serial@1383 {
>> + status = "okay";
>> + };
>> +
>> + fixed-rate-clocks {
>> + xxti {
>> + compatible = "samsung,clock-xxti";
>> + clock-frequency = <0>;
>> + };
>> +
>> + xusbxti {
>> + compatible = "samsung,clock-xusbxti";
>> + clock-frequency = <2400>;
>> + };
>> + };
>> +};
>>
>

BR,
Alex


linux-next: Tree for Nov 13

2013-11-12 Thread Stephen Rothwell
Hi all,

Please do *not* add any v3.14 material to linux-next until after
v3.13-rc1 is released.

Changes since 20131112:

The aio-direct tree gained a conflict against the btrfs tree.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 210 trees (counting Linus' and 29 trees of patches
pending for Linus' tree), more are welcome (even if they are currently
empty). Thanks to those who have contributed, and to those who haven't,
please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (10d0c9705e80 Merge tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux)
Merging fixes/master (fa8218def1b1 Merge tag 'regmap-v3.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (737d5b980be8 ARC: [plat-arcfpga] defconfig update)
Merging arm-current/fixes (6ecf830e5029 ARM: 7880/1: Clear the IT state independent of the Thumb-2 mode)
CONFLICT (content): Merge conflict in arch/arm/mach-tegra/Kconfig
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (8b5ede69d24d powerpc/irq: Don't switch to irq stack from softirq stack)
Merging sparc/master (6d15ee492809 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm)
Merging net/master (be408cd3e1fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging ipsec/master (be408cd3e1fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging sound-current/for-linus (468ac413045a ALSA: hda - Check keep_eapd_on before inv_eapd)
Merging pci-current/for-linus (67d470e0e171 Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero")
Merging wireless/master (8e3ffa471091 prism54: set netdev type to "wlan")
CONFLICT (content): Merge conflict in include/linux/netdevice.h
CONFLICT (modify/delete): arch/h8300/include/uapi/asm/socket.h deleted in HEAD and modified in wireless/master. Version wireless/master of arch/h8300/include/uapi/asm/socket.h left in tree.
$ git rm -f arch/h8300/include/uapi/asm/socket.h
Merging driver-core.current/driver-core-linus (31d141e3a666 Linux 3.12-rc6)
Merging tty.current/tty-linus (6e757ad2c92c tty/serial: at91: fix uart/usart selection for older products)
Merging usb.current/usb-linus (e1466ad5b1ae USB: serial: ftdi_sio: add id for Z3X Box device)
Merging staging.current/staging-linus (31d141e3a666 Linux 3.12-rc6)
Merging char-misc.current/char-misc-linus (31d141e3a666 Linux 3.12-rc6)
Merging input-current/for-linus (5beea882e641 Input: ALPS - add support for model found on Dell XT2)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (f262f0f5cad0 crypto: s390 - Fix aes-cbc IV corruption)
CONFLICT (content): Merge conflict in drivers/crypto/caam/jr.c
Merging ide/master (64110c16e012 ide: sgiioc4: Staticize ioc4_ide_attach_one())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions)
Merging devicetree-curr

[PATCH] arch: um: kernel: skas: mmu: remove pmd_free() and pud_free() for failure processing in init_stub_pte()

2013-11-12 Thread Chen Gang
Unfortunately, p?d_alloc() and p?d_free() are not a matched pair. Once
p?d_alloc() succeeds, the new table may already be in use, so on a later
failure we must not free it here; exit_mmap() or do_munmap() will release it.

According to "Documentation/vm/locking", 'mm->page_table_lock' protects
use of the vma list, so it is not needed once the related vmas have been
detached or unmapped from the vma list.

The related work flow:

  exit_mmap() ->
    unmap_vmas(); /* so not need mm->page_table_lock */
    free_pgtables();

  do_munmap()->
    detach_vmas_to_be_unmapped(); /* so not need mm->page_table_lock */
    unmap_region() ->
      free_pgtables();

  free_pgtables() ->
    free_pgd_range() ->
      free_pud_range() ->
        free_pmd_range() ->
          free_pte_range() ->
            pmd_clear();
            pte_free_tlb();
          pud_clear();
          pmd_free_tlb();
        pgd_clear();
        pud_free_tlb();


Signed-off-by: Chen Gang 
---
 arch/um/kernel/skas/mmu.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/um/kernel/skas/mmu.c b/arch/um/kernel/skas/mmu.c
index 007d550..3fd1951 100644
--- a/arch/um/kernel/skas/mmu.c
+++ b/arch/um/kernel/skas/mmu.c
@@ -40,9 +40,9 @@ static int init_stub_pte(struct mm_struct *mm, unsigned long proc,
return 0;
 
  out_pte:
-   pmd_free(mm, pmd);
+   /* used by mm->pgd->pud, will free in do_munmap() or exit_mmap() */
  out_pmd:
-   pud_free(mm, pud);
+   /* used by mm->pgd, will free in do_munmap() or exit_mmap() */
  out:
return -ENOMEM;
 }
-- 
1.7.7.6
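
The ownership argument in the changelog can be illustrated outside the
kernel. In the sketch below an intermediate table is linked into its
parent as soon as it is allocated; if a later allocation fails, the error
path leaves it in place because the parent still points at it, and a
single teardown routine frees everything. All names (struct root, map_one,
teardown, the fail_leaf switch) are hypothetical and only stand in for
the pgd/pud/pmd levels and for exit_mmap()/do_munmap().

```c
#include <assert.h>
#include <stdlib.h>

struct leaf { int value; };
struct mid  { struct leaf *leaf; };
struct root { struct mid *mid; };

/*
 * Populate root -> mid -> leaf. Returns 0 on success, -1 on failure.
 * fail_leaf simulates an allocation failure at the lowest level.
 */
int map_one(struct root *r, int fail_leaf)
{
	assert(r != NULL);

	if (!r->mid) {
		struct mid *m = calloc(1, sizeof(*m));

		if (!m)
			return -1;
		r->mid = m;	/* published: the parent now references it */
	}

	if (!r->mid->leaf) {
		struct leaf *l = fail_leaf ? NULL : calloc(1, sizeof(*l));

		if (!l)
			return -1;	/* do NOT free r->mid here, root still uses it */
		r->mid->leaf = l;
	}
	return 0;
}

/* Single owner of cleanup: walks the links and frees what is there. */
void teardown(struct root *r)
{
	assert(r != NULL);

	if (r->mid) {
		free(r->mid->leaf);
		free(r->mid);
		r->mid = NULL;
	}
}
```

Freeing the intermediate table in the error path would leave the parent
with a dangling pointer and lead to a double free at teardown, which is
the situation the patch avoids.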


Re: [PATCH 10/12] mtd: nand: davinci: don't set timings if AEMIF is used

2013-11-12 Thread Sekhar Nori
On Monday 11 November 2013 10:40 PM, Khoronzhuk, Ivan wrote:
> If Davinci AEMIF is used we don't need to set timings and bus width.
> It is done by AEMIF driver (drivers/memory/davinci-aemif.c).
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
>  drivers/mtd/nand/davinci_nand.c |   22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
> index 4705214..879e915 100644
> --- a/drivers/mtd/nand/davinci_nand.c
> +++ b/drivers/mtd/nand/davinci_nand.c
> @@ -742,27 +742,35 @@ static int __init nand_davinci_probe(struct platform_device *pdev)
> goto err_clk_enable;
> }
> 
> +#if !IS_ENABLED(CONFIG_TI_DAVINCI_AEMIF)

This is a hack! Just because AEMIF driver is enabled, it does not
guarantee that the timings have been setup by it. Instead of configuring
timings in two drivers, why not just convert everyone over to use the
new driver. Don't worry about breaking old platforms, I will help test
and ack them.

Thanks,
Sekhar



perf tip: fails to convert comm

2013-11-12 Thread David Ahern

Hi Namhyung and Frederic:

If you recall I mentioned noting a problem with the callchain series 
showing comm's. Well, it fails on acme's perf/core. git bisect points to:


$ git bisect bad
4dfced359fbc719a35527416f1b4b3999647f68b is the first bad commit
commit 4dfced359fbc719a35527416f1b4b3999647f68b
Author: Namhyung Kim 
Date:   Fri Sep 13 16:28:57 2013 +0900

perf tools: Get current comm instead of last one

At insert time, a hist entry should reference comm at the time
otherwise it'll get the last comm anyway.


How to re-create:

Start point is tools/perf directory for 3.12 (Linus tree):
  $ perf sched record -o /tmp/perf.data -g -- make -j 16
  $ perf script -i /tmp/perf.data > /tmp/1

cd to Arnaldo's tree, make perf and use it to create /tmp/2:
  $ perf script -i /tmp/perf.data > /tmp/2
  $ diff -U3 /tmp/1 /tmp/2 | less

You'll see a number of comm's showing as : instead of make, etc.

David


Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread H. Peter Anvin
On 11/12/2013 08:37 PM, Greg Price wrote:
> 
> I'm thinking only of boot-time blocking.  The idea is that once
> /dev/urandom is seeded with, say, 128 bits of min-entropy in the
> absolute, information-theoretic sense, it can produce an infinite
> supply (or something like 2^128 bits, which amounts to the same thing)
> of bits that can't be distinguished from random, short of breaking or
> brute-forcing the crypto.  So once it's seeded, it's good forever.
> 

And, pray tell, how will you know that you have done that?

Even the best entropy estimation algorithms are nothing but estimations,
and min-entropy is the hardest form of entropy to estimate.

-hpa



Re: kernel BUG at kernel/kallsyms.c:222!

2013-11-12 Thread Ming Lei
On Wed, Nov 13, 2013 at 1:36 AM, Jonathan Austin wrote:
>> Currently, I suggest filtering only on ARM, as in the attached patch, if
>> we plan to merge Jonathan's patch; otherwise a more complicated approach
>> has to be figured out to do the filtering (such as defining a read-only
>> symbol in the kernel to store PAGE_OFFSET, and letting scripts/kallsyms
>> use it for filtering).
>
>
> I'm happy with that approach, though allowing only ARM seems a bit
> conservative - is it the only architecture we actually expect to work?

Yes, the problem is only reported on ARM, so let's do it.


Thanks,
-- 
Ming Lei


Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread Greg Price
On Tue, Nov 12, 2013 at 08:02:09PM -0800, H. Peter Anvin wrote:
> One thing, too, if we are talking about anything other than
> boot-time-only blocking: going from a nonblocking to a blocking
> condition means being able to accept a short read, and right now *many*
> users of /dev/urandom are not ready to accept a short read.

I'm thinking only of boot-time blocking.  The idea is that once
/dev/urandom is seeded with, say, 128 bits of min-entropy in the
absolute, information-theoretic sense, it can produce an infinite
supply (or something like 2^128 bits, which amounts to the same thing)
of bits that can't be distinguished from random, short of breaking or
brute-forcing the crypto.  So once it's seeded, it's good forever.

We don't even strictly need to keep adding more entropy once it's
seeded, but it's good because (a) hey, it's cheap, (b) entropy
estimation is hard, and maybe in some situations we're too optimistic
and think we're well seeded before we really are, (c) some
cryptographers like to imagine having a PRNG recover from an attacker
learning its internal state by using fresh entropy.  Other
cryptographers think (c) is a little silly because an attacker that
can do that can probably keep doing it, or take over the machine
entirely, but it's not inconceivable, and there's (a) and (b).  So we
keep adding entropy when we have it, but if we don't have new entropy
for a long time there's no need to start blocking again.

Cheers,
Greg


Re: [PATCH] rds: Error on offset mismatch if not loopback

2013-11-12 Thread Josh Hunt
On Tue, Nov 12, 2013 at 10:22 PM, Josh Hunt  wrote:
> On Sat, Sep 22, 2012 at 2:25 PM, David Miller  wrote:
>>
>> From: John Jolly 
>> Date: Fri, 21 Sep 2012 15:32:40 -0600
>>
>> > Attempting an rds connection from the IP address of an IPoIB interface
>> > to itself causes a kernel panic due to a BUG_ON() being triggered.
>> > Making the test less strict allows rds-ping to work without crashing
>> > the machine.
>> >
>> > A local unprivileged user could use this flaw to crash the system.
>> >
>> > Signed-off-by: John Jolly 
>>
>> Besides the questions being asked of you by Venkat Venkatsubra, this
>> patch has another issue.
>>
>> It has been completely corrupted by your email client, it has
>> turned all TAB characters into spaces, making the patch useless.
>>
>> Please learn how to send a patch unmolested in the body of your
>> email.  Test it by emailing the patch to yourself, and verifying
>> that you can in fact apply the patch you receive in that email.
>> Then, and only then, should you consider making a new submission
>> of this patch.
>>
>> Use Documentation/email-clients.txt for guidance.
>
>
> I think this issue was lost in the shuffle. It appears that redhat, ubuntu,
> and oracle are maintaining local patches to resolve this:
>
> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
> https://bugzilla.redhat.com/show_bug.cgi?id=822754
> http://ubuntu.5.x6.nabble.com/CVE-2012-2372-RDS-local-ping-DOS-td4985388.html
>
> Given that Oracle has applied it I'll make the assumption that Venkat's
> question was answered at some point.
>
> David - I can resubmit the patch with the proper signed-off-by and
> formatting if you are willing to apply it unless John wants to try again. I
> think it's time this got upstream.
>
> --
> Josh

Ugh.. hopefully resending with all the html crap removed...

-- 
Josh


Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread Greg Price
On Tue, Nov 12, 2013 at 10:32:05PM -0500, Theodore Ts'o wrote:
> One of the things I've been thinking about with respect to making
> /dev/urandom block is being able to configure (via a module parameter
> which could be specified on the boot command line) which allows us to
> set a limit for how long /dev/urandom will block after which we log a
> high priority message that there was an attempt to read from
> /dev/urandom which couldn't be satisfied, and then allowing the
> /dev/urandom read to succeed.
> 
> The basic idea is that we don't want to break systems, but we do want
> to gently coerce people to do the right thing.  Otherwise, I'm worried
> that distros, or embedded/mobile/consumer electronics engineers would
> just patch out the check.

That's a good idea.  I've worried about the same thing, but hadn't
thought of that solution.

Greg


Re: [PATCH 1/1] Add strong pullup emulation to w1-gpio master driver.

2013-11-12 Thread David Fries
On Wed, Nov 13, 2013 at 05:15:52AM +0400, Evgeny Boger wrote:
> 11/12/2013 12:01 PM, David Fries:
> >On Tue, Nov 12, 2013 at 05:07:14AM +0400, Evgeny Boger wrote:
> >>+David Fries 
> >>
> >>Hi David,
> >>
> >>Would you please comment on this?
> >
> >On Mon, Nov 11, 2013 at 06:36:54PM +0400, Evgeny Boger wrote:
> >>  Strong pullup is emulated by driving pin logic high after write
> >>  command when
> >>  using tri-state push-pull GPIO.

Acked-by: David Fries 

Looks good to me.

> >Not knowing the hardware involved, is driving the logic high a
> >stronger pullup than the normal weak pullup input high?  Meaning it
> >was already being left high, just with a lesser pullup and this will
> >provide a stronger one?
> 
> Sure. The push-pull GPIOs on common SoCs are usually able to provide
> up to 10 mA of current.
> 
> >
> >On Tue, Nov 12, 2013 at 03:09:36AM +0400, Evgeniy Polyakov wrote:
> >>>+ msleep(pdata->pullup_duration);
> >>This doesn't look like a good idea - kernel will sleep for that long
> >>not doing usual w1 job
> >Not speaking for Evgeny Boger, but I'm thinking that's intended here.
> >The original strong pullup code change 6a158c0de791a81 I wrote will
> >msleep in w1_post_write when a hardware pullup isn't available, while
> >the hardware ds2490 ds9490r_set_pullup sleeps for the strong pullup
> >using spu_sleep variable.  The user requests a strong pullup for a
> >given time and any other operations on the bus will interrupt the
> >strong pullup, so locking any other operations sounds desired.
> >
> >>11/12/2013 05:03 AM, Evgeniy Polyakov:
> >>>Hi
> >>>
> >>>12.11.2013, 03:32, "Evgeny Boger" :
> >Why did you drop this check? It has nothing to do with the w1-gpio driver
> This check prevents a master from implementing "set_pullup" if it
> supports only the "write_bit" method.
> The comment above states that
> >  w1_io.c would need to support calling set_pullup before
> >  the last write_bit operation of a w1_write_8 which it currently
> >  doesn't.
> which is kind of strange, since it describes what w1_io.c actually does
> support.
> 
> w1_write_8 (w1_io.c:154,
> https://github.com/torvalds/linux/blob/master/drivers/w1/w1_io.c#L154):
> > 	for (i = 0; i < 8; ++i) {
> > 		if (i == 7)
> > 			w1_pre_write(dev);
> > 		w1_touch_bit(dev, (byte >> i) & 0x1);
> > 	}
> It seems like w1_write_8() calls w1_pre_write(), which in turn calls 
> set_pullup() just before the last write_bit().
> >I'm not seeing any harm in removing this check and clear
> >master->set_pullup.  It doesn't seem correct for this code to override
> >a master that claims to provide something of a stronger pullup.  It's
> >been about five years since I wrote that code, I think it was just to
> >protect against a stupid master.
> >
> >With this patch the last w1_write_bit will go logic 1, for 64 or 10 us
> >before returning, then w1_gpio_set_pullup is called to enable the
> >strong pullup.  What I wouldn't know is if in that last bit if the
> >logic 1 would be a go up to the strong pullup, or if it would finish
> >that time slot with a weak pullup and then go to a strong pullup.  I
> >would have to dig into the timing specifications much more than I have
> >time to right now to say what is supposed to happen.  The 18b20
> >datasheet lists, "The DQ line must be switched over to the strong
> >pullup within 10 us maximum after issuing any protocol that involves
> >copying the E2 memory or initiates temperature conversions."  It isn't
> >clear where that 10 us starts from.  You might try to dig around and
> >see if that last bit written should go to weak pullup 1 or strong
> >pullup 1.  It would take more changes if it should go right to a
> >strong pullup.
> 
> 
> I wasn't able to find any support for the latter statement.
> It looks like the strong pull-up should be enabled *after* the last
> bit has been sent
> so no need to set strong pull-up there.

I think that is correct. Per the 2480b data sheet ("Strong Pullup to 5V,
armed, predefined duration"), the strong pullup starts after the
timeslot of the last bit completes.

> However, setting the strong pullup for the last bit makes sense just to
> ensure we fit within the 10 us time window.

They'll be just a couple function calls apart and it would complicate
the code to do so.  I think what you have is good.
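
The ordering discussed above can be sketched as a toy userspace model. This is only an illustration: struct toy_master, toy_set_pullup() and toy_write_8() are made-up names, not the w1 core API, and the model assumes the master's hook merely remembers the duration when called before the last bit and drives the line only when called again with a zero delay afterwards.

```c
#include <stddef.h>

/* Toy model only: these types and names are illustrative, not the kernel's. */
struct toy_master {
	char trace[32];  /* 'b' = bit slot, 'A' = pullup armed, 'S' = strong pullup driven */
	size_t len;
	int pullup_ms;   /* strong pullup requested by the caller, 0 = none */
	int armed_ms;    /* duration remembered by the master's hook */
};

static void toy_trace(struct toy_master *m, char c)
{
	if (m->len + 1 < sizeof(m->trace)) {
		m->trace[m->len++] = c;
		m->trace[m->len] = '\0';
	}
}

/* Assumed hook behaviour: non-zero delay arms, zero delay drives the line. */
static void toy_set_pullup(struct toy_master *m, int delay_ms)
{
	if (delay_ms) {
		m->armed_ms = delay_ms;
		toy_trace(m, 'A');
	} else if (m->armed_ms) {
		toy_trace(m, 'S');  /* real code would drive the pin high and sleep */
		m->armed_ms = 0;
	}
}

void toy_write_8(struct toy_master *m, unsigned char byte)
{
	int i;

	(void)byte;  /* bit values do not matter for the ordering shown here */
	for (i = 0; i < 8; ++i) {
		if (i == 7 && m->pullup_ms)
			toy_set_pullup(m, m->pullup_ms);  /* pre-write step */
		toy_trace(m, 'b');
	}
	if (m->pullup_ms) {
		toy_set_pullup(m, 0);  /* post-write step, after the last slot */
		m->pullup_ms = 0;
	}
}
```

With a pullup requested, the trace shows the hook being armed before the eighth bit slot and the strong pullup starting only after that slot has completed, which is the "couple function calls apart" gap mentioned above.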

> On the other hand, I haven't experienced any problems with the proposed
> implementation.

-- 
David Fries PGP pub CB1EE8F0
http://fries.net/~david/


Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread H. Peter Anvin
On 11/12/2013 07:32 PM, Theodore Ts'o wrote:
> On Tue, Nov 12, 2013 at 05:40:09PM -0500, Greg Price wrote:
>>
>> Beyond these easy cleanups, I have a couple of patches queued up (just
>> written yesterday, not quite finished) to make /dev/urandom block at
>> boot until it has enough entropy, as the "Mining your P's and Q's"
>> paper recommended and people have occasionally discussed since then.
>> Those patches were definitely for after 3.13 anyway, and I'll send
>> them when they're ready.  I see some notifications and warnings in
>> this direction in the random.git tree, which is great.
> 
> One of the things I've been thinking about with respect to making
> /dev/urandom block is being able to configure (via a module parameter
> which could be specified on the boot command line) which allows us to
> set a limit for how long /dev/urandom will block after which we log a
> high priority message that there was an attempt to read from
> /dev/urandom which couldn't be satisfied, and then allowing the
> /dev/urandom read to succeed.
> 
> The basic idea is that we don't want to break systems, but we do want
> to gently coerce people to do the right thing.  Otherwise, I'm worried
> that distros, or embedded/mobile/consumer electronics engineers would
> just patch out the check.  If we make the default be something like
> "block for 5 minutes", and then log a message, we won't completely
> break a user who is trying to login to a VM, but it will be obvious,
> both from the delay and from the kern.crit log message, that there is
> a potential problem here that a system administrator needs to worry
> about.
> 

One thing, too, if we are talking about anything other than
boot-time-only blocking: going from a nonblocking to a blocking
condition means being able to accept a short read, and right now *many*
users of /dev/urandom are not ready to accept a short read.
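
For illustration, "ready to accept a short read" roughly means looping until the buffer is full. The following is a minimal userspace sketch assuming plain POSIX read(); read_full() and get_urandom() are hypothetical helper names, not existing interfaces.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly len bytes from fd, retrying on
 * short reads and EINTR. Returns 0 on success, -1 on error or EOF.
 */
int read_full(int fd, void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;  /* interrupted, just retry */
			return -1;
		}
		if (n == 0)
			return -1;         /* unexpected end of file */
		p += n;                    /* short read: keep going */
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical wrapper: fill buf with len bytes from /dev/urandom. */
int get_urandom(void *buf, size_t len)
{
	int fd = open("/dev/urandom", O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = read_full(fd, buf, len);
	close(fd);
	return ret;
}
```

A caller that does a single read() and ignores the return value would silently use a partly filled buffer if a short read ever happened.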

-hpa




Re: [PATCH 1/2] watchdog: bcm281xx: Watchdog Driver

2013-11-12 Thread Guenter Roeck

On 11/12/2013 02:17 PM, Markus Mayer wrote:
>>> +
>>> +	if (!timeout)
>>> +		dev_info(wdog->dev, "Watchdog timer stopped");
>>> +
>>
>> All that noise.
>
> Would it be acceptable to turn these calls into dev_dbg() calls, here
> and elsewhere?

Ok with me.

>>> +
>>> +	wdt->resolution = SECWDOG_DEFAULT_RESOLUTION;
>>> +	ret = bcm_kona_wdt_set_resolution_reg(wdt);
>>> +	if (ret) {
>>> +		dev_err(dev, "Failed to set resolution (error: %d)", ret);
>>
>> ret can be -EAGAIN or -EINVAL. -EINVAL suggests a bad internal error (hopefully
>> SECWDOG_DEFAULT_RESOLUTION is defined to be smaller than SECWDOG_MAX_RES),
>> and if it is -EAGAIN there should be no error message.
>>
>> Actually, bcm_kona_wdt_set_resolution_reg is only called from here, meaning the
>> error check in the function is really unnecessary.
>
> This again goes back to making resolution available to userland. Then
> bcm_kona_wdt_set_resolution_reg() would be called from elsewhere. Why
> is it bad to print an error message on timeout? Would this still apply

That was related to -EAGAIN. Which would be bad here anyway as it could result
in an endless loop if there is a problem with the chip.

> if I switch the code to -ETIMEDOUT?

That is one option, or -EIO if the condition indicates a chip error.

>>> +		return ret;
>>> +	}
>>> +
>>> +	spin_lock_init(&wdt->lock);
>>> +	platform_set_drvdata(pdev, wdt);
>>> +	watchdog_set_drvdata(&bcm_kona_wdt_wdd, wdt);
>>> +
>>> +	ret = bcm_kona_wdt_set_timeout_reg(&bcm_kona_wdt_wdd);
>>> +	if (ret) {
>>> +		dev_err(dev, "Failed set watchdog timeout");
>>
>> The only error case is -EAGAIN. I don't think there should be an error message
>> in this case (though I am not sure what the reaction should be).
>
> I am thinking that probe() needs to return an error if setting the
> timeout fails, as it can't really rely on the watchdog timer or let
> the system use it. Shouldn't that be accompanied by an error message
> letting the user know what happened?

Oh, I agree it should return an error, and an error message is ok as well.
I am just sure it should not be -EAGAIN, but I don't know what it should be.
Maybe -ETIMEDOUT, or -EIO.

>>> +		return ret;
>>> +	}
>>> +
>>> +	ret = watchdog_register_device(&bcm_kona_wdt_wdd);
>>> +	if (ret) {
>>> +		dev_err(dev, "Failed to register watchdog device");
>>> +		return ret;
>>> +	}
>>> +
>>> +#ifdef CONFIG_BCM_KONA_WDT_DEBUG
>>> +	wdt->debugfs = bcm_kona_wdt_debugfs_init(wdt, &bcm_kona_wdt_wdd);
>>> +#endif
>>> +	dev_info(dev, "Broadcom Kona Watchdog Timer");
>>> +
>>
>> Such messages are in general considered nuisance nowadays. I would suggest to
>> remove it (or ask Greg KH for advice).



Referring to your other mail, it seems those messages are falling out of favor.
I consider it a nuisance, though so far I let it go through. The messages do
increase boot time, especially on systems with slow serial console, and IMO
do not provide any real value. Users either don't care, or can check if the
driver is loaded by other means. I would suggest to at least make it dev_dbg.
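
On the -EAGAIN versus -ETIMEDOUT question discussed earlier in this mail, a bounded poll that gives up with -ETIMEDOUT could look roughly like the userspace sketch below. toy_poll_ready() and toy_ready_after() are hypothetical names used only for illustration; this is not the driver's code, and a real driver would read a hardware register where the callback is used here.

```c
#include <errno.h>

/*
 * Hypothetical bounded poll: ask is_ready() at most max_tries times.
 * Returns 0 once ready, or -ETIMEDOUT so that a caller cannot end up
 * retrying forever the way it might on -EAGAIN.
 */
int toy_poll_ready(int (*is_ready)(void *ctx), void *ctx,
		   unsigned int max_tries)
{
	while (max_tries--) {
		if (is_ready(ctx))
			return 0;
	}
	return -ETIMEDOUT;
}

/* Stand-in for a register read: reports ready after *ctx failed polls. */
int toy_ready_after(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return 1;
	(*remaining)--;
	return 0;
}
```

Whether -ETIMEDOUT or -EIO is the better code depends on whether the timeout really indicates a chip fault, as noted above.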

Thanks,
Guenter



Re: 3.12 BUG() on ext4, kernel crash on nbd-client when nbd server rebooting

2013-11-12 Thread Denys Fedoryshchenko

Hi

On 2013-11-12 23:46, Jan Kara wrote:

Hello,

On Tue 12-11-13 16:34:07, Denys Fedoryshchenko wrote:

I just did some fault testing for test nbd setup, and found that if
i reboot nbd server i will get immediately BUG() message on nbd
client and filesystem that i cannot unmount, and any operations on
it will freeze and lock processes trying to access it.
  So how exactly did you do the fault testing? Because it seems something
has discarded the block device under filesystem's toes and the superblock
buffer_head got unmapped. Didn't something call NBD_CLEAR_SOCK ioctl?
Because that calls kill_bdev() which would do exactly that...


Client side:
modprobe nbd
nbd-client 2.2.2.29 /dev/nbd0 -name export1
nbd-client 2.2.2.29 /dev/nbd1 -name export2
nbd-client 2.2.2.29 /dev/nbd2 -name export3
mount /dev/nbd0 /mnt/disk1
mount /dev/nbd1 /mnt/disk2
mount /dev/nbd2 /mnt/disk3

On server i have config:
[generic]
[export1]
exportname = /dev/sda1
[export2]
exportname = /dev/sdb1
[export3]
exportname = /dev/sdc1

Steps to reproduce:
1) Start some large file copy on the client side to /mnt/disk1/
2) Reboot the server. It reboots quite fast, in just a few seconds; the
   server system will get its IP before the nbd-server process starts
   listening, so nbd-client will probably see connection refused.
3) It seems that when the client gets connection refused, it goes mad.

I can try to capture a traffic dump, or do any other debug operation,
please let me know what I should run :)
P.S. I noticed maybe I should run persist mode, but anyway it should not
crash like this, I think.




Honza

Kernel 3.12, x86_64

Please let me know if you need more information

Here is dmesg contents i got:
[  102.269270] block nbd1: Receive control failed (result -32)
[  102.269443] block nbd1: shutting down socket
[  102.269461] block nbd1: queue cleared
[  102.269859] block nbd2: Receive control failed (result -32)
[  102.269873] block nbd2: shutting down socket
[  102.269883] block nbd2: queue cleared
[  102.271353] block nbd0: Receive control failed (result -32)
[  102.271518] block nbd0: shutting down socket
[  102.271536] block nbd0: queue cleared
[  106.297217] block nbd0: Attempted send on closed socket
[  106.297219] end_request: I/O error, dev nbd0, sector 73992
[  106.297226] EXT4-fs warning (device nbd0): __ext4_read_dirblock:908: error reading directory block (ino 2, block 0)
[  106.297233] block nbd0: Attempted send on closed socket
[  106.297235] end_request: I/O error, dev nbd0, sector 8456
[  106.297245] [ cut here ]
[  106.297343] kernel BUG at fs/buffer.c:3015!
[  106.297438] invalid opcode:  [#1] SMP
[  106.297716] Modules linked in: nbd act_mirred cls_u32 sch_ingress sch_htb iptable_filter i2c_i801
[  106.298568] CPU: 0 PID: 2587 Comm: ls Not tainted 3.12.0noc-02 #1
[  106.298665] Hardware name:  /DH55TC, BIOS TCIBX10H.86A.0037.2010.0614.1712 06/14/2010
[  106.298772] task: 880231da9770 ti: 880231cd4000 task.ti: 880231cd4000
[  106.298879] RIP: 0010:[]  [] _submit_bh+0x26/0x1d3
[  106.299078] RSP: 0018:880231cd5b48  EFLAGS: 00010246
[  106.299182] RAX: 0005 RBX: 8800b7456b60 RCX: 0008
[  106.299285] RDX:  RSI: 8800b7456b60 RDI: 0411
[  106.299388] RBP: 880231cd5b68 R08: 0040 R09: 81a9a370
[  106.299487] R10: 810c0d61 R11:  R12: 0411
[  106.299590] R13: 880231b21400 R14:  R15: 0aea9ff5
[  106.299697] FS:  7f4f0d755700() GS:88023fc0() knlGS:
[  106.299800] CS:  0010 DS:  ES:  CR0: 8005003b
[  106.300114] CR2: 022275c8 CR3: 000235538000 CR4: 07f0
[  106.300438] Stack:
[  106.300750]  8800b7456b60 0411 880231b21400 0001
[  106.301652]  880231cd5b78 81125598 880231cd5ba8 8112761a
[  106.307886]  880231cd5bb8 81293a72 8800b7456b60 8802358d4800
[  106.308794] Call Trace:
[  106.309105]  [] submit_bh+0xb/0xd
[  106.309419]  [] __sync_dirty_buffer+0x53/0x86
[  106.309736]  [] ? __percpu_counter_sum+0x4d/0x63
[  106.310058]  [] sync_dirty_buffer+0xe/0x10
[  106.310368]  [] ext4_commit_super+0x19e/0x1e7
[  106.310687]  [] save_error_info+0x1e/0x22
[  106.311002]  [] __ext4_error_inode+0x52/0x10b
[  106.311326]  [] ? __cond_resched+0x25/0x30
[  106.311634]  [] __ext4_get_inode_loc+0x310/0x336
[  106.311954]  [] ? ext4_dirty_inode+0x3b/0x54
[  106.312277]  [] ext4_get_inode_loc+0x17/0x19
[  106.312596]  [] ext4_reserve_inode_write+0x21/0x7e
[  106.312916]  [] ? jbd2__journal_start+0xe0/0x199
[  106.313229]  [] ext4_mark_inode_dirty+0x67/0x1e4
[  106.313549]  [] ? ext4_dirty_inode+0x25/0x54
[  106.313861]  [] ext4_dirty_inode+0x3b/0x54
[  106.314177]  [] __mark_inode_dirty+0x60/0x224
[  106.314493]  [] update_time+0x99/0xa2
[ 

Re: [PATCH v3 1/3] OF: Introduce Device Tree resolve support.

2013-11-12 Thread Grant Likely
On Tue, 12 Nov 2013 09:28:42 +0100, Pantelis Antoniou wrote:
> 
> On Nov 11, 2013, at 7:17 PM, Grant Likely wrote:
> 
> > On Fri,  8 Nov 2013 17:06:08 +0200, Pantelis Antoniou wrote:
> >> Introduce support for dynamic device tree resolution.
> >> Using it, it is possible to prepare a device tree that's
> >> been loaded on runtime to be modified and inserted at the kernel
> >> live tree.
> >> 
> >> Export of of_resolve by Guenter Roeck 
> >> 
> >> Signed-off-by: Pantelis Antoniou 
> >> ---
> >> .../devicetree/dynamic-resolution-notes.txt|  25 ++
> >> drivers/of/Kconfig |   9 +
> >> drivers/of/Makefile|   1 +
> >> drivers/of/resolver.c  | 396 +
> >> include/linux/of.h |  17 +
> >> 5 files changed, 448 insertions(+)
> >> create mode 100644 Documentation/devicetree/dynamic-resolution-notes.txt
> >> create mode 100644 drivers/of/resolver.c
> >> 
> >> diff --git a/Documentation/devicetree/dynamic-resolution-notes.txt b/Documentation/devicetree/dynamic-resolution-notes.txt
> >> new file mode 100644
> >> index 000..0b396c4
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/dynamic-resolution-notes.txt
> >> @@ -0,0 +1,25 @@
> >> +Device Tree Dynamic Resolver Notes
> >> +----------------------------------
> >> +
> >> +This document describes the implementation of the in-kernel
> >> +Device Tree resolver, residing in drivers/of/resolver.c and is a
> >> +companion document to Documentation/devicetree/dt-object-internal.txt[1]
> > 
> > dt-object-internal.txt is in the DTC patch, not the kernel tree.
> > 
> 
> Yes, good catch. I will fix the reference.
> 
> BTW, what about moving/copying some of the DTC docs in the kernel doc
> directory? The dtc Documentation directory is missing from the kernel tree.
> 
> 
> >> +
> >> +How the resolver works
> >> +----------------------
> >> +
> >> +The resolver is given as an input an arbitrary tree compiled with the
> >> +proper dtc option and having a /plugin/ tag. This generates the
> >> +appropriate __fixups__ & __local_fixups__ nodes as described in [1].
> > 
> > Missing footnote reference line for [1]?
> > 
> 
> Yes.
> 
> >> +
> >> +In sequence the resolver works by the following steps:
> >> +
> >> +1. Get the maximum device tree phandle value from the live tree + 1.
> > 
> > Is there a (realistic) worry about leaking phandle number space from
> > repeated addition/removal of overlays?
> > 
> 
> I think not. But doing it this way has the nice property of keeping all
> phandle values the same each time you do a load-unload-load sequence.

It will break if there are two overlays "leapfrogging" each other on
loads/unloads. It may be a very outside corner case, but it is worth
thinking about.
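
To make the leapfrogging case concrete, here is a toy userspace model of steps 1 and 2 of the quoted notes (offset = live maximum phandle + 1, then shift the overlay's local phandles by that offset). It is only a sketch: struct toy_tree, toy_load() and toy_unload() are made-up names, and the real resolver works on device nodes, not on a flat array.

```c
#include <stddef.h>

#define TOY_MAX 32

/* Toy live tree: just the set of phandles currently in use. */
struct toy_tree {
	unsigned int phandle[TOY_MAX];
	size_t count;
};

unsigned int toy_max_phandle(const struct toy_tree *t)
{
	unsigned int max = 0;
	size_t i;

	for (i = 0; i < t->count; i++)
		if (t->phandle[i] > max)
			max = t->phandle[i];
	return max;
}

/* Load an overlay; returns the offset applied, or 0 if there is no room. */
unsigned int toy_load(struct toy_tree *t, const unsigned int *local, size_t n)
{
	unsigned int offset = toy_max_phandle(t) + 1;
	size_t i;

	if (t->count + n > TOY_MAX)
		return 0;
	for (i = 0; i < n; i++)
		t->phandle[t->count++] = local[i] + offset;
	return offset;
}

/* Unload an overlay that was loaded with the given offset. */
void toy_unload(struct toy_tree *t, const unsigned int *local, size_t n,
		unsigned int offset)
{
	size_t i, j;

	for (i = 0; i < n; i++)
		for (j = 0; j < t->count; j++)
			if (t->phandle[j] == local[i] + offset) {
				t->phandle[j] = t->phandle[--t->count];
				break;
			}
}
```

In this model, unloading and reloading a single overlay gives it the same offset again, but if a second overlay was loaded in between, the reloaded overlay lands above it, and alternating unload/reload of the two keeps pushing the values upward.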

> 
> >> +2. Adjust all the local phandles of the tree to resolve by that amount.
> >> +3. Using the __local_fixups__ node information adjust all local references
> >> +   by the same amount.
> >> +4. For each property in the __fixups__ node locate the node it references
> >> +   in the live tree. This is the label used to tag the node.
> >> +5. Retrieve the phandle of the target of the fixup.
> >> +6. For each fixup in the property locate the node:property:offset location
> >> +   and replace it with the phandle value.
> >> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
> >> index 78cc760..2a00ae5 100644
> >> --- a/drivers/of/Kconfig
> >> +++ b/drivers/of/Kconfig
> >> @@ -74,4 +74,13 @@ config OF_MTD
> >>depends on MTD
> >>def_bool y
> >> 
> >> +config OF_RESOLVE
> >> +  bool "OF Dynamic resolution support"
> >> +  depends on OF
> >> +  select OF_DYNAMIC
> >> +  select OF_DEVICE
> >> +  help
> >> +Enable OF dynamic resolution support. This allows you to
> >> +load Device Tree object fragments at run time.
> >> +
> >> endmenu # OF
> >> diff --git a/drivers/of/Makefile b/drivers/of/Makefile
> >> index 9bc6d8c..93da457 100644
> >> --- a/drivers/of/Makefile
> >> +++ b/drivers/of/Makefile
> >> @@ -9,3 +9,4 @@ obj-$(CONFIG_OF_MDIO)  += of_mdio.o
> >> obj-$(CONFIG_OF_PCI)   += of_pci.o
> >> obj-$(CONFIG_OF_PCI_IRQ)  += of_pci_irq.o
> >> obj-$(CONFIG_OF_MTD)   += of_mtd.o
> >> +obj-$(CONFIG_OF_RESOLVE)  += resolver.o
> >> diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c
> >> new file mode 100644
> >> index 000..dfbb51a
> >> --- /dev/null
> >> +++ b/drivers/of/resolver.c
> >> @@ -0,0 +1,396 @@
> >> +/*
> >> + * Functions for dealing with DT resolution
> >> + *
> >> + * Copyright (C) 2012 Pantelis Antoniou 
> >> + * Copyright (C) 2012 Texas Instruments Inc.
> >> + *
> >> + * This program is free software; you can redistribute it and/or
> >> + * modify it under the terms of the GNU General Public License
> >> + * version 2 as published by the Free Software Foundation.
> >> + */
> >> +
> >> +#include 
> >> +#include 
> >> +#include 
> >> 

Re: [PATCH 5/5] OF: Introduce utility helper functions

2013-11-12 Thread Grant Likely
On Tue, 12 Nov 2013 11:39:08 +0100, Pantelis Antoniou wrote:
> Hi Grant,
> 
> On Nov 11, 2013, at 5:37 PM, Grant Likely wrote:
> 
> > On Tue,  5 Nov 2013 19:50:16 +0200, Pantelis Antoniou wrote:
> >> Introduce helper functions for working with the live DT tree.
> >> 
> >> __of_free_property() frees a dynamically created property
> >> __of_free_tree() recursively frees a device node tree
> >> __of_copy_property() copies a property dynamically
> >> __of_create_empty_node() creates an empty node
> >> __of_find_node_by_full_name() finds the node with the full name
> >> and
> >> of_multi_prop_cmp() performs a multi property compare but without
> >> having to take locks.
> >> 
> >> Signed-off-by: Pantelis Antoniou 
> > 
> > So, this all looks like private stuff, or stuff that belongs in
> > drivers/of/base.c. Can you move stuff around. I've made more comments
> > below.
> > 
> 
> Placement is no big issue;
> 
> > g.
> > 
> 
> [snip]
> 
> >> +  } else {
> >> +  pr_warn("%s: node %p cannot be freed; memory is gone\n",
> >> +  __func__, node);
> >> +  }
> >> +}
> > 
> > All of the above is potentially dangerous. There is no way to determine
> > if anything still holds a reference to a node. The proper way to handle
> > removal of properties is to have a release method when the last
> > of_node_put is called.
> > 
> 
> This is safe, and expected to be called only on a dynamically created tree,
> that's what all the checks against OF_DYNAMIC guard against.
> 
> It is not ever meant to be called on an arbitrary tree, created by
> unflattening a blob.

I am talking about when being used on a dynamic tree. The problem is
when a driver or other code holds a reference to a dynamic node, but
doesn't release it correctly. The memory must not be freed until all of
the references are released. OF_DYNAMIC doesn't actually help in that
case, and it is the reason for of_node_get()/of_node_put()
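
As a rough userspace illustration of the get/put pattern meant here (a sketch only: struct toy_node and the toy_* functions are made-up names, the counter is not atomic, and this is not how of_node_get()/of_node_put() are implemented):

```c
#include <stdlib.h>

/* Toy node: memory is released only when the last reference is dropped. */
struct toy_node {
	unsigned int refcount;
};

unsigned int toy_released;  /* counts releases, for demonstration only */

struct toy_node *toy_node_new(void)
{
	struct toy_node *n = calloc(1, sizeof(*n));

	if (n)
		n->refcount = 1;  /* the creator (the tree) holds one reference */
	return n;
}

struct toy_node *toy_node_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void toy_node_release(struct toy_node *n)
{
	toy_released++;
	free(n);
}

void toy_node_put(struct toy_node *n)
{
	if (n && --n->refcount == 0)
		toy_node_release(n);  /* last reference gone: now safe to free */
}
```

Removing the node from the tree then only drops the tree's own reference; a driver that still holds one keeps the memory alive until its own put, which is the guarantee a flag such as OF_DYNAMIC cannot give.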

> Perhaps we could have a switch to control whether an unflattened tree is 
> created dynamically and then freeing/releasing will work.
> 
> kobject-ifcation will require it anyway, don't you agree? 

Yes. Kobjectification will also take care of the release method.

> 
> >> +
> >> +/**
> >> + * __of_copy_property - Copy a property dynamically.
> >> + * @prop: Property to copy
> >> + * @flags:Allocation flags (typically pass GFP_KERNEL)
> >> + *
> >> + * Copy a property by dynamically allocating the memory of both the
> >> + * property structure and the property name & contents. The property's
> >> + * flags have the OF_DYNAMIC bit set so that we can differentiate between
> >> + * dynamically allocated properties and not.
> >> + * Returns the newly allocated property or NULL on out of memory error.
> >> + */
> > 
> > What do you intend the use-case to be for this function? Will the
> > duplicated property be immediately modified? If so, what happens if the
> > property needs to be grown in size?
> > 
> 
> No, the property will not be modified. If it needs to grow it will be moved to
> deadprops (since we don't track refs to props) and a new one will be
> allocated.
> 
> >> +struct property *__of_copy_property(const struct property *prop, gfp_t flags)
> >> +{
> >> +  struct property *propn;
> >> +
> >> +  propn = kzalloc(sizeof(*prop), flags);
> >> +  if (propn == NULL)
> >> +  return NULL;
> >> +
> >> +  propn->name = kstrdup(prop->name, flags);
> >> +  if (propn->name == NULL)
> >> +  goto err_fail_name;
> >> +
> >> +  if (prop->length > 0) {
> >> +  propn->value = kmalloc(prop->length, flags);
> >> +  if (propn->value == NULL)
> >> +  goto err_fail_value;
> >> +  memcpy(propn->value, prop->value, prop->length);
> >> +  propn->length = prop->length;
> >> +  }
> >> +
> >> +  /* mark the property as dynamic */
> >> +  of_property_set_flag(propn, OF_DYNAMIC);
> >> +
> >> +  return propn;
> >> +
> >> +err_fail_value:
> >> +  kfree(propn->name);
> >> +err_fail_name:
> >> +  kfree(propn);
> >> +  return NULL;
> >> +}
> >> +
> >> +/**
> >> + * __of_create_empty_node - Create an empty device node dynamically.
> >> + * @name: Name of the new device node
> >> + * @type: Type of the new device node
> >> + * @full_name:Full name of the new device node
> >> + * @phandle:  Phandle of the new device node
> >> + * @flags:Allocation flags (typically pass GFP_KERNEL)
> >> + *
> >> + * Create an empty device tree node, suitable for further modification.
> >> + * The node data are dynamically allocated and all the node flags
> >> + * have the OF_DYNAMIC & OF_DETACHED bits set.
> >> + * Returns the newly allocated node or NULL on out of memory error.
> >> + */
> >> +struct device_node *__of_create_empty_node(
> >> +  const char *name, const char *type, const char *full_name,
> >> +  phandle phandle, gfp_t flags)
> > 
> > I would like to see a user of this function in the core DT paths that
> > allocate 

[tip:core/urgent] smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency

2013-11-12 Thread tip-bot for Josh Boyer
Commit-ID:  9dd1220114e00d8ec5cdc20085bbe198b21e1985
Gitweb: http://git.kernel.org/tip/9dd1220114e00d8ec5cdc20085bbe198b21e1985
Author: Josh Boyer 
AuthorDate: Mon, 11 Nov 2013 09:08:15 -0500
Committer:  Ingo Molnar 
CommitDate: Wed, 13 Nov 2013 00:45:50 +0100

smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency

When CONFIG_CPUMASK_OFFSTACK was added in 2008, it was dependent upon
CONFIG_DEBUG_PER_CPU_MAPS being enabled, or an architecture could
select it.

The debug dependency adds additional overhead that isn't required
for operation of the feature and which is undesirable for distro
kernels. CONFIG_CPUMASK_OFFSTACK=y is needed to increase the
CONFIG_NR_CPUS value beyond 512 on x86.

So drop the current dependency; its only real dependency is CONFIG_SMP=y.

Signed-off-by: Josh Boyer 
Cc: Rusty Russell 
Cc: Linus Torvalds 
Cc: Andrew Morton 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/2013140815.gb20...@hansolo.jdub.homelinux.org
Signed-off-by: Ingo Molnar 
---
 lib/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/Kconfig b/lib/Kconfig
index b3c8be0..50b47cd 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -342,7 +342,8 @@ config CHECK_SIGNATURE
bool
 
 config CPUMASK_OFFSTACK
-   bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+   bool "Force CPU masks off stack"
+   depends on SMP
help
  Use dynamic allocation for cpumask_var_t, instead of putting
  them on the stack.  This is a bit more expensive, but avoids
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev] [PATCH 1/2] f2fs: add a new function to support for merging contiguous read

2013-11-12 Thread Gu Zheng
On 11/12/2013 01:15 PM, Chao Yu wrote:

> For better read performance, we add a new function to support merging 
> contiguous reads, as is already done for writes.

Nice shot!
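
As a rough illustration of what merging contiguous reads means, here is a
minimal userspace sketch. It assumes a hypothetical struct read_batch with
an invented capacity BATCH_MAX, and it only counts how many requests would
be issued. It does not use the bio API and is not the patch's code.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 4	/* assumed per-request capacity, for illustration */

struct read_batch {
	unsigned long first;	/* first block of the pending run */
	int count;		/* blocks pending; 0 means empty */
	int flushes;		/* how many requests were issued */
};

/* Issue the pending run, if any. */
static void batch_flush(struct read_batch *b)
{
	assert(b);
	if (b->count) {
		b->flushes++;	/* real code would submit the I/O here */
		b->count = 0;
	}
}

/* Queue one block, flushing first if it cannot extend the pending run. */
static void batch_add(struct read_batch *b, unsigned long blk)
{
	assert(b);
	if (b->count &&
	    (blk != b->first + b->count || b->count == BATCH_MAX))
		batch_flush(b);
	if (!b->count)
		b->first = blk;
	b->count++;
}
```

A caller adds blocks in order and calls batch_flush() once at the end, the
same way the patch pairs submit_read_page() with f2fs_submit_read_bio().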

> 
> Signed-off-by: Chao Yu 

Acked-by: Gu Zheng 

> ---
>  fs/f2fs/data.c |   45 +
>  fs/f2fs/f2fs.h |2 ++
>  2 files changed, 47 insertions(+)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index aa3438c..f30060b 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -404,6 +404,51 @@ int f2fs_readpage(struct f2fs_sb_info *sbi, struct page 
> *page,
>   return 0;
>  }
>  
> +void f2fs_submit_read_bio(struct f2fs_sb_info *sbi, int rw)
> +{
> + down_read(&sbi->bio_sem);
> + if (sbi->read_bio) {
> + submit_bio(rw, sbi->read_bio);
> + sbi->read_bio = NULL;
> + }
> + up_read(&sbi->bio_sem);
> +}
> +
> +void submit_read_page(struct f2fs_sb_info *sbi, struct page *page,
> + block_t blk_addr, int rw)
> +{
> + struct block_device *bdev = sbi->sb->s_bdev;
> + int bio_blocks;
> +
> + verify_block_addr(sbi, blk_addr);
> +
> + down_read(&sbi->bio_sem);
> +
> + if (sbi->read_bio && sbi->last_read_block != blk_addr - 1) {
> + submit_bio(rw, sbi->read_bio);
> + sbi->read_bio = NULL;
> + }
> +
> +alloc_new:
> + if (sbi->read_bio == NULL) {
> + bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
> + sbi->read_bio = f2fs_bio_alloc(bdev, bio_blocks);
> + sbi->read_bio->bi_sector = SECTOR_FROM_BLOCK(sbi, blk_addr);
> + sbi->read_bio->bi_end_io = read_end_io;
> + }
> +
> + if (bio_add_page(sbi->read_bio, page, PAGE_CACHE_SIZE, 0) <
> + PAGE_CACHE_SIZE) {
> + submit_bio(rw, sbi->read_bio);
> + sbi->read_bio = NULL;
> + goto alloc_new;
> + }
> +
> + sbi->last_read_block = blk_addr;
> +
> + up_read(&sbi->bio_sem);
> +}
> +
>  /*
>   * This function should be used by the data read flow only where it
>   * does not check the "create" flag that indicates block allocation.
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 89dc750..0afdcec 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -359,6 +359,8 @@ struct f2fs_sb_info {
>  
>   /* for segment-related operations */
>   struct f2fs_sm_info *sm_info;   /* segment manager */
> + struct bio *read_bio;   /* read bios to merge */
> + sector_t last_read_block;   /* last read block number */
>   struct bio *bio[NR_PAGE_TYPE];  /* bios to merge */
>   sector_t last_block_in_bio[NR_PAGE_TYPE];   /* last block number */
>   struct rw_semaphore bio_sem;/* IO semaphore */




Re: [f2fs-dev] [PATCH 2/2] f2fs: read contiguous sit entry pages by merging for mount performance

2013-11-12 Thread Gu Zheng
Hi Yu,
On 11/12/2013 01:18 PM, Chao Yu wrote:

> Previously we read sit entry pages one by one, which lost the chance 
> of reading contiguous pages together.
> So we read pages as contiguously as possible for better mount performance.
> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/f2fs.h|2 ++
>  fs/f2fs/segment.c |   65 
> ++---
>  fs/f2fs/segment.h |2 ++
>  3 files changed, 66 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 0afdcec..bfe9d87 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1113,6 +1113,8 @@ struct page *find_data_page(struct inode *, pgoff_t, 
> bool);
>  struct page *get_lock_data_page(struct inode *, pgoff_t);
>  struct page *get_new_data_page(struct inode *, struct page *, pgoff_t, bool);
>  int f2fs_readpage(struct f2fs_sb_info *, struct page *, block_t, int);
> +void f2fs_submit_read_bio(struct f2fs_sb_info *, int);
> +void submit_read_page(struct f2fs_sb_info *, struct page *, block_t, int);

Better to move these declarations into PATCH 1/2.

>  int do_write_data_page(struct page *);
>  
>  /*
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 86dc289..414c351 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1474,19 +1474,72 @@ static int build_curseg(struct f2fs_sb_info *sbi)
>   return restore_curseg_summaries(sbi);
>  }
>  
> +static int ra_sit_pages(struct f2fs_sb_info *sbi, int start,
> + int nrpages, bool *is_order)
> +{
> + struct address_space *mapping = sbi->meta_inode->i_mapping;
> + struct sit_info *sit_i = SIT_I(sbi);
> + struct page *page;
> + block_t blk_addr;
> + int blkno, readcnt = 0;
> + int sit_blk_cnt = SIT_BLK_CNT(sbi);
> +
> + for (blkno = start; blkno < start + nrpages; blkno++) {
> +
> + if (blkno >= sit_blk_cnt)

Merge these two checks:
for (blkno = start; blkno < start + nrpages && blkno < sit_blk_cnt; blkno++)

> + goto out;

> + if ((!f2fs_test_bit(blkno, sit_i->sit_bitmap) ^ !*is_order)) {
> + *is_order = !*is_order;
> + goto out;

'Break' seems more suitable.

> + }
> +
> + blk_addr = sit_i->sit_base_addr + blkno;
> + if (*is_order)
> + blk_addr += sit_i->sit_blocks;
> +repeat:
> + page = grab_cache_page(mapping, blk_addr);
> + if (!page) {
> + cond_resched();
> + goto repeat;
> + }
> + if (PageUptodate(page)) {
> + f2fs_put_page(page, 1);
> + readcnt++;
> + goto out;

Here may be 'Continue'.
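
Taken together, the three control-flow suggestions (bounds merged into the
for condition, break on a copy change, continue for an already cached page)
could look roughly like the sketch below. It is a userspace model under
stated assumptions: ra_blocks(), in_second[] (standing in for the SIT
bitmap) and cached[] (standing in for PageUptodate()) are hypothetical
names, and no real I/O is done.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Walk up to nr blocks starting at start. Returns the number of blocks
 * consumed; *submitted counts the blocks that would be queued for reading.
 */
static int ra_blocks(const bool *in_second, const bool *cached, int total,
		     int start, int nr, bool *is_order, int *submitted)
{
	int blkno, readcnt = 0;

	assert(in_second && cached && is_order && submitted);
	*submitted = 0;

	/* Both bounds live in the loop condition. */
	for (blkno = start; blkno < start + nr && blkno < total; blkno++) {
		if (in_second[blkno] != *is_order) {
			*is_order = !*is_order;
			break;		/* copy changes: end this batch */
		}
		readcnt++;
		if (cached[blkno])
			continue;	/* already in memory: skip the read */
		(*submitted)++;
	}
	return readcnt;
}
```

With break and continue the out label is no longer needed inside the loop;
the final submit can simply follow it.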

> + }
> +
> + submit_read_page(sbi, page, blk_addr, READ_SYNC);
> +
> + page_cache_release(page);

Releasing the page here does not seem like a good idea; otherwise all your work may be in vain.

> + readcnt++;
> + }
> +out:
> + f2fs_submit_read_bio(sbi, READ_SYNC);
> + return readcnt;
> +}
> +
>  static void build_sit_entries(struct f2fs_sb_info *sbi)
>  {
>   struct sit_info *sit_i = SIT_I(sbi);
>   struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_COLD_DATA);
>   struct f2fs_summary_block *sum = curseg->sum_blk;
> - unsigned int start;
> + bool is_order = f2fs_test_bit(0, sit_i->sit_bitmap) ? true : false;
> + int sit_blk_cnt = SIT_BLK_CNT(sbi);
> + int bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
> + unsigned int i, start, end;
> + unsigned int readed, start_blk = 0;
>  
> - for (start = 0; start < TOTAL_SEGS(sbi); start++) {
> +next:
> + readed = ra_sit_pages(sbi, start_blk, bio_blocks, &is_order);

In fact, you already know how many blocks you want to read (SIT_BLK_CNT(sbi)),
so sit_blk_cnt is more suitable here than a MAX value, and it would also
simplify the logic of ra_sit_pages.

> +
> + start = start_blk * sit_i->sents_per_block;
> + end = (start_blk + readed) * sit_i->sents_per_block;
> +
> + for (; start < end && start < TOTAL_SEGS(sbi); start++) {
>   struct seg_entry *se = &sit_i->sentries[start];
>   struct f2fs_sit_block *sit_blk;
>   struct f2fs_sit_entry sit;
>   struct page *page;
> - int i;
>  
>   mutex_lock(&curseg->curseg_mutex);
>   for (i = 0; i < sits_in_cursum(sum); i++) {
> @@ -1497,6 +1550,7 @@ static void build_sit_entries(struct f2fs_sb_info *sbi)
>   }
>   }
>   mutex_unlock(&curseg->curseg_mutex);
> +
>   page = get_current_sit_page(sbi, start);
>   sit_blk = (struct f2fs_sit_block *)page_address(page);
>   sit = sit_blk->entries[SIT_ENTRY_OFFSET(sit_i, start)];
> @@ -1509,6 +1563,11 @@ got_it:
>   e->valid_blocks += se->valid_blocks;
>   }
>   }
> +
> + start_blk 

[PATCH] block: Employ u64_stats_init()

2013-11-12 Thread John Stultz
From: Peter Zijlstra 

Now that seqcounts are lockdep enabled objects, we need to properly
initialize them.
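
For readers who have not looked at what the sequence counter inside these
stats protects, here is a minimal userspace sketch of the read/retry
protocol with an explicit init step. It is an assumption-laden model:
struct seq_stat and its helpers are hypothetical, it uses C11 atomics
rather than the kernel primitives, it supports a single writer only, and it
does not model lockdep at all.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical 64-bit statistic guarded by a sequence counter. */
struct seq_stat {
	atomic_uint seq;	/* odd while an update is in flight */
	atomic_ullong value;
};

/* Explicit init; the analogue of the step this patch adds. */
static void seq_stat_init(struct seq_stat *s)
{
	assert(s);
	atomic_init(&s->seq, 0);
	atomic_init(&s->value, 0);
}

/* Single-writer update: make seq odd, write, make seq even again. */
static void seq_stat_add(struct seq_stat *s, unsigned long long delta)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	unsigned long long v =
		atomic_load_explicit(&s->value, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->value, v + delta, memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader retries until it sees the same even sequence before and after. */
static unsigned long long seq_stat_read(struct seq_stat *s)
{
	unsigned int begin, end;
	unsigned long long v;

	do {
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		v = atomic_load_explicit(&s->value, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while (begin != end || (begin & 1));
	return v;
}
```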

Without this patch, Fengguang was seeing:
[4.127282] INFO: trying to register non-static key.
[4.128027] the code is fine but needs lockdep annotation.
[4.128027] turning off the locking correctness validator.
[4.128027] CPU: 0 PID: 96 Comm: kworker/u4:1 Not tainted 
3.12.0-next-20131108-10601-gbad570d #2
[4.128027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[4.128027] Workqueue: events_unbound async_run_entry_fn
[4.128027]  7908e744  78019968 79dc7cf2 7a80e0a8 780199a0 7908953e 
7a1b7f4d
[4.128027]  7a1b7fa7 7a1b7f7d 7f368608  0011 44374011 a805 
7f368110
[4.128027]  7f368110 85bf2a70  780199cc 7908a1c5  0001 

[4.128027] Call Trace:
[4.128027]  [<7908e744>] ? console_unlock+0x353/0x380
[4.128027]  [<79dc7cf2>] dump_stack+0x48/0x60
[4.128027]  [<7908953e>] __lock_acquire.isra.26+0x7e3/0xceb
[4.128027]  [<7908a1c5>] lock_acquire+0x71/0x9a
[4.128027]  [<794079aa>] ? blk_throtl_bio+0x1c3/0x485
[4.128027]  [<7940658b>] throtl_update_dispatch_stats+0x7c/0x153
[4.128027]  [<794079aa>] ? blk_throtl_bio+0x1c3/0x485
[4.128027]  [<794079aa>] blk_throtl_bio+0x1c3/0x485
...

Cc: Vivek Goyal 
Cc: Jens Axboe 
Cc: Fengguang Wu 
Cc: Ingo Molnar 
Reported-by: Fengguang Wu 
Signed-off-by: Peter Zijlstra 
[jstultz: Folded in another fix from the mailing list as well as a fix
to that fix. Tweaked commit message.]
Signed-off-by: John Stultz 
---
 block/blk-cgroup.h   | 10 ++
 block/blk-throttle.c | 10 ++
 block/cfq-iosched.c  | 25 +
 3 files changed, 45 insertions(+)

diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index ae6969a..1610b22 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -402,6 +402,11 @@ struct request_list *__blk_queue_next_rl(struct 
request_list *rl,
 #define blk_queue_for_each_rl(rl, q)   \
for ((rl) = &(q)->root_rl; (rl); (rl) = __blk_queue_next_rl((rl), (q)))
 
+static inline void blkg_stat_init(struct blkg_stat *stat)
+{
+   u64_stats_init(&stat->syncp);
+}
+
 /**
  * blkg_stat_add - add a value to a blkg_stat
  * @stat: target blkg_stat
@@ -458,6 +463,11 @@ static inline void blkg_stat_merge(struct blkg_stat *to, 
struct blkg_stat *from)
blkg_stat_add(to, blkg_stat_read(from));
 }
 
+static inline void blkg_rwstat_init(struct blkg_rwstat *rwstat)
+{
+   u64_stats_init(&rwstat->syncp);
+}
+
 /**
  * blkg_rwstat_add - add a value to a blkg_rwstat
  * @rwstat: target blkg_rwstat
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 8331aba..0653404 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -256,6 +256,12 @@ static struct throtl_data *sq_to_td(struct 
throtl_service_queue *sq)
}   \
 } while (0)
 
+static void tg_stats_init(struct tg_stats_cpu *tg_stats)
+{
+   blkg_rwstat_init(&tg_stats->service_bytes);
+   blkg_rwstat_init(&tg_stats->serviced);
+}
+
 /*
  * Worker for allocating per cpu stat for tgs. This is scheduled on the
  * system_wq once there are some groups on the alloc_list waiting for
@@ -269,12 +275,16 @@ static void tg_stats_alloc_fn(struct work_struct *work)
 
 alloc_stats:
if (!stats_cpu) {
+   int cpu;
+
stats_cpu = alloc_percpu(struct tg_stats_cpu);
if (!stats_cpu) {
/* allocation failed, try again after some time */
schedule_delayed_work(dwork, msecs_to_jiffies(10));
return;
}
+   for_each_possible_cpu(cpu)
+   tg_stats_init(per_cpu_ptr(stats_cpu, cpu));
}
 
spin_lock_irq(&tg_stats_alloc_lock);
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 434944c..4d5cec1 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1508,6 +1508,29 @@ static void cfq_init_cfqg_base(struct cfq_group *cfqg)
 }
 
 #ifdef CONFIG_CFQ_GROUP_IOSCHED
+static void cfqg_stats_init(struct cfqg_stats *stats)
+{
+   blkg_rwstat_init(&stats->service_bytes);
+   blkg_rwstat_init(&stats->serviced);
+   blkg_rwstat_init(&stats->merged);
+   blkg_rwstat_init(&stats->service_time);
+   blkg_rwstat_init(&stats->wait_time);
+   blkg_rwstat_init(&stats->queued);
+
+   blkg_stat_init(&stats->sectors);
+   blkg_stat_init(&stats->time);
+
+#ifdef CONFIG_DEBUG_BLK_CGROUP
+   blkg_stat_init(&stats->unaccounted_time);
+   blkg_stat_init(&stats->avg_queue_size_sum);
+   blkg_stat_init(&stats->avg_queue_size_samples);
+   blkg_stat_init(&stats->dequeue);
+   blkg_stat_init(&stats->group_wait_time);
+   blkg_stat_init(&stats->idle_time);
+   blkg_stat_init(&stats->empty_time);
+#endif
+}
+
 static void cfq_pd_init(struct blkcg_gq *blkg)
 {
struct cfq_group *cfqg = blkg_to_cfqg(blkg);
@@ -1515,6 +1538,8 @@ static void cfq_pd_init(struct blkcg_gq *blkg)

Re: [PATCH] CPU Jitter RNG: Executing time variation tests on bare metal

2013-11-12 Thread Stephan Mueller
On Tuesday, 29 October 2013, 09:24:48, Theodore Ts'o wrote:

Hi Theodore,

> On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
> > Based on this suggestion, I now added the tests in Appendix F.46.8 where
> > I disable the caches and the tests in Appendix F.46.9 where I disable
> > the caches and interrupts.
> 
> What you've added in F.46 is a good start, but as a suggestion,
> instead of disabling one thing at a time, try disabling *everything*
> and then see what you get, and then enabling one thing at a time.  The
> best thing is if you can get to the point where the amount of entropy
> is close to zero.  Then as you add things back, there's a much better
> sense of where the unpredictability might be coming from, and whether
> the unpredictability is coming from something which is fundamentally
> arising from something which is chaotic or quantum effect, or just
> because we don't have a good way of modelling the behavior of the
> L1/L2 cache (for example) and that is spoofing your entropy estimator.

I was now able to implement two more test buckets that were in my mind for 
quite some time. They are documented in the new sections 6.3 and 6.4 in [1].

The tests for the time variation measurements are now executed on bare metal, 
i.e. without *any* operating system underneath. For achieving that, I used the 
memtest86+ tool, ripped out the memory tests and added the time variation 
testing into it.

The time variation tests now execute single threaded without any interference 
from an underlying operating system. Again, I added all the variations of 
disabling CPU support (TLB flushes, L1/2 flushes, cache disabling, ...).

And, surprise: all the jitter is still there.

Furthermore, I use the same vehicle to measure just the variations, by 
obtaining two timestamps immediately after each other and calculating the 
difference. As before, there are various tests which disable the different CPU 
mechanisms.

And, surprise: there are still variations visible. Granted, these variations 
are smaller than the ones for the folding loop. But the smallest variations 
still have way more than 1 bit when applying the Shannon Entropy formula.
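
For reference, the Shannon entropy mentioned here is H = -sum(p_i *
log2(p_i)) over the observed values. A minimal sketch of computing it from
a histogram of time deltas follows. The function names are hypothetical and
this is not the analysis code from [2]; log2_approx() is a small stand-in
so the example does not depend on libm.

```c
#include <assert.h>
#include <stddef.h>

/* Base-2 logarithm for x > 0 without libm: normalize, then square. */
static double log2_approx(double x)
{
	double result = 0.0, frac = 0.5;
	int i;

	assert(x > 0.0);
	while (x >= 2.0) { x /= 2.0; result += 1.0; }
	while (x < 1.0)  { x *= 2.0; result -= 1.0; }
	for (i = 0; i < 40; i++) {	/* 40 fractional bits */
		x *= x;
		if (x >= 2.0) { x /= 2.0; result += frac; }
		frac /= 2.0;
	}
	return result;
}

/* Shannon entropy, in bits per sample, of a histogram of observed values. */
static double shannon_entropy(const unsigned long *counts, size_t n)
{
	unsigned long total = 0;
	double h = 0.0;
	size_t i;

	assert(counts || n == 0);
	for (i = 0; i < n; i++)
		total += counts[i];
	if (!total)
		return 0.0;

	for (i = 0; i < n; i++) {
		double p;

		if (!counts[i])
			continue;	/* 0 * log2(0) is taken as 0 */
		p = (double)counts[i] / (double)total;
		h -= p * log2_approx(p);
	}
	return h;
}
```

Note that this measures only the spread of the observed distribution; it
says nothing about whether successive samples are independent.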

The code is uploaded to [2] and can be used to play with.

In addition, I added test results with varying loads as explained in section 
6.2 (thanks to Nicholas Mc Guire for helping here).

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

[2] http://www.chronox.de/

Ciao
Stephan
-- 
| Cui bono? |


Re: [seqcount] INFO: trying to register non-static key.

2013-11-12 Thread John Stultz
On 11/12/2013 07:29 AM, Peter Zijlstra wrote:
> On Tue, Nov 12, 2013 at 10:15:41AM -0500, Vivek Goyal wrote:
>> I see that we allocate per cpu stats but don't do any initializations.
>>
>> static void tg_stats_alloc_fn(struct work_struct *work)
>> {
>> static struct tg_stats_cpu *stats_cpu;  /* this fn is non-reentrant 
>> */
>> struct delayed_work *dwork = to_delayed_work(work);
>> bool empty = false;
>>
>> alloc_stats:
>> if (!stats_cpu) {
>> stats_cpu = alloc_percpu(struct tg_stats_cpu);
>> if (!stats_cpu) {
>> /* allocation failed, try again after some time */
>> schedule_delayed_work(dwork, msecs_to_jiffies(10));
>> return;
>> }
>> }
>>
>> spin_lock_irq(&tg_stats_alloc_lock);
> Absolutely!
>
> Something like this perhaps? Did I miss more blkg_[rw]stats? If I read
> the git grep output right, this was the last one.

Hey Peter,
Thanks for chasing these issues down so fast! And sorry for my slow
response here (power outage for a chunk of the day :P) I finally got the
issue reproduced and gave your two patches a whirl, but unfortunately I
get the following bug:

[0.728658] BUG: unable to handle kernel NULL pointer dereference at
0008
[0.729351] IP: [<78287084>] lockdep_init_map+0xd/0x544
[0.729351] *pde = 
[0.729351] Oops: 0002 [#1] SMP
[0.729351] CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted
3.12.0-00185-g838cc7b-dirty #13
[0.729351] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[0.729351] Workqueue: events tg_stats_alloc_fn
[0.729351] task: 7e9e0230 ti: 7e9e6000 task.ti: 7e9e6000
[0.729351] EIP: 0060:[<78287084>] EFLAGS: 00010216 CPU: 0
[0.729351] EIP is at lockdep_init_map+0xd/0x544
[0.729351] EAX: 0004 EBX: 79d554b0 ECX: 79d554b0 EDX: 793f682a
[0.729351] ESI: 0004 EDI: 79d554b8 EBP: 7e9e7e90 ESP: 7e9e7e78
[0.729351]  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
[0.729351] CR0: 8005003b CR2: 0008 CR3: 01898000 CR4: 06d0
[0.729351] Stack:
[0.729351]  014a 014b 7e9b7340   79d554b8
7e9e7eac 785fae1d
[0.729351]   795a2a60 7e9b7340   7e9e7ef8
7826bae8 
[0.729351]  0001  7826ba8f 7e806c00 7f08d700 795a2a60
7f08c380 795a2a60
[0.729351] Call Trace:
[0.729351]  [<785fae1d>] tg_stats_alloc_fn+0xe3/0x155
...

>
> ---
>  block/blk-throttle.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 8331aba9426f..fd743d98c41d 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -256,6 +256,12 @@ static struct throtl_data *sq_to_td(struct 
> throtl_service_queue *sq)
>   }   \
>  } while (0)
>  
> +static void tg_stats_init(struct tg_stats_cpu *tg_stats)
> +{
> + blkg_rwstat_init(&tg_stats->service_bytes);
> + blkg_rwstat_init(&tg_stats->serviced);
> +}
> +
>  /*
>   * Worker for allocating per cpu stat for tgs. This is scheduled on the
>   * system_wq once there are some groups on the alloc_list waiting for
> @@ -269,12 +275,16 @@ static void tg_stats_alloc_fn(struct work_struct *work)
>  
>  alloc_stats:
>   if (!stats_cpu) {
> + int cpu;
> +
>   stats_cpu = alloc_percpu(struct tg_stats_cpu);
>   if (!stats_cpu) {
>   /* allocation failed, try again after some time */
>   schedule_delayed_work(dwork, msecs_to_jiffies(10));
>   return;
>   }
> + for_each_possible_cpu(cpu)
> + tg_stats_init(per_cpu(stats_cpu, cpu));
It looks like this line should be per_cpu_*ptr*

With that line changed I don't trigger either the BUG or the warning
Fengguang found (btw, great work again Fengguang!).
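
The distinction matters because per_cpu() takes a per-CPU variable, while
per_cpu_ptr() takes a pointer such as the one returned by alloc_percpu().
The userspace model below only illustrates the "allocate, then initialize
every CPU's copy through a pointer accessor" shape. Everything in it is a
hypothetical stand-in: NR_CPUS_MODEL, struct stats, model_per_cpu_ptr() and
a plain calloc()ed array instead of real per-CPU memory.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 4	/* assumed CPU count for the model */

struct stats {
	unsigned int seq;
	int initialized;
};

/* Models per_cpu_ptr(): allocation base in, pointer to one copy out. */
static struct stats *model_per_cpu_ptr(struct stats *base, int cpu)
{
	assert(base && cpu >= 0 && cpu < NR_CPUS_MODEL);
	return &base[cpu];
}

static void stats_init(struct stats *s)
{
	s->seq = 0;
	s->initialized = 1;
}

/* Allocate all copies, then initialize each one before first use. */
static struct stats *alloc_and_init(void)
{
	struct stats *base = calloc(NR_CPUS_MODEL, sizeof(*base));
	int cpu;

	if (!base)
		return NULL;
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		stats_init(model_per_cpu_ptr(base, cpu));
	return base;
}
```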

I'll send out a net patch here that includes all the fixes in a second.

thanks
-john





Re: [seqcount] INFO: trying to register non-static key.

2013-11-12 Thread Fengguang Wu
On Tue, Nov 12, 2013 at 04:29:56PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 12, 2013 at 10:15:41AM -0500, Vivek Goyal wrote:
> > I see that we allocate per cpu stats but don't do any initializations.
> > 
> > static void tg_stats_alloc_fn(struct work_struct *work)
> > {
> > static struct tg_stats_cpu *stats_cpu;  /* this fn is non-reentrant 
> > */
> > struct delayed_work *dwork = to_delayed_work(work);
> > bool empty = false;
> > 
> > alloc_stats:
> > if (!stats_cpu) {
> > stats_cpu = alloc_percpu(struct tg_stats_cpu);
> > if (!stats_cpu) {
> > /* allocation failed, try again after some time */
> > schedule_delayed_work(dwork, msecs_to_jiffies(10));
> > return;
> > }
> > }
> > 
> > spin_lock_irq(&tg_stats_alloc_lock);
> 
> Absolutely!
> 
> Something like this perhaps? Did I miss more blkg_[rw]stats? If I read
> the git grep output right, this was the last one.

It changed the error into another one:

/kernel/i386-randconfig-j7-11082318/9d0d532888f6e77970016c0c270ddececfab8f9e

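+---------------------------------------------------------+-------+--------------+--------------+--------------+
|                                                         | v3.12 | 838cc7b488f8 | d3516a7318f8 | 9d0d532888f6 |
+---------------------------------------------------------+-------+--------------+--------------+--------------+
| boot_successes                                          | 129   | 0            | 0            | 0            |
| boot_failures                                           | 1     | 100          | 100          | 100          |
| BUG:kernel_early_hang_without_any_printk_output         | 1     |              |              |              |
| INFO:trying_to_register_non-static_key                  | 0     | 100          | 100          |              |
| BUG:unable_to_handle_kernel_NULL_pointer_dereference_at | 0     | 0            | 0            | 98           |
| Oops:SMP                                                | 0     | 0            | 0            | 100          |
| BUG:unable_to_handle_kernel_paging_request_at           | 0     | 0            | 0            | 100          |
| BUG:unable_to_handle_kernel                             | 0     | 0            | 0            | 2            |
+---------------------------------------------------------+-------+--------------+--------------+--------------+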

Here are 3 of the call traces. Attached 1 full dmesg.

dmesg-quantal-ant-3:20131113183320:i386-randconfig-j7-11082318:3.12.0-00187-g9d0d532:1

[   52.578884] SCSI Media Changer driver v0.25 
[   52.781957] scsi_debug: host protection
[   52.783001] scsi0 : scsi_debug, version 1.82 [20100324], dev_size_mb=8, 
opts=0x0
[   52.846181] BUG: unable to handle kernel NULL pointer dereference at 0008
[   52.846293] IP: [<78287084>] lockdep_init_map+0xd/0x544
[   52.846293] *pde =  
[   52.846293] Oops: 0002 [#1] SMP 
[   52.846293] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 
3.12.0-00187-g9d0d532 #1
[   52.846293] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   52.846293] Workqueue: events tg_stats_alloc_fn
[   52.846293] task: 854f0070 ti: 854f6000 task.ti: 854f6000
[   52.846293] EIP: 0060:[<78287084>] EFLAGS: 0296 CPU: 0
[   52.846293] EIP is at lockdep_init_map+0xd/0x544
[   52.846293] EAX: 0004 EBX: 79d614b0 ECX: 79d614b0 EDX: 793fba74
[   52.846293] ESI: 0004 EDI: 79d614b8 EBP: 854f7e90 ESP: 854f7e78
[   52.846293]  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
[   52.846293] CR0: 8005003b CR2: 0008 CR3: 018a4000 CR4: 06d0
[   52.846293] DR0:  DR1:  DR2:  DR3: 
[   52.846293] DR6:  DR7: 
[   52.846293] Stack:
[   52.846293]  014a 014b 854bd740   79d614b8 854f7eac 
785fae1d
[   52.846293]   795aea60 854bd740   854f7ef8 7826bae8 

[   52.846293]  0001  7826ba8f 854a6a00 85bf0700 795aea60 85bef380 
795aea60
[   52.846293] Call Trace:
[   52.846293]  [<785fae1d>] tg_stats_alloc_fn+0xe3/0x155
[   52.846293]  [<7826bae8>] process_one_work+0x1ef/0x349
[   52.846293]  [<7826ba8f>] ? process_one_work+0x196/0x349
[   52.846293]  [<7826c039>] worker_thread+0x1a3/0x274
[   52.846293]  [<7826be96>] ? rescuer_thread+0x233/0x233
[   52.846293]  [<78270649>] kthread+0x8a/0x8f
[   52.846293]  [<7827>] ? SyS_timer_gettime+0x43/0x9e
[   52.846293]  [<78fbac7b>] ret_from_kernel_thread+0x1b/0x30
[   52.846293]  [<782705bf>] ? __kthread_parkme+0x50/0x50
[   52.846293] Code: f0 8b 7d e8 64 ff 0d 68 d7 74 79 89 70 10 89 78 14 e9 9b 
fe ff ff 8d 65 f4 5b 5e 5f 5d c3 55 89 e5 57 56 89 c6 53 89 cb 83 ec 0c  40 
04 00 00 00 00 c7 40 08 00 00 00 00 64 a1 10 d0 74 79 89
[   52.846293] EIP: [<78287084>] lockdep_init_map+0xd/0x544 SS:ESP 0068:854f7e78
[   52.846293] CR2: 0008
[   52.870492] ---[ 

Re: [PATCH 00/11] random: code cleanups

2013-11-12 Thread Theodore Ts'o
On Tue, Nov 12, 2013 at 05:40:09PM -0500, Greg Price wrote:
> 
> Beyond these easy cleanups, I have a couple of patches queued up (just
> written yesterday, not quite finished) to make /dev/urandom block at
> boot until it has enough entropy, as the "Mining your P's and Q's"
> paper recommended and people have occasionally discussed since then.
> Those patches were definitely for after 3.13 anyway, and I'll send
> them when they're ready.  I see some notifications and warnings in
> this direction in the random.git tree, which is great.

One of the things I've been thinking about with respect to making
/dev/urandom block is a configurable limit (via a module parameter
which could be specified on the boot command line) on how long
/dev/urandom will block. After that limit we would log a high priority
message that there was an attempt to read from /dev/urandom which
couldn't be satisfied, and then allow the /dev/urandom read to succeed.

The basic idea is that we don't want to break systems, but we do want
to gently coerce people to do the right thing.  Otherwise, I'm worried
that distros, or embedded/mobile/consumer electronics engineers would
just patch out the check.  If we make the default be something like
"block for 5 minutes", and then log a message, we won't completely
break a user who is trying to login to a VM, but it will be obvious,
both from the delay and from the kern.crit log message, that there is
a potential problem here that a system administrator needs to worry
about.
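
The policy proposed above boils down to a three-way decision, sketched here
as plain C. All names are hypothetical (including block_limit_s, which
models the proposed module parameter), and this is not code from the
random.git tree.

```c
#include <assert.h>
#include <stdbool.h>

enum urandom_action {
	URANDOM_READ,		/* pool is seeded: read normally */
	URANDOM_BLOCK,		/* not seeded, still within the limit */
	URANDOM_READ_AND_WARN	/* limit expired: log loudly, then read */
};

/* Hypothetical state; block_limit_s models the proposed parameter. */
struct urandom_state {
	bool seeded;
	unsigned int waited_s;
	unsigned int block_limit_s;
};

static enum urandom_action urandom_decide(const struct urandom_state *st)
{
	assert(st);
	if (st->seeded)
		return URANDOM_READ;
	if (st->waited_s < st->block_limit_s)
		return URANDOM_BLOCK;
	return URANDOM_READ_AND_WARN;
}
```

With block_limit_s set to 300 this models the "block for 5 minutes" default
mentioned above.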

- Ted




[PATCH] ACPI/AC: Remove the pointer of struct acpi_device in the struct acpi_ac

2013-11-12 Thread Lan Tianyu
The pointer to struct acpi_device can now be obtained with
ACPI_COMPANION(struct acpi_ac->pdev->dev), so the pointer stored in
struct acpi_ac is unnecessary. Remove it.

Signed-off-by: Lan Tianyu 
---
 drivers/acpi/ac.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c
index b9f0d5f..8711e37 100644
--- a/drivers/acpi/ac.c
+++ b/drivers/acpi/ac.c
@@ -56,7 +56,6 @@ static int ac_sleep_before_get_state_ms;
 
 struct acpi_ac {
struct power_supply charger;
-   struct acpi_device *adev;
struct platform_device *pdev;
unsigned long long state;
 };
@@ -70,8 +69,9 @@ struct acpi_ac {
 static int acpi_ac_get_state(struct acpi_ac *ac)
 {
acpi_status status;
+   acpi_handle handle = ACPI_HANDLE(&ac->pdev->dev);
 
-   status = acpi_evaluate_integer(ac->adev->handle, "_PSR", NULL,
+   status = acpi_evaluate_integer(handle, "_PSR", NULL,
   >state);
if (ACPI_FAILURE(status)) {
ACPI_EXCEPTION((AE_INFO, status,
@@ -119,6 +119,7 @@ static enum power_supply_property ac_props[] = {
 static void acpi_ac_notify_handler(acpi_handle handle, u32 event, void *data)
 {
struct acpi_ac *ac = data;
+   struct acpi_device *adev;
 
if (!ac)
return;
@@ -141,10 +142,11 @@ static void acpi_ac_notify_handler(acpi_handle handle, 
u32 event, void *data)
msleep(ac_sleep_before_get_state_ms);
 
acpi_ac_get_state(ac);
-   acpi_bus_generate_netlink_event(ac->adev->pnp.device_class,
+   adev = ACPI_COMPANION(&ac->pdev->dev);
+   acpi_bus_generate_netlink_event(adev->pnp.device_class,
dev_name(&ac->pdev->dev),
event, (u32) ac->state);
-   acpi_notifier_call_chain(ac->adev, event, (u32) ac->state);
+   acpi_notifier_call_chain(adev, event, (u32) ac->state);
kobject_uevent(&ac->charger.dev->kobj, KOBJ_CHANGE);
}
 
@@ -178,8 +180,8 @@ static int acpi_ac_probe(struct platform_device *pdev)
if (!pdev)
return -EINVAL;
 
-   result = acpi_bus_get_device(ACPI_HANDLE(&pdev->dev), &adev);
-   if (result)
+   adev = ACPI_COMPANION(&pdev->dev);
+   if (!adev)
return -ENODEV;
 
ac = kzalloc(sizeof(struct acpi_ac), GFP_KERNEL);
@@ -188,7 +190,6 @@ static int acpi_ac_probe(struct platform_device *pdev)
 
strcpy(acpi_device_name(adev), ACPI_AC_DEVICE_NAME);
strcpy(acpi_device_class(adev), ACPI_AC_CLASS);
-   ac->adev = adev;
ac->pdev = pdev;
platform_set_drvdata(pdev, ac);
 
-- 
1.8.2.1



Re: oom-kill && frozen()

2013-11-12 Thread Tejun Heo
Hello,

On Tue, Nov 12, 2013 at 05:56:43PM +0100, Oleg Nesterov wrote:
> On 11/12, Oleg Nesterov wrote:
> > I am also wondering if it makes any sense to turn PF_FROZEN into
> > TASK_FROZEN, something like (incomplete, probably racy) patch below.
> > Note that it actually adds the new state, not the qualifier.
> 
> As for the current usage of PF_FROZEN... David, it seems that
> oom_scan_process_thread()->__thaw_task() is dead? Probably this
> was fine before, when __thaw_task() cleared the "need to freeze"
> condition, iirc it was PF_FROZEN.
> 
> But today __thaw_task() can't help, no? the task will simply
> schedule() in D state again.

Yeah, it'll have to be actively excluded using e.g. PF_FREEZER_SKIP,
which, BTW, can usually only be manipulated by the task itself.  I've
been saying this multiple times but this is yet another cost of having
"frozen" as a separate completely alien task state, which is
essentially close to being undefined when viewed from userland.  We're
spreading broken behaviors and complexity throughout the kernel by
making this half broken state visible.  :(

Thanks.

-- 
tejun


Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

2013-11-12 Thread Stephan Mueller
On Sunday, 10 November 2013, 21:28:06, Clemens Ladisch wrote:

Hi Clemens,

> Stephan Mueller wrote:
> > On Sunday, 10 November 2013, 17:31:07, Clemens Ladisch wrote:
> >> In the case of CPUs, the jitter you observe in delta
> >> times results in part from the complexities of the inner state, and in
> >> part from real random noise.  The first part is deterministic and might
> >> be predicted by anyone who has enough knowledge about the CPU's
> >> internals.
> > 
> > Right, and that is why I tried to eliminate the CPU mechanisms that may be
> > having a deterministic impact. If I miss a mechanism or you have other
> > suggestions, please help me.
> 
> Many CPUs allow to disable branch prediction, but this is very vendor
> specific (try to find MSR documentation).  The biggest offender probably
> is the out-of-order execution engine, which cannot be disabled.

I was also digging around in this area. My research so far shows that the
branch prediction unit can be disabled only on ARM. Unfortunately, I do not
have an ARM system available where I can do that. I have my Samsung phone, but
that runs Android and I am not sure how to build a kernel module for it.

For x86, I have not found a way to disable the unit. Nonetheless, I tried to
reduce the effect by "warming up" the caches and the branch prediction
(see section 6.1.1 of the new version of the documentation). There I execute
the test 1000 times and use only the last result for further analysis.
> 
> >>> When you ask for testing of stuck values, what shall I really test for?
> >>> Shall I test adjacent measurements for the same or alternating values?
> >> 
> >> Same or alternating delta time values happen even on random CPUs.  You
> >> need a theory of how random and non-random CPUs work, and how this
> >> difference affects the delta times, before you can test for that.
> > 
> > Are you telling me that I should invent a formula and apply it?
> 
> I was not implying that the theory has nothing to do with the physical
> device.  It must correctly _describe_ the relevant physical processes.

Right, but currently I am not sure how I can find such a description. In
particular, I created a new round of testing which is even more interesting, as
the results do not allow the exact root cause to be pinpointed. More on that in
a separate email.

Do you have an idea?
> 
> >>> The test for the same values is caught with the Von-Neumann unbiaser.
> >> 
> >> No, the von Neumann unbiaser is run on the whitened bitstream, i.e.,
> >> _after_ the folding operation.
> > 
> > The folding is whitened? How do you reach that conclusion? Yes, the
> > folding is my (very simple) post-processing. But I am not calling it
> > whitened as all statistical problems the underlying variations have
> > *will* be still visible in the folded value.
> 
> If you don't want to call it "whitening", call it "randomness extraction"
> instead.  But its input is a series of delta times like this:
>   01010011
>   10011010
>   01011011
>   01100100
>   10111000
> and the purpose of the folding is to remove these zero patterns.

Not quite. Let me explain the motive behind the folding loop. To maintain 
entropy mathematically, there are only a few operations allowed. One of them 
is a bijective operation which implies that two strings combined with a 
bijective operation will form a new string which contains the maximum entropy 
of the initial strings. XOR is a bijective operation.

Hence, if you have a string with 10 bits that holds 5 bits of entropy and XOR 
it with a 20 bit string that holds 2 bits of entropy, you receive a string 
that is 20 bits in length and holds 5 bits of entropy.

In any case, with the bijective operation, it is not possible to lose entropy
even when you use the bijective operation to add fully deterministic values to
a bit stream that is believed to have entropy.

That said, the folding loop uses that line of thought. The loop XORs all bits
of the value together to receive one bit at the end. The idea is to collapse
the initial bit stream while still retaining the entropy present in each
individual bit. The goal is that the resulting bit will hold one bit of
entropy by "collecting" the combined entropy found in the individual bits.
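
To make the folding idea concrete, here is a minimal sketch in C. It is not
the actual CPU Jitter RNG code: the names fold_delta() and collect_bits() and
the fixed 64-bit width are assumptions made purely for illustration. It only
shows the mechanics of XORing all bits of one delta into a single bit and
makes no statement about how much entropy the result holds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a 64-bit time delta into a single bit by XORing all of its bits
 * together, i.e. by computing its parity.
 * Hypothetical helper for illustration only.
 */
unsigned int fold_delta(uint64_t delta)
{
	unsigned int folded = 0;
	unsigned int i;

	for (i = 0; i < 64; i++)
		folded ^= (unsigned int)((delta >> i) & 1);

	return folded;
}

/*
 * Concatenate the folded bits of the first 'nbits' deltas into one word,
 * first delta in the most significant position of the result.
 * At most 64 bits fit into the return value.
 */
uint64_t collect_bits(const uint64_t *deltas, unsigned int nbits)
{
	uint64_t out = 0;
	unsigned int i;

	assert(nbits <= 64);

	for (i = 0; i < nbits; i++)
		out = (out << 1) | fold_delta(deltas[i]);

	return out;
}
```

Applied to the five example values quoted above (01010011, 10011010,
01011011, 01100100, 10111000), fold_delta() yields the bits 0, 0, 1, 1, 0.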

That folding operation will lose entropy if the overall entropy in the
folded bit stream is more than one bit. But that is to our advantage,
because it provides a safety margin if the value to be folded really holds
more than one bit of entropy. All my measurements in section 5.1 and appendix
F just try to show that on every CPU there is always more than one bit of
entropy.

There is a catch, however: what happens if each individual bit of the bit
stream holds less than one bit? I.e. the entire bit stream may hold more than
one bit, but when chopping the bit stream, none of the individual bits

Linux 3.11.8

2013-11-12 Thread Greg KH
I'm announcing the release of the 3.11.8 kernel.

All users of the 3.11 kernel series must upgrade.

The updated 3.11.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.11.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile |2 
 arch/arc/mm/fault.c  |6 
 arch/parisc/kernel/head.S|4 
 arch/um/kernel/exitcode.c|4 
 arch/x86/kernel/apic/x2apic_uv_x.c   |2 
 arch/xtensa/kernel/signal.c  |2 
 drivers/ata/libata-eh.c  |6 
 drivers/clk/clk-nomadik.c|   21 ++
 drivers/clk/versatile/clk-icst.c |2 
 drivers/cpufreq/intel_pstate.c   |7 
 drivers/cpufreq/s3c64xx-cpufreq.c|2 
 drivers/gpu/drm/drm_drv.c|9 +
 drivers/gpu/drm/i915/intel_crt.c |   30 +++
 drivers/gpu/drm/i915/intel_ddi.c |   21 ++
 drivers/gpu/drm/i915/intel_display.c |  131 ++--
 drivers/gpu/drm/i915/intel_dp.c  |  132 +---
 drivers/gpu/drm/i915/intel_drv.h |2 
 drivers/gpu/drm/i915/intel_lvds.c|   16 ++
 drivers/gpu/drm/radeon/atombios_encoders.c   |2 
 drivers/gpu/drm/radeon/ni.c  |1 
 drivers/gpu/drm/radeon/r600.c|1 
 drivers/gpu/drm/radeon/si.c  |1 
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  |   17 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |2 
 drivers/hid/hid-core.c   |1 
 drivers/hid/hid-ids.h|1 
 drivers/hid/hid-wiimote-core.c   |5 
 drivers/md/bcache/request.c  |2 
 drivers/md/md.c  |5 
 drivers/md/raid1.c   |1 
 drivers/md/raid10.c  |1 
 drivers/md/raid5.c   |   20 ++
 drivers/net/can/at91_can.c   |4 
 drivers/net/can/flexcan.c|   14 +
 drivers/net/wireless/ath/ath9k/main.c|   23 +-
 drivers/net/wireless/iwlwifi/iwl-6000.c  |6 
 drivers/net/wireless/iwlwifi/iwl-config.h|1 
 drivers/net/wireless/iwlwifi/mvm/scan.c  |   12 +
 drivers/net/wireless/iwlwifi/pcie/drv.c  |   10 +
 drivers/net/wireless/mwifiex/main.c  |6 
 drivers/net/wireless/rt2x00/rt2x00pci.c  |9 -
 drivers/net/wireless/rtlwifi/rtl8192cu/trx.c |3 
 drivers/ntb/ntb_hw.c |   41 -
 drivers/ntb/ntb_hw.h |   16 ++
 drivers/ntb/ntb_regs.h   |4 
 drivers/ntb/ntb_transport.c  |   17 --
 drivers/scsi/BusLogic.c  |   16 +-
 drivers/scsi/aacraid/linit.c |2 
 drivers/scsi/sd.c|2 
 drivers/staging/bcm/Bcmchar.c|1 
 drivers/staging/ozwpan/ozcdev.c  |3 
 drivers/staging/sb105x/sb_pci_mp.c   |2 
 drivers/staging/wlags49_h2/wl_priv.c |9 -
 drivers/target/target_core_pscsi.c   |8 -
 drivers/uio/uio.c|   41 +++--
 drivers/usb/core/quirks.c|6 
 drivers/usb/host/xhci-hub.c  |   26 ---
 drivers/usb/musb/musb_core.c |   46 +
 drivers/usb/musb/musb_core.h |1 
 drivers/usb/musb/musb_gadget.c   |2 
 drivers/usb/musb/musb_virthub.c  |   46 -
 drivers/usb/serial/ftdi_sio.c|1 
 drivers/usb/serial/ftdi_sio_ids.h|6 
 drivers/usb/serial/option.c  |  216 +++
 drivers/usb/storage/scsiglue.c   |5 
 drivers/usb/storage/unusual_devs.h   |7 
 drivers/vhost/scsi.c |2 
 drivers/video/au1100fb.c |   26 ---
 drivers/video/au1200fb.c |   23 --
 fs/cifs/cifsfs.c |6 
 fs/ecryptfs/crypto.c |2 
 fs/ecryptfs/keystore.c   |3 
 fs/eventpoll.c   |4 
 fs/jfs/jfs_inode.c   |3 
 fs/proc/task_mmu.c   |4 
 fs/select.c  |3 
 fs/seq_file.c|2 
 include/linux/usb_usual.h|4 
 include/trace/events/target.h|4 
 include/uapi/drm/drm_mode.h  |2 
 kernel/cgroup.c  |6 
 kernel/mutex.c   |   32 ++--
 kernel/time/clockevents.c

[PATCH] sched: fix the endless sync_sched/rcu() inside _cpu_down()

2013-11-12 Thread Michael wang

Commit 6acce3ef8:

sched: Remove get_online_cpus() usage

tries to do sync_sched/rcu() inside _cpu_down() but triggers:

INFO: task swapper/0:1 blocked for more than 120 seconds.
...
[] synchronize_rcu+0x2c/0x30
[] _cpu_down+0x2b2/0x340
...

This happens because, in the rcu boost case, we rely on the smpboot thread to
finish the rcu callback, but that thread has already been parked before the
sync here, which leads to the endless sync_sched/rcu().

This patch swaps the order of smpboot_park_threads() and
sync_sched/rcu() to fix the BUG.

Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Reported-by: Fengguang Wu 
Tested-by: Fengguang Wu 
Signed-off-by: Michael Wang 
---
 kernel/cpu.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 63aa50d..2227b58 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -306,7 +306,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
__func__, cpu);
goto out_release;
}
-   smpboot_park_threads(cpu);
 
/*
 * By now we've cleared cpu_active_mask, wait for all preempt-disabled
@@ -315,12 +314,16 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 *
 * For CONFIG_PREEMPT we have preemptible RCU and its sync_rcu() might
 * not imply sync_sched(), so explicitly call both.
+*
+* Do sync before park smpboot threads to take care the rcu boost case.
 */
 #ifdef CONFIG_PREEMPT
synchronize_sched();
 #endif
synchronize_rcu();
 
+   smpboot_park_threads(cpu);
+
/*
 * So now all preempt/rcu users must observe !cpu_active().
 */
-- 
1.7.9.5



Linux 3.10.19

2013-11-12 Thread Greg KH
I'm announcing the release of the 3.10.19 kernel.

All users of the 3.10 kernel series must upgrade.

The updated 3.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.10.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile |2 
 arch/arc/mm/fault.c  |6 
 arch/parisc/kernel/head.S|4 
 arch/um/kernel/exitcode.c|4 
 arch/x86/kernel/apic/x2apic_uv_x.c   |2 
 arch/xtensa/kernel/signal.c  |2 
 drivers/ata/libata-eh.c  |6 
 drivers/clk/versatile/clk-icst.c |2 
 drivers/cpufreq/intel_pstate.c   |7 
 drivers/gpu/drm/drm_drv.c|9 +
 drivers/gpu/drm/radeon/atombios_encoders.c   |2 
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  |   17 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |2 
 drivers/md/bcache/request.c  |2 
 drivers/md/md.c  |5 
 drivers/md/raid1.c   |1 
 drivers/md/raid10.c  |1 
 drivers/md/raid5.c   |   20 ++
 drivers/net/can/at91_can.c   |4 
 drivers/net/can/flexcan.c|   14 +
 drivers/net/wireless/ath/ath9k/main.c|   23 +-
 drivers/net/wireless/iwlwifi/iwl-6000.c  |6 
 drivers/net/wireless/iwlwifi/iwl-config.h|1 
 drivers/net/wireless/iwlwifi/pcie/drv.c  |   10 +
 drivers/net/wireless/mwifiex/main.c  |6 
 drivers/net/wireless/rtlwifi/rtl8192cu/trx.c |3 
 drivers/ntb/ntb_hw.c |   41 -
 drivers/ntb/ntb_hw.h |   16 ++
 drivers/ntb/ntb_regs.h   |4 
 drivers/ntb/ntb_transport.c  |   17 --
 drivers/scsi/aacraid/linit.c |2 
 drivers/scsi/sd.c|2 
 drivers/staging/bcm/Bcmchar.c|1 
 drivers/staging/ozwpan/ozcdev.c  |3 
 drivers/staging/sb105x/sb_pci_mp.c   |2 
 drivers/staging/wlags49_h2/wl_priv.c |9 -
 drivers/target/target_core_pscsi.c   |8 -
 drivers/uio/uio.c|   41 +++--
 drivers/usb/core/quirks.c|6 
 drivers/usb/serial/ftdi_sio.c|1 
 drivers/usb/serial/ftdi_sio_ids.h|6 
 drivers/usb/serial/option.c  |  216 +++
 drivers/usb/storage/scsiglue.c   |5 
 drivers/usb/storage/unusual_devs.h   |7 
 drivers/vhost/scsi.c |2 
 drivers/video/au1100fb.c |   28 ---
 drivers/video/au1200fb.c |   27 ---
 fs/ecryptfs/keystore.c   |3 
 fs/jfs/jfs_inode.c   |3 
 fs/seq_file.c|2 
 include/linux/usb_usual.h|4 
 include/uapi/drm/drm_mode.h  |2 
 kernel/cgroup.c  |6 
 kernel/time/clockevents.c|   65 ++--
 lib/scatterlist.c|3 
 mm/huge_memory.c |   70 ++--
 mm/memory.c  |   54 ++
 mm/migrate.c |   19 +-
 mm/mprotect.c|2 
 mm/pagewalk.c|2 
 mm/vmalloc.c |6 
 net/mac80211/cfg.c   |2 
 net/mac80211/ieee80211_i.h   |3 
 net/mac80211/rx.c|3 
 net/mac80211/scan.c  |   19 ++
 net/mac80211/status.c|3 
 net/mac80211/tx.c|3 
 net/mac80211/util.c  |4 
 net/wireless/ibss.c  |3 
 scripts/kallsyms.c   |   12 +
 scripts/link-vmlinux.sh  |2 
 sound/core/pcm.c |4 
 sound/pci/hda/hda_codec.c|4 
 sound/pci/hda/hda_generic.c  |4 
 sound/pci/hda/patch_realtek.c|1 
 sound/soc/codecs/wm_hubs.c   |1 
 sound/soc/soc-dapm.c |2 
 77 files changed, 680 insertions(+), 236 deletions(-)

Aaron Lu (1):
  SCSI: sd: call blk_pm_runtime_init before add_disk

Al Viro (2):
  au1100fb: VM_IO is set by io_remap_pfn_range()
  au1200fb: io_remap_pfn_range() sets VM_IO

Alex Deucher (1):
  drm/radeon/atom: workaround vbios bug in transmitter table on rs780


Linux 3.4.69

2013-11-12 Thread Greg KH
I'm announcing the release of the 3.4.69 kernel.

All users of the 3.4 kernel series must upgrade.

The updated 3.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.4.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile |2 
 arch/parisc/kernel/head.S|4 
 arch/um/kernel/exitcode.c|4 
 arch/xtensa/kernel/signal.c  |2 
 drivers/ata/libata-eh.c  |6 
 drivers/gpu/drm/drm_drv.c|9 +
 drivers/gpu/drm/radeon/atombios_encoders.c   |2 
 drivers/md/raid1.c   |1 
 drivers/md/raid10.c  |1 
 drivers/net/can/flexcan.c|   10 +
 drivers/net/wireless/rtlwifi/rtl8192cu/trx.c |3 
 drivers/scsi/aacraid/linit.c |2 
 drivers/staging/bcm/Bcmchar.c|1 
 drivers/staging/ozwpan/ozcdev.c  |3 
 drivers/usb/core/quirks.c|6 
 drivers/usb/serial/ftdi_sio.c|1 
 drivers/usb/serial/ftdi_sio_ids.h|6 
 drivers/usb/serial/option.c  |  216 +++
 fs/jfs/jfs_inode.c   |3 
 kernel/time/clockevents.c|   65 ++--
 lib/scatterlist.c|3 
 mm/swap.c|   31 +++
 net/mac80211/ieee80211_i.h   |3 
 net/mac80211/scan.c  |   19 ++
 net/mac80211/status.c|3 
 sound/core/pcm.c |4 
 sound/pci/hda/patch_realtek.c|1 
 sound/soc/codecs/wm_hubs.c   |1 
 sound/soc/soc-dapm.c |2 
 29 files changed, 382 insertions(+), 32 deletions(-)

Alex Deucher (1):
  drm/radeon/atom: workaround vbios bug in transmitter table on rs780

Baruch Siach (1):
  xtensa: don't use alternate signal stack on threads

Chris Wilson (1):
  drm: Prevent overwriting from userspace underallocating core ioctl structs

Dan Carpenter (4):
  staging: ozwpan: prevent overflow in oz_cdev_write()
  Staging: bcm: info leak in ioctl
  uml: check length in exitcode_proc_write()
  aacraid: missing capable() check in compat ioctl

Dave Kleikamp (1):
  jfs: fix error path in ialloc

Emmanuel Grumbach (1):
  mac80211: correctly close cancelled scans

Fangxiaozhi (Franko) (1):
  USB: support new huawei devices in option.c

Felix Fietkau (1):
  mac80211: update sta->last_rx on acked tx frames

Greg Kroah-Hartman (1):
  Linux 3.4.69

Gwendal Grignou (1):
  libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures

Helge Deller (1):
  parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM

Khalid Aziz (1):
  mm: fix aio performance regression for database caused by THP

Lukasz Dorau (1):
  md: Fix skipping recovery for read-only arrays.

Marc Kleine-Budde (1):
  can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and 
abort pending TX

Mark Cave-Ayland (1):
  rtlwifi: rtl8192cu: Fix error in pointer arithmetic

Ming Lei (1):
  lib/scatterlist.c: don't flush_kernel_dcache_page on slab page

Oliver Neukum (2):
  USB: quirks.c: add one device that cannot deal with suspension
  USB: quirks: add touchscreen that is dazzeled by remote wakeup

Russell King (1):
  ALSA: fix oops in snd_pcm_info() caused by ASoC DPCM

Takashi Iwai (3):
  ALSA: hda - Add a fixup for ASUS N76VZ
  ASoC: wm_hubs: Add missing break in hp_supply_event()
  ASoC: dapm: Fix source list debugfs outputs

Thomas Gleixner (1):
  clockevents: Sanitize ticks to nsec conversion

Алексей Крамаренко (1):
  USB: serial: ftdi_sio: add id for Z3X Box device





Re: Linux 3.4.69

2013-11-12 Thread Greg KH

diff --git a/Makefile b/Makefile
index 656c45f09128..2f9a0467ede5 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 4
-SUBLEVEL = 68
+SUBLEVEL = 69
 EXTRAVERSION =
 NAME = Saber-toothed Squirrel
 
diff --git a/arch/parisc/kernel/head.S b/arch/parisc/kernel/head.S
index 37aabd772fbb..d2d58258aea6 100644
--- a/arch/parisc/kernel/head.S
+++ b/arch/parisc/kernel/head.S
@@ -195,6 +195,8 @@ common_stext:
ldw MEM_PDC_HI(%r0),%r6
depd%r6, 31, 32, %r3/* move to upper word */
 
+   mfctl   %cr30,%r6   /* PCX-W2 firmware bug */
+
ldo PDC_PSW(%r0),%arg0  /* 21 */
ldo PDC_PSW_SET_DEFAULTS(%r0),%arg1 /* 2 */
ldo PDC_PSW_WIDE_BIT(%r0),%arg2 /* 2 */
@@ -203,6 +205,8 @@ common_stext:
copy%r0,%arg3
 
 stext_pdc_ret:
+   mtctl   %r6,%cr30   /* restore task thread info */
+
/* restore rfi target address*/
ldd TI_TASK-THREAD_SZ_ALGN(%sp), %r10
tophys_r1   %r10
diff --git a/arch/um/kernel/exitcode.c b/arch/um/kernel/exitcode.c
index 829df49dee99..41ebbfebb333 100644
--- a/arch/um/kernel/exitcode.c
+++ b/arch/um/kernel/exitcode.c
@@ -40,9 +40,11 @@ static ssize_t exitcode_proc_write(struct file *file,
const char __user *buffer, size_t count, loff_t *pos)
 {
char *end, buf[sizeof("n\0")];
+   size_t size;
int tmp;
 
-   if (copy_from_user(buf, buffer, count))
+   size = min(count, sizeof(buf));
+   if (copy_from_user(buf, buffer, size))
return -EFAULT;
 
tmp = simple_strtol(buf, &end, 0);
diff --git a/arch/xtensa/kernel/signal.c b/arch/xtensa/kernel/signal.c
index d78869a00b11..b08caaa59813 100644
--- a/arch/xtensa/kernel/signal.c
+++ b/arch/xtensa/kernel/signal.c
@@ -343,7 +343,7 @@ static int setup_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 
sp = regs->areg[1];
 
-   if ((ka->sa.sa_flags & SA_ONSTACK) != 0 && ! on_sig_stack(sp)) {
+   if ((ka->sa.sa_flags & SA_ONSTACK) != 0 && sas_ss_flags(sp) == 0) {
sp = current->sas_ss_sp + current->sas_ss_size;
}
 
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index e47c224d7c28..37fb4d6069a2 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1287,14 +1287,14 @@ void ata_eh_qc_complete(struct ata_queued_cmd *qc)
  * should be retried.  To be used from EH.
  *
  * SCSI midlayer limits the number of retries to scmd->allowed.
- * scmd->retries is decremented for commands which get retried
+ * scmd->allowed is incremented for commands which get retried
  * due to unrelated failures (qc->err_mask is zero).
  */
 void ata_eh_qc_retry(struct ata_queued_cmd *qc)
 {
struct scsi_cmnd *scmd = qc->scsicmd;
-   if (!qc->err_mask && scmd->retries)
-   scmd->retries--;
+   if (!qc->err_mask)
+   scmd->allowed++;
__ata_eh_qc_complete(qc);
 }
 
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 6116e3b75393..e9f1ef5d9340 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -420,9 +420,16 @@ long drm_ioctl(struct file *filp,
asize = drv_size;
}
else if ((nr >= DRM_COMMAND_END) || (nr < DRM_COMMAND_BASE)) {
+   u32 drv_size;
+
ioctl = &drm_ioctls[nr];
-   cmd = ioctl->cmd;
+
+   drv_size = _IOC_SIZE(ioctl->cmd);
usize = asize = _IOC_SIZE(cmd);
+   if (drv_size > asize)
+   asize = drv_size;
+
+   cmd = ioctl->cmd;
} else
goto err_i1;
 
diff --git a/drivers/gpu/drm/radeon/atombios_encoders.c b/drivers/gpu/drm/radeon/atombios_encoders.c
index 2f755e2aeb86..6f4627fe24a1 100644
--- a/drivers/gpu/drm/radeon/atombios_encoders.c
+++ b/drivers/gpu/drm/radeon/atombios_encoders.c
@@ -1430,7 +1430,7 @@ radeon_atom_encoder_dpms_dig(struct drm_encoder *encoder, int mode)
 * does the same thing and more.
 */
if ((rdev->family != CHIP_RV710) && (rdev->family != CHIP_RV730) &&
-   (rdev->family != CHIP_RS880))
+   (rdev->family != CHIP_RS780) && (rdev->family != CHIP_RS880))
atombios_dig_transmitter_setup(encoder, ATOM_TRANSMITTER_ACTION_ENABLE_OUTPUT, 0, 0);
}
if (ENCODER_MODE_IS_DP(atombios_get_encoder_mode(encoder)) && connector) {
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ce5f0449e1b6..75e66c612505 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1357,6 +1357,7 @@ static int raid1_spare_active(struct mddev *mddev)
}
}
if (rdev
+   && 

Re: [tip:x86/urgent] x86/microcode/amd: Tone down printk(), don't treat a missing firmware file as an error

2013-11-12 Thread Ingo Molnar

* H. Peter Anvin  wrote:

> On 11/12/2013 03:00 PM, Ingo Molnar wrote:
> > 
> > Indeed that is the documented alias, although sta...@kernel.org works 
> > as well and is used frequently:
> > 
> 
> No, it doesn't; it has bounced for the past two years.

'works' as in the patches get picked up into -stable by Greg. :-)

Thanks,

Ingo


Re: [PATCH 1/1] net: sctp: bug fixing when sctp path recovers

2013-11-12 Thread Chang


On 11/13/2013 03:37 AM, Vlad Yasevich wrote:

On 11/12/2013 08:34 PM, Chang Xiangzhong wrote:
Look for the __two__ most recently used path/transport and set to active_path
and retran_path respectively

Signed-off-by: changxiangzh...@gmail.com
---
  net/sctp/associola.c |4 
  1 file changed, 4 insertions(+)

diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index ab67efc..070011a 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -913,11 +913,15 @@ void sctp_assoc_control_transport(struct sctp_association *asoc,

  if (!first || t->last_time_heard > first->last_time_heard) {
  second = first;
  first = t;
+continue;
  }
  if (!second || t->last_time_heard > second->last_time_heard)
  second = t;


You might as well remove this bit and then you don't need a continue.

I don't think we could remove this bit. My understanding of this
algorithm is that it finds the most recently used path and the second most
recently used one, assigning them to active_path and retran_path respectively.
If we remove the looking-for-second block, how are we supposed to find the
second? I think we can remove the continue and use else-if in the
second assignment block.
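
The else-if variant discussed here could look roughly like the sketch below.
It is a stand-alone illustration, not the code in net/sctp/associola.c:
struct transport and pick_two_most_recent() are hypothetical names, the
timestamps are compared as plain integers, and the transport state checks and
the primary-path handling of the real function are left out. Without the else
(or the continue from the patch), the transport just stored in first would
also be stored in second on the same iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sctp_transport. */
struct transport {
	unsigned long last_time_heard;
};

/*
 * Pick the most recently heard transport ('first') and the runner-up
 * ('second') in a single pass. If there is only one transport, 'second'
 * falls back to 'first'. With no transports, both are set to NULL.
 */
void pick_two_most_recent(struct transport *t, size_t n,
			  struct transport **first,
			  struct transport **second)
{
	size_t i;

	assert(first && second);

	*first = NULL;
	*second = NULL;

	for (i = 0; i < n; i++) {
		if (!*first ||
		    t[i].last_time_heard > (*first)->last_time_heard) {
			/* New leader: the old leader becomes the runner-up. */
			*second = *first;
			*first = &t[i];
		} else if (!*second ||
			   t[i].last_time_heard > (*second)->last_time_heard) {
			/* Not the leader, but newer than the runner-up. */
			*second = &t[i];
		}
	}

	/* Only one candidate was found: use it for both roles. */
	if (!*second)
		*second = *first;
}
```

In this sketch the fallback for a missing second transport sits directly
after the loop; where exactly it belongs in the real function is the point
raised further down in this mail.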



  }

+if (!second)
+second = first;
+


This needs to move down 1 more block.  Set the second transport after we
check to see if the primary is back up and we need to go back to using 
it.


-vlad


I agree with this change

  /* RFC 2960 6.4 Multi-Homed SCTP Endpoints
   *
   * By default, an endpoint should always transmit to the







Re: [PATCH V3 06/11] perf record: Add an option to force per-cpu mmaps

2013-11-12 Thread Sukadev Bhattiprolu
Ingo Molnar [mi...@kernel.org] wrote:
| 
| * Peter Zijlstra  wrote:
| 
| > On Tue, Nov 05, 2013 at 02:31:52PM -0300, Arnaldo Carvalho de Melo wrote:
| > > PeterZ,
| > > 
| > >   Can I have your Acked-by for this one? I guess now the goal is
| > > achieved, no?

Being able to profile children with the --pid is a big plus.

| > 
| > So this option allows -t/-p/-u to create one buffer per cpu and attach
| > all the various thread/process/user tasks' their counters to that one
| > buffer?
| > 
| > As opposed to the current state where each such counter would have its
| > own buffer.
| > 
| > If this is what the patch does, then yes, although I would prefer a
| > slightly clearer Changelog.
| > 
| > Acked-by: Peter Zijlstra 
| 
| Is there any reason why we wouldn't want to make this the default 
| behavior?
| 
| That way we could also lose the somewhat suboptimal 'force' naming: 
| there's nothing forced really, we simply switch to another ring-buffer 
| setup ...

It would be also good if the man page added a comment on when a user
would want one ring-buffer setup over the other.

If the main benefit is to have the children profiled when tasks are 
specified, how about changing the option to --inherit (-I) ?

Or for consistency with 'perf record ', have --pid profile 
children by default and let users specify --no-inherit with --pid if 
they don't want children profiled.

Sukadev



[git pull] device mapper changes for 3.13

2013-11-12 Thread Mike Snitzer
The following changes since commit 61e6cfa80de5760bbe406f4e815b7739205754d2:

  Linux 3.12-rc5 (2013-10-13 15:41:28 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git tags/dm-3.13-changes

for you to fetch changes up to 7b6b2bc98c0303b7f043ad5b35906f833e56308d:

  dm cache: resolve small nits and improve Documentation (2013-11-12 13:11:09 -0500)

Please pull, thanks.
Mike


A set of device-mapper changes for 3.13.

Improve reliability of buffer allocations for dm messages with a small
number of arguments, a couple path group initialization fixes for dm
multipath, a fix for resizing a dm array, various fixes and
optimizations for dm cache, a fix for device mapper's Kconfig menu
indentation.

Features added include:
- dm crypt support for activating legacy CBC TrueCrypt containers
  (useful for forensics of these old TCRYPT containers)
- reduced dm-cache memory requirements for each block in the cache
- basic support for shrinking a dm-cache's cache (fast) device
- most notably, dm-cache support for managing cache coherency when
  deploying dm-cache with sophisticated origin volumes (that support
  hardware snapshots and/or clustering): these changes come in the form
  of a new passthrough operation mode and a cache block invalidation
  interface.


Hannes Reinecke (1):
  dm mpath: requeue I/O during pg_init

Heinz Mauelshagen (3):
  dm cache: use cell_defer() boolean argument consistently
  dm cache: log error message if dm_kcopyd_copy() fails
  dm cache: optimize commit_if_needed

Joe Thornber (18):
  dm array: fix bug in growing array
  dm cache policy mq: protect residency method with existing mutex
  dm cache: io destined for the cache device can now serve as tick bios
  dm cache: fix a race condition between queuing new migrations and 
quiescing for a shutdown
  dm cache: improve efficiency of quiescing flag management
  dm cache policy: remove return from void policy_remove_mapping
  dm cache policy mq: a few small fixes
  dm cache metadata: return bool from __superblock_all_zeroes
  dm space map disk: optimise sm_disk_dec_block
  dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty()
  dm cache: be much more aggressive about promoting writes to discarded 
blocks
  dm cache: promotion optimisation for writes
  dm cache: cache shrinking support
  dm cache: add passthrough mode
  dm cache metadata: check the metadata version when reading the superblock
  dm cache policy mq: reduce memory requirements
  dm cache: add remove_cblock method to policy interface
  dm cache: add cache block invalidation support

Mike Snitzer (3):
  dm table: print error on preresume failure
  MAINTAINERS: add reference to device-mapper's linux-dm.git tree
  dm cache: resolve small nits and improve Documentation

Mikulas Patocka (4):
  dm: allocate buffer for messages with small number of arguments using 
GFP_NOIO
  dm cache: return -EINVAL if the user specifies unknown cache policy
  dm: allow remove to be deferred
  dm: fix Kconfig menu indentation

Milan Broz (2):
  dm crypt: properly handle extra key string in initialization
  dm crypt: add TCW IV mode for old CBC TCRYPT containers

Shiva Krishna Merla (1):
  dm mpath: fix race condition between multipath_dtr and pg_init_done

 Documentation/device-mapper/cache-policies.txt |   6 +-
 Documentation/device-mapper/cache.txt  |  57 +-
 Documentation/device-mapper/dm-crypt.txt   |  11 +-
 MAINTAINERS|   1 +
 drivers/md/Kconfig |  22 +-
 drivers/md/dm-cache-metadata.c | 104 +++-
 drivers/md/dm-cache-metadata.h |   5 +
 drivers/md/dm-cache-policy-internal.h  |   7 +-
 drivers/md/dm-cache-policy-mq.c| 681 ++--
 drivers/md/dm-cache-policy.c   |   4 +-
 drivers/md/dm-cache-policy.h   |  21 +-
 drivers/md/dm-cache-target.c   | 687 +
 drivers/md/dm-crypt.c  | 214 +++-
 drivers/md/dm-ioctl.c  |  36 +-
 drivers/md/dm-mpath.c  |  34 +-
 drivers/md/dm-table.c  |  23 +-
 drivers/md/dm.c|  47 +-
 drivers/md/dm.h|  13 +-
 drivers/md/persistent-data/dm-array.c  |   5 +-
 drivers/md/persistent-data/dm-space-map-disk.c |  18 +-
 include/uapi/linux/dm-ioctl.h  |  15 +-
 21 files changed, 1544 insertions(+), 467 deletions(-)


Re: [raid5] kernel BUG at drivers/md/raid5.c:701!

2013-11-12 Thread NeilBrown
On Wed, 13 Nov 2013 08:28:51 +0800 Shaohua Li  wrote:

> On Tue, Nov 12, 2013 at 11:55:56AM +1100, NeilBrown wrote:
> > On Mon, 11 Nov 2013 22:47:57 +0800 fengguang...@intel.com wrote:
> > 
> > > 28cc2127527dcba2a0817afa8fd5a33c9e023090 is the first bad commit
> > > commit 28cc2127527dcba2a0817afa8fd5a33c9e023090
> > > Author: Shaohua Li 
> > > Date:   Tue Sep 10 15:37:56 2013 +0800
> > > 
> > > raid5: relieve lock contention in get_active_stripe()
> > > 
> > > get_active_stripe() is the last place we have lock contention. It has two
> > > paths. One is stripe isn't found and new stripe is allocated, the other is
> > > stripe is found.
> > > 
> > > The first path basically calls __find_stripe and init_stripe. It accesses
> > > conf->generation, conf->previous_raid_disks, conf->raid_disks,
> > > conf->prev_chunk_sectors, conf->chunk_sectors, conf->max_degraded,
> > > conf->prev_algo, conf->algorithm, the stripe_hashtbl and inactive_list. Except
> > > stripe_hashtbl and inactive_list, other fields are changed very rarely.
> > > 
> > > With this patch, we split inactive_list and add new hash locks. Each free
> > > stripe belongs to a specific inactive list. Which inactive list is determined
> > > by stripe's lock_hash. Note, even a stripe hasn't a sector assigned, it has a
> > > lock_hash assigned. Stripe's inactive list is protected by a hash lock, which
> > > is determined by it's lock_hash too. The lock_hash is derivied from current
> > > stripe_hashtbl hash, which guarantees any stripe_hashtbl list will be assigned
> > > to a specific lock_hash, so we can use new hash lock to protect stripe_hashtbl
> > > list too. The goal of the new hash locks introduced is we can only use the new
> > > locks in the first path of get_active_stripe(). Since we have several hash
> > > locks, lock contention is relieved significantly.
> > > 
> > > The first path of get_active_stripe() accesses other fields, since they are
> > > changed rarely, changing them now need take conf->device_lock and all hash
> > > locks. For a slow path, this isn't a problem.
> > > 
> > > If we need lock device_lock and hash lock, we always lock hash lock first. The
> > > tricky part is release_stripe and friends. We need take device_lock first.
> > > Neil's suggestion is we put inactive stripes to a temporary list and readd it
> > > to inactive_list after device_lock is released. In this way, we add stripes to
> > > temporary list with device_lock hold and remove stripes from the list with hash
> > > lock hold. So we don't allow concurrent access to the temporary list, which
> > > means we need allocate temporary list for all participants of release_stripe.
> > > 
> > > One downside is free stripes are maintained in their inactive list, they can't
> > > across between the lists. By default, we have total 256 stripes and 8 lists, so
> > > each list will have 32 stripes. It's possible one list has free stripe but
> > > other list hasn't. The chance should be rare because stripes allocation are
> > > even distributed. And we can always allocate more stripes for cache, several
> > > mega bytes memory isn't a big deal.
> > > 
> > > This completely removes the lock contention of the first path of
> > > get_active_stripe(). It slows down the second code path a little bit though
> > > because we now need takes two locks, but since the hash lock isn't contended,
> > > the overhead should be quite small (several atomic instructions). The second
> > > path of get_active_stripe() (basically sequential write or big request size
> > > randwrite) still has lock contentions.
> > > 
> > > Signed-off-by: Shaohua Li 
> > > Signed-off-by: NeilBrown 
> > > 
> > > :04 04 84ab47136c389751c7c08ded47b1761b1bee7184 
> > > 351047cfe3ac66013fc5c77f23d9bb04f869081d Mdrivers
> > > bisect run success
> > > 
> > > # bad: [86737931c2be292ec985df48f2e7fbafb4467f0e] Merge 'md/master' into 
> > > devel-hourly-201307
> > > # good: [5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52] Linux 3.12
> > > git bisect start '86737931c2be292ec985df48f2e7fbafb4467f0e' 
> > > '5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52' '--'
> > > # good: [21136946c495b0e1e0f7e25a8de6f170efbdeadf] drm/vmwgfx: fix 
> > > warning if config intel iommu is off.
> > > git bisect good 21136946c495b0e1e0f7e25a8de6f170efbdeadf
> > > # good: [ee360d688c8e37f81c92039f76bebaddbe36befe] Merge branch 
> > > 'acpi-assorted'
> > > git bisect good ee360d688c8e37f81c92039f76bebaddbe36befe
> > > # good: [cf0613d242805797f252535fcf7bb019512beb46] Merge branch 
> > > 'gma500-next' of 

Re: [PATCH] staging: zsmalloc: Ensure handle is never 0 on success

2013-11-12 Thread Greg KH
On Wed, Nov 13, 2013 at 12:41:38AM +0900, Minchan Kim wrote:
> We spent much time with preventing zram enhance since it have been in staging
> and Greg never want to improve without promotion.

It's not "improve", it's "Greg does not want you adding new features and
functionality while the code is in staging."  I want you to spend your
time on getting it out of staging first.

Now if something needs to be done based on review and comments to the
code, then that's fine to do and I'll accept that, but I've been seeing
new functionality be added to the code, which I will not accept because
it seems that you all have given up on getting it merged, which isn't
ok.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] net: sctp: bug fixing when sctp path recovers

2013-11-12 Thread Vlad Yasevich

On 11/12/2013 08:34 PM, Chang Xiangzhong wrote:

Look for the __two__ most recently used path/transport and set to active_path
and retran_path respectively

Signed-off-by: changxiangzh...@gmail.com
---
  net/sctp/associola.c |4 
  1 file changed, 4 insertions(+)

diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index ab67efc..070011a 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -913,11 +913,15 @@ void sctp_assoc_control_transport(struct sctp_association 
*asoc,
if (!first || t->last_time_heard > first->last_time_heard) {
second = first;
first = t;
+   continue;
}
if (!second || t->last_time_heard > second->last_time_heard)
second = t;


You might as well remove this bit and then you don't need a continue.


}

+   if (!second)
+   second = first;
+


This needs to move down 1 more block.  Set the second transport after we
check to see if the primary is back up and we need to go back to using it.

-vlad


/* RFC 2960 6.4 Multi-Homed SCTP Endpoints
 *
 * By default, an endpoint should always transmit to the
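
For readers following the hunk above, here is a minimal userspace sketch of the selection loop as the patch writes it (with the `continue` and the `!second` fallback). It is not the kernel code: `struct transport` and `pick_two_most_recent()` are hypothetical names, and only the `last_time_heard` field is taken from the diff.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's transport; only the field
 * used by the comparison in the patch is modelled here. */
struct transport {
    unsigned long last_time_heard;
};

/* Pick the most recently heard transport (*first) and the runner-up
 * (*second). If there is only one candidate, both point at it. */
static void pick_two_most_recent(struct transport *t, size_t n,
                                 struct transport **first,
                                 struct transport **second)
{
    *first = NULL;
    *second = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!*first || t[i].last_time_heard > (*first)->last_time_heard) {
            /* New leader: the old leader becomes the runner-up. */
            *second = *first;
            *first = &t[i];
            /* Without this, the check below would also make the new
             * leader the runner-up. */
            continue;
        }
        if (!*second || t[i].last_time_heard > (*second)->last_time_heard)
            *second = &t[i];
    }

    /* Single-transport case: fall back to the leader. */
    if (!*second)
        *second = *first;
}
```

With times {10, 30} and no `continue`, the second transport would end up as both first and second, which is the behaviour the patch is trying to avoid.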





Re: [PATCH] ext4: explain encoding of 34-bit a,c,mtime values

2013-11-12 Thread David Turner
On Tue, 2013-11-12 at 15:03 -0800, Darrick J. Wong wrote:
> On Mon, Nov 11, 2013 at 07:30:18PM -0500, Theodore Ts'o wrote:
> > On Sun, Nov 10, 2013 at 02:56:54AM -0500, David Turner wrote:
> > > b. Use Andreas's encoding, which is incompatible with pre-1970 files
> > > written on 64-bit systems.
> > >
> > > I don't care about currently-existing post-2038 files, because I believe
> > > that nobody has a valid reason to have such files.  However, I do
> > > believe that pre-1970 files are probably important to someone.
> > > 
> > > Despite this, I prefer option (b), because I think the simplicity is
> > > valuable, and because I hate to give up date ranges (even ones that I
> > > think we'll "never" need). Option (b) is not actually lossy, because we
> > > could correct pre-1970 files with e2fsck; under Andreas's encoding,
> > > their dates would be in the far future (and thus cannot be legitimate).
> > > 
> > > Would a patch that does (b) be accepted?  I would accompany it with a
> > > patch to e2fsck (which I assume would also go to the ext4 developers
> > > mailing list?).
> > 
> > I agree, I think this is the best way to go.  I'm going to drop your
> > earlier patch, and wait for an updated patch from you.  It may miss
> > this merge window, but as Andreas has pointed out, we still have a few
> > years to get this right.  :-)
> 
> Just to be clear... we're going with Andreas' fix, wherein
> 
> time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;
> 
> becomes:
> 
> time->tv_sec += (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;
> 
> "or" becomes "plus"?  So I can update fuse2fs.

Yes, but with a kernel-version-dependent change, which is something like
this:
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,20,0)
        time->tv_sec += (u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;
#else
        u64 extra_bits = le32_to_cpu(extra) & EXT4_EPOCH_MASK;
        if (extra_bits == 3)
                extra_bits = 0;
        time->tv_sec += extra_bits << 32;
#endif
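
To make the difference between the two branches concrete, here is a small userspace sketch, not the kernel code. The names `EPOCH_MASK`, `decode_plus()` and `decode_compat()` are made up for illustration. It assumes the low two bits of the extra field hold the epoch bits and that the 32-bit seconds field is sign-extended before the add.

```c
#include <stdint.h>

/* Assumption: the two low bits of the on-disk "extra" field are the
 * epoch bits; the remaining bits (nanoseconds) are ignored here. */
#define EPOCH_MASK 3u

/* "Plus" decoding: sign-extend the 32-bit seconds, then add the
 * epoch bits shifted up by 32. */
static int64_t decode_plus(int32_t seconds, uint32_t extra)
{
    return (int64_t)seconds + ((int64_t)(extra & EPOCH_MASK) << 32);
}

/* Compatibility decoding: epoch bits of 3 are treated as 0, so a
 * pre-1970 file written with both bits set still decodes as pre-1970. */
static int64_t decode_compat(int32_t seconds, uint32_t extra)
{
    uint64_t extra_bits = extra & EPOCH_MASK;

    if (extra_bits == 3)
        extra_bits = 0;
    return (int64_t)seconds + (int64_t)(extra_bits << 32);
}
```

For a stored seconds value of -1 with both epoch bits set, `decode_plus()` lands far in the future while `decode_compat()` returns -1, which is the case the `#else` branch is meant to cover.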

> Also, can someone proofread [1] and make sure it's correct?

It's not quite right.  I've requested an account so that I can correct
it.



Re: [PATCH v5] mm, oom: Fix race when selecting process to kill

2013-11-12 Thread David Rientjes
On Tue, 12 Nov 2013, Sameer Nanda wrote:

> The selection of the process to be killed happens in two spots:
> first in select_bad_process and then a further refinement by
> looking for child processes in oom_kill_process. Since this is
> a two step process, it is possible that the process selected by
> select_bad_process may get a SIGKILL just before oom_kill_process
> executes. If this were to happen, __unhash_process deletes this
> process from the thread_group list. This results in oom_kill_process
> getting stuck in an infinite loop when traversing the thread_group
> list of the selected process.
> 
> Fix this race by adding a pid_alive check for the selected process
> with tasklist_lock held in oom_kill_process.
> 
> Change-Id: I62f9652a780863467a8174e18ea5e19bbcd78c31

Is this needed?

> Signed-off-by: Sameer Nanda 
> ---
>  mm/oom_kill.c | 42 +-
>  1 file changed, 29 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 6738c47..5108c2b 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -412,31 +412,40 @@ void oom_kill_process(struct task_struct *p, gfp_t 
> gfp_mask, int order,
>   static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
> DEFAULT_RATELIMIT_BURST);
>  
> + if (__ratelimit(&oom_rs))
> + dump_header(p, gfp_mask, order, memcg, nodemask);
> +
> + task_lock(p);
> + pr_err("%s: Kill process %d (%s) score %d or sacrifice child\n",
> + message, task_pid_nr(p), p->comm, points);
> + task_unlock(p);
> +
> + /*
> +  * while_each_thread is currently not RCU safe. Lets hold the
> +  * tasklist_lock across all invocations of while_each_thread (including
> +  * the one in find_lock_task_mm) in this function.
> +  */
> + read_lock(&tasklist_lock);
> +
>   /*
>* If the task is already exiting, don't alarm the sysadmin or kill
>* its children or threads, just set TIF_MEMDIE so it can die quickly
>*/
> - if (p->flags & PF_EXITING) {
> + if (p->flags & PF_EXITING || !pid_alive(p)) {
> + pr_info("%s: Not killing process %d, just setting TIF_MEMDIE\n",
> + message, task_pid_nr(p));

That makes no sense in the kernel log to have

Out of Memory: Kill process 1234 (comm) score 50 or sacrifice child
Out of Memory: Not killing process 1234, just setting TIF_MEMDIE

Those are contradictory statements (and will actually mess with kernel log 
parsing at Google) and nobody other than kernel developers are going to 
know what TIF_MEMDIE is.

>   set_tsk_thread_flag(p, TIF_MEMDIE);
>   put_task_struct(p);
> + read_unlock(&tasklist_lock);
>   return;
>   }
>  
> - if (__ratelimit(&oom_rs))
> - dump_header(p, gfp_mask, order, memcg, nodemask);
> -
> - task_lock(p);
> - pr_err("%s: Kill process %d (%s) score %d or sacrifice child\n",
> - message, task_pid_nr(p), p->comm, points);
> - task_unlock(p);
> -
>   /*
>* If any of p's children has a different mm and is eligible for kill,
>* the one with the highest oom_badness() score is sacrificed for its
>* parent.  This attempts to lose the minimal amount of work done while
>* still freeing memory.
>*/
> - read_lock(&tasklist_lock);
>   do {
>   list_for_each_entry(child, &t->children, sibling) {
>   unsigned int child_points;
> @@ -456,12 +465,17 @@ void oom_kill_process(struct task_struct *p, gfp_t 
> gfp_mask, int order,
>   }
>   }
>   } while_each_thread(p, t);
> - read_unlock(&tasklist_lock);
>  
> - rcu_read_lock();
>   p = find_lock_task_mm(victim);
> +
> + /*
> +  * Since while_each_thread is currently not RCU safe, this unlock of
> +  * tasklist_lock may need to be moved further down if any additional
> +  * while_each_thread loops get added to this function.
> +  */

This comment should be moved to sched.h to indicate how 
while_each_thread() needs to be handled with respect to tasklist_lock, 
it's not specific to the oom killer.

> + read_unlock(&tasklist_lock);
> +
>   if (!p) {
> - rcu_read_unlock();
>   put_task_struct(victim);
>   return;
>   } else if (victim != p) {
> @@ -478,6 +492,8 @@ void oom_kill_process(struct task_struct *p, gfp_t 
> gfp_mask, int order,
>   K(get_mm_counter(victim->mm, MM_FILEPAGES)));
>   task_unlock(victim);
>  
> + rcu_read_lock();
> +
>   /*
>* Kill all user processes sharing victim->mm in other thread groups, if
>* any.  They don't get access to memory reserves, though, to avoid

Please move this rcu_read_lock() to be immediately before the 
for_each_process() instead of before the comment.

Re: [RFC 0/4] Add basic support for ASV

2013-11-12 Thread MyungJoo Ham
On Mon, Nov 11, 2013 at 11:27 PM, Yadwinder Singh Brar
 wrote:
> gentle ping for suggestions/reviews ..
>
>
> On Wed, Sep 11, 2013 at 8:14 PM, Yadwinder Singh Brar
>  wrote:
>> This series is to add basic common infrastructure for ASV.
>>  Basically ASV is a technique used on samsung SoCs, which provides the
>> recommended supply voltage for dvfs of arm, mif etc. For a given operating
>> frequency, the voltage is recommended based on SoC's ASV group.
>> ASV group gets fussed on SoCs during process of mass production.

ASV is an instance of AVS. Please reconsider and try to reuse
what's already there (drivers/power/avs)

Quote from drivers/power/avs/Kconfig:
  "At a given operating point the voltage is adapted depending on
  static factors (chip manufacturing process) and dynamic factors
  (temperature depending performance)."
It seems that the current ASV is a subset of AVS.
Although the current implementation of AVS does not provide significant
infrastructure to its sisters, we may start by sharing the directory.


Added Jean Pihet, who has submitted AVS (TI).

Cheers,
MyungJoo.

>>
>> This series includes:
>>  - basic common infrastructue for ASV. It provides common APIs for user 
>> drivers
>> like cpufreq & devfreq and and an interface for SoC specific drivers to
>> register ASV members(instances)
>>  - a common platform driver to register ASV members for exynos SoCs
>>  - an example providing minimal support (only for ARM ASV) for exynos5250 
>> chips
>>
>> Its just basic skelton which I wanted to get it reviewed or discussed in
>> early stage, before going ahead on further development based on it.
>>  Presently example is based on static ASV table provided in SoC specific 
>> file,
>> which I expects to go into DT. But exactly how and where needs to be 
>> discussed,
>> may be in next revisions once we get through the basic skelton.
>>  Also the location of driver in kernel may also seem odd to someone and
>> many more things :).
>>
>> Looking for your valuable reviews and suggestions.
>>
>> Thanks
>>
>> Yadwinder Singh Brar (4):
>>   power: asv: Add common ASV support for samsung SoCs
>>   power: asv: Add a common asv driver for exynos SoCs.
>>   power: asv: Add support for exynos5250
>>   arm: exynos5: Register static platform device for ASV.
>>
>>  arch/arm/mach-exynos/mach-exynos5-dt.c   |3 +
>>  drivers/power/Kconfig|1 +
>>  drivers/power/Makefile   |1 +
>>  drivers/power/asv/Kconfig|   24 
>>  drivers/power/asv/Makefile   |2 +
>>  drivers/power/asv/exynos-asv.c   |   81 ++
>>  drivers/power/asv/exynos-asv.h   |   22 
>>  drivers/power/asv/exynos5250-asv.c   |  141 
>>  drivers/power/asv/samsung-asv.c  |  175 
>> ++
>>  include/linux/power/samsung-asv-driver.h |   61 +++
>>  include/linux/power/samsung-asv.h|   37 +++
>>  11 files changed, 548 insertions(+), 0 deletions(-)
>>  create mode 100644 drivers/power/asv/Kconfig
>>  create mode 100644 drivers/power/asv/Makefile
>>  create mode 100644 drivers/power/asv/exynos-asv.c
>>  create mode 100644 drivers/power/asv/exynos-asv.h
>>  create mode 100644 drivers/power/asv/exynos5250-asv.c
>>  create mode 100644 drivers/power/asv/samsung-asv.c
>>  create mode 100644 include/linux/power/samsung-asv-driver.h
>>  create mode 100644 include/linux/power/samsung-asv.h
>>
>>
>> ___
>> linux-arm-kernel mailing list
>> linux-arm-ker...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
MyungJoo Ham, Ph.D.
System S/W Lab, S/W Center, Samsung Electronics


Re: [PATCH 03/14] sched: SCHED_DEADLINE structures & implementation.

2013-11-12 Thread Steven Rostedt
On Thu,  7 Nov 2013 14:43:37 +0100
Juri Lelli  wrote:

> From: Dario Faggioli 

> --- /dev/null
> +++ b/include/linux/sched/deadline.h
> @@ -0,0 +1,24 @@
> +#ifndef _SCHED_DEADLINE_H
> +#define _SCHED_DEADLINE_H
> +
> +/*
> + * SCHED_DEADLINE tasks has negative priorities, reflecting
> + * the fact that any of them has higher prio than RT and
> + * NORMAL/BATCH tasks.
> + */
> +
> +#define MAX_DL_PRIO  0
> +
> +static inline int dl_prio(int prio)
> +{
> + if (unlikely(prio < MAX_DL_PRIO))
> + return 1;
> + return 0;
> +}
> +
> +static inline int dl_task(struct task_struct *p)
> +{
> + return dl_prio(p->prio);
> +}
> +
> +#endif /* _SCHED_DEADLINE_H */
> diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
> index 440434d..a157797 100644
> --- a/include/linux/sched/rt.h
> +++ b/include/linux/sched/rt.h
> @@ -22,7 +22,7 @@
>  
>  static inline int rt_prio(int prio)
>  {
> - if (unlikely(prio < MAX_RT_PRIO))
> + if ((unsigned)prio < MAX_RT_PRIO)

Why remove the "unlikely" here?

>   return 1;
>   return 0;
>  }
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index 5a0f945..2d5e49a 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -39,6 +39,7 @@
>  #define SCHED_BATCH  3
>  /* SCHED_ISO: reserved but not implemented yet */
>  #define SCHED_IDLE   5
> +#define SCHED_DEADLINE   6
>  /* Can be ORed in to make sure the process is reverted back to SCHED_NORMAL 
> on fork */
>  #define SCHED_RESET_ON_FORK 0x4000
>  
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 086fe73..55fc95f 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1313,7 +1313,9 @@ static struct task_struct *copy_process(unsigned long 
> clone_flags,
>  #endif
>  
>   /* Perform scheduler related setup. Assign this task to a CPU. */
> - sched_fork(p);
> + retval = sched_fork(p);
> + if (retval)
> + goto bad_fork_cleanup_policy;
>  
>   retval = perf_event_init_task(p);
>   if (retval)
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 383319b..0909436 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -46,6 +46,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -1610,7 +1611,7 @@ long hrtimer_nanosleep(struct timespec *rqtp, struct 
> timespec __user *rmtp,
>   unsigned long slack;
>  
>   slack = current->timer_slack_ns;
> - if (rt_task(current))
> + if (dl_task(current) || rt_task(current))

Since dl_task() checks if prio is less than 0, and rt_task checks for
prio < MAX_RT_PRIO, I wonder if we can introduce a

dl_or_rt_task(current)

that does a signed compare against MAX_RT_PRIO to eliminate the double
compare (in case gcc doesn't figure it out).

Not something that we need to change now, but something in the future
maybe.
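
A rough userspace sketch of that idea, for illustration only. `dl_or_rt_prio()` is a hypothetical name; `MAX_DL_PRIO` is 0 as in the patch, and `MAX_RT_PRIO` is assumed to be 100 as in mainline.

```c
#define MAX_DL_PRIO 0
#define MAX_RT_PRIO 100   /* assumed mainline value */

/* Deadline tasks have negative priorities. */
static inline int dl_prio(int prio)
{
    return prio < MAX_DL_PRIO;
}

/* As in the patch: the unsigned cast makes negative (deadline)
 * priorities compare as very large, so they are not counted as RT. */
static inline int rt_prio(int prio)
{
    return (unsigned)prio < MAX_RT_PRIO;
}

/* Hypothetical combined helper: one signed compare covers both the
 * negative deadline range and the 0..MAX_RT_PRIO-1 RT range. */
static inline int dl_or_rt_prio(int prio)
{
    return prio < MAX_RT_PRIO;
}
```

For any priority value, `dl_or_rt_prio(prio)` gives the same answer as `dl_prio(prio) || rt_prio(prio)`, but with a single comparison.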

>   slack = 0;
>  
>   hrtimer_init_on_stack(, clockid, mode);
> diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
> index 54adcf3..d77282f 100644
> --- a/kernel/sched/Makefile
> +++ b/kernel/sched/Makefile
> @@ -11,7 +11,7 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
>  CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
>  endif
>  
> -obj-y += core.o proc.o clock.o cputime.o idle_task.o fair.o rt.o stop_task.o
> +obj-y += core.o proc.o clock.o cputime.o idle_task.o fair.o rt.o deadline.o 
> stop_task.o
>  obj-$(CONFIG_SMP) += cpupri.o
>  obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
>  obj-$(CONFIG_SCHEDSTATS) += stats.o
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 4fcbf13..cfe15bfc 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -903,7 +903,9 @@ static inline int normal_prio(struct task_struct *p)
>  {
>   int prio;
>  
> - if (task_has_rt_policy(p))
> + if (task_has_dl_policy(p))
> + prio = MAX_DL_PRIO-1;
> + else if (task_has_rt_policy(p))
>   prio = MAX_RT_PRIO-1 - p->rt_priority;
>   else
>   prio = __normal_prio(p);
> @@ -1611,6 +1613,12 @@ static void __sched_fork(struct task_struct *p)
>   memset(&p->se.statistics, 0, sizeof(p->se.statistics));
>  #endif
>  
> + RB_CLEAR_NODE(&p->dl.rb_node);
> + hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> + p->dl.dl_runtime = p->dl.runtime = 0;
> + p->dl.dl_deadline = p->dl.deadline = 0;
> + p->dl.flags = 0;
> +
>   INIT_LIST_HEAD(&p->rt.run_list);
>  
>  #ifdef CONFIG_PREEMPT_NOTIFIERS
> @@ -1654,7 +1662,7 @@ void set_numabalancing_state(bool enabled)
>  /*
>   * fork()/clone()-time setup:
>   */
> -void sched_fork(struct task_struct *p)
> +int sched_fork(struct task_struct *p)
>  {
>   unsigned long flags;
>   int cpu = get_cpu();
> @@ -1676,7 +1684,7 @@ void sched_fork(struct task_struct *p)
>* Revert to default priority/policy on fork if requested.
>*/
>   if (unlikely(p->sched_reset_on_fork)) {
> -  

[PATCH 2/2] ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call

2013-11-12 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch avoids a duplicate iscsit_increment_maxcmdsn() call for
ISER_IB_RDMA_WRITE within isert_map_rdma() + isert_reg_rdma_frwr(),
which will already be occurring once during isert_put_datain() ->
iscsit_build_rsp_pdu() operation.

It also removes the local conn->stat_sn assignment + increment,
and changes the third parameter to iscsit_build_rsp_pdu() to
signal this should be done by iscsi_target_mode code.

Tested-by: Moussa Ba 
Signed-off-by: Nicholas Bellinger 
---
 drivers/infiniband/ulp/isert/ib_isert.c |6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c 
b/drivers/infiniband/ulp/isert/ib_isert.c
index bbd86e8..5661075 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -2029,8 +2029,6 @@ isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd 
*cmd,
 
if (wr->iser_ib_op == ISER_IB_RDMA_WRITE) {
data_left = se_cmd->data_length;
-   iscsit_increment_maxcmdsn(cmd, conn->sess);
-   cmd->stat_sn = conn->stat_sn++;
} else {
sg_off = cmd->write_data_done / PAGE_SIZE;
data_left = se_cmd->data_length - cmd->write_data_done;
@@ -2242,8 +2240,6 @@ isert_reg_rdma_frwr(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd,
 
if (wr->iser_ib_op == ISER_IB_RDMA_WRITE) {
data_left = se_cmd->data_length;
-   iscsit_increment_maxcmdsn(cmd, conn->sess);
-   cmd->stat_sn = conn->stat_sn++;
} else {
sg_off = cmd->write_data_done / PAGE_SIZE;
data_left = se_cmd->data_length - cmd->write_data_done;
@@ -2352,7 +2348,7 @@ isert_put_datain(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd)
 * Build isert_conn->tx_desc for iSCSI response PDU and attach
 */
isert_create_send_desc(isert_conn, isert_cmd, &isert_cmd->tx_desc);
-   iscsit_build_rsp_pdu(cmd, conn, false, (struct iscsi_scsi_rsp *)
+   iscsit_build_rsp_pdu(cmd, conn, true, (struct iscsi_scsi_rsp *)
 &isert_cmd->tx_desc.iscsi_header);
isert_init_tx_hdrs(isert_conn, &isert_cmd->tx_desc);
isert_init_send_wr(isert_conn, isert_cmd,
-- 
1.7.10.4



[PATCH 1/2] iscsi-target: Fix mutex_trylock usage in iscsit_increment_maxcmdsn

2013-11-12 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

This patch fixes a >= v3.10 regression bug with mutex_trylock() usage
within iscsit_increment_maxcmdsn(), that was originally added to allow
for a special case where ->cmdsn_mutex was already held from the
iscsit_execute_cmd() exception path for ib_isert.

When !mutex_trylock() was occurring under contention during normal RX/TX
process context codepaths, the bug was manifesting itself as the following
protocol error:

  Received CmdSN: 0x000fcbb7 is greater than MaxCmdSN: 0x000fcbb6, protocol 
error.
  Received CmdSN: 0x000fcbb8 is greater than MaxCmdSN: 0x000fcbb6, protocol 
error.

This patch simply avoids the direct ib_isert callback in lio_queue_status()
for the special iscsit_execute_cmd() exception cases, which allows the problematic
mutex_trylock() usage in iscsit_increment_maxcmdsn() to go away.

Reported-by: Moussa Ba 
Tested-by: Moussa Ba 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/iscsi/iscsi_target_configfs.c |5 +
 drivers/target/iscsi/iscsi_target_device.c   |6 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_configfs.c 
b/drivers/target/iscsi/iscsi_target_configfs.c
index 1eec37c..fde3624 100644
--- a/drivers/target/iscsi/iscsi_target_configfs.c
+++ b/drivers/target/iscsi/iscsi_target_configfs.c
@@ -1790,6 +1790,11 @@ static int lio_queue_status(struct se_cmd *se_cmd)
struct iscsi_cmd *cmd = container_of(se_cmd, struct iscsi_cmd, se_cmd);
 
cmd->i_state = ISTATE_SEND_STATUS;
+
+   if (cmd->se_cmd.scsi_status || cmd->sense_reason) {
+   iscsit_add_cmd_to_response_queue(cmd, cmd->conn, cmd->i_state);
+   return 0;
+   }
cmd->conn->conn_transport->iscsit_queue_status(cmd->conn, cmd);
 
return 0;
diff --git a/drivers/target/iscsi/iscsi_target_device.c 
b/drivers/target/iscsi/iscsi_target_device.c
index 6c7a510..7087c73 100644
--- a/drivers/target/iscsi/iscsi_target_device.c
+++ b/drivers/target/iscsi/iscsi_target_device.c
@@ -58,11 +58,7 @@ void iscsit_increment_maxcmdsn(struct iscsi_cmd *cmd, struct 
iscsi_session *sess
 
cmd->maxcmdsn_inc = 1;
 
-   if (!mutex_trylock(&sess->cmdsn_mutex)) {
-   sess->max_cmd_sn += 1;
-   pr_debug("Updated MaxCmdSN to 0x%08x\n", sess->max_cmd_sn);
-   return;
-   }
+   mutex_lock(&sess->cmdsn_mutex);
sess->max_cmd_sn += 1;
pr_debug("Updated MaxCmdSN to 0x%08x\n", sess->max_cmd_sn);
mutex_unlock(&sess->cmdsn_mutex);
-- 
1.7.10.4



[PATCH 0/2] iscsi/iser-target: Fix

2013-11-12 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Hi folks,

Here are the two patches to address the regression bug that Moussa
has recently been hitting with ib_isert ports + mtip32xx backends.

In the end, this bug was due to an incorrect usage of mutex_trylock(),
that was originally introduced to avoid an ib_isert specific nested
deadlock from an exception path, that inadvertently allowed normal
per RX/TX process context contention to increment sess->max_cmd_sn +
post a response + receive a new command in RX context with a CmdSN
larger than the last MaxCmdSN synchronized across RX/TX contexts.

This patch simply avoids the ib_isert callback in lio_queue_status()
for the special iscsit_execute_cmd() exception cases to avoid the
dead-lock, which allows the problematic mutex_trylock() usage in
iscsit_increment_maxcmdsn() to go away entirely.

I'll be including both of these with a CC' to stable for v3.10+

Special thanks to Moussa for helping track this bug down.

--nab

Nicholas Bellinger (2):
  iscsi-target: Fix mutex_trylock usage in iscsit_increment_maxcmdsn
  ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call

 drivers/infiniband/ulp/isert/ib_isert.c  |6 +-
 drivers/target/iscsi/iscsi_target_configfs.c |5 +
 drivers/target/iscsi/iscsi_target_device.c   |6 +-
 3 files changed, 7 insertions(+), 10 deletions(-)

-- 
1.7.10.4


