date:20131112

[PATCH 09/10] crypto: CCP device driver build files

2013-11-12 Thread Tom Lendacky

These files provide the ability to configure and build the
AMD CCP device driver and crypto API support.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/Kconfig  |   12 
 drivers/crypto/Makefile |1 +
 drivers/crypto/ccp/Kconfig  |   23 +++
 drivers/crypto/ccp/Makefile |   10 ++
 4 files changed, 46 insertions(+)
 create mode 100644 drivers/crypto/ccp/Kconfig
 create mode 100644 drivers/crypto/ccp/Makefile

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index f4fd837..4954d75 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -399,4 +399,16 @@ config CRYPTO_DEV_ATMEL_SHA
  To compile this driver as a module, choose M here: the module
  will be called atmel-sha.
 
+config CRYPTO_DEV_CCP
+   bool "Support for AMD Cryptographic Coprocessor"
+   depends on X86
+   default n
+   help
+ The AMD Cryptographic Coprocessor provides hardware support
+ for encryption, hashing and related operations.
+
+if CRYPTO_DEV_CCP
+   source "drivers/crypto/ccp/Kconfig"
+endif
+
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index b4946dd..8a6c86a 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -22,3 +22,4 @@ obj-$(CONFIG_CRYPTO_DEV_NX) += nx/
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
+obj-$(CONFIG_CRYPTO_DEV_CCP) += ccp/
diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
new file mode 100644
index 000..335ed5c
--- /dev/null
+++ b/drivers/crypto/ccp/Kconfig
@@ -0,0 +1,23 @@
+config CRYPTO_DEV_CCP_DD
+   tristate "Cryptographic Coprocessor device driver"
+   depends on CRYPTO_DEV_CCP
+   default m
+   help
+ Provides the interface to use the AMD Cryptographic Coprocessor
+ which can be used to accelerate or offload encryption operations
+ such as SHA, AES and more. If you choose 'M' here, this module
+ will be called ccp.
+
+config CRYPTO_DEV_CCP_CRYPTO
+   tristate "Encryption and hashing acceleration support"
+   depends on CRYPTO_DEV_CCP_DD
+   default m
+   select CRYPTO_ALGAPI
+   select CRYPTO_HASH
+   select CRYPTO_BLKCIPHER
+   select CRYPTO_AUTHENC
+   help
+ Support for using the cryptographic API with the AMD Cryptographic
+ Coprocessor. This module supports acceleration and offload of SHA
+ and AES algorithms.  If you choose 'M' here, this module will be
+ called ccp_crypto.
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
new file mode 100644
index 000..d3505a0
--- /dev/null
+++ b/drivers/crypto/ccp/Makefile
@@ -0,0 +1,10 @@
+obj-$(CONFIG_CRYPTO_DEV_CCP_DD) += ccp.o
+ccp-objs := ccp-dev.o ccp-ops.o
+ccp-objs += ccp-pci.o
+
+obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
+ccp-crypto-objs := ccp-crypto-main.o \
+  ccp-crypto-aes.o \
+  ccp-crypto-aes-cmac.o \
+  ccp-crypto-aes-xts.o \
+  ccp-crypto-sha.o


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 00/10] AMD Cryptographic Coprocessor support

2013-11-12 Thread Tom Lendacky

Resending because of typo in mailing list address...

The following series implements support for the AMD Cryptographic
Coprocessor (CCP).  The AMD CCP provides hardware encryption, hashing
and other related operations.

This patch series is based on the 3.12 kernel.

---

Tom Lendacky (10):
  crypto: authenc - Find proper IV address in ablkcipher callback
  crypto: scatterwalk - Set the chain pointer indication bit
  crypto: CCP device driver and interface support
  crypto: crypto API interface to the CCP device driver
  crypto: CCP AES crypto API support
  crypto: CCP AES CMAC mode crypto API support
  crypto: CCP XTS-AES crypto API support
  crypto: CCP SHA crypto API support
  crypto: CCP device driver build files
  crypto: CCP maintainer information


 MAINTAINERS  |7 
 crypto/authenc.c |7 
 drivers/crypto/Kconfig   |   12 
 drivers/crypto/Makefile  |1 
 drivers/crypto/ccp/Kconfig   |   23 
 drivers/crypto/ccp/Makefile  |   10 
 drivers/crypto/ccp/ccp-crypto-aes-cmac.c |  355 +
 drivers/crypto/ccp/ccp-crypto-aes-xts.c  |  285 
 drivers/crypto/ccp/ccp-crypto-aes.c  |  375 ++
 drivers/crypto/ccp/ccp-crypto-main.c |  432 ++
 drivers/crypto/ccp/ccp-crypto-sha.c  |  497 +++
 drivers/crypto/ccp/ccp-crypto.h  |  191 +++
 drivers/crypto/ccp/ccp-dev.c |  582 +
 drivers/crypto/ccp/ccp-dev.h |  272 
 drivers/crypto/ccp/ccp-ops.c | 2020 ++
 drivers/crypto/ccp/ccp-pci.c |  360 +
 include/crypto/scatterwalk.h |1 
 include/linux/ccp.h  |  525 
 18 files changed, 5952 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/ccp/Kconfig
 create mode 100644 drivers/crypto/ccp/Makefile
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-cmac.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-xts.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-main.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-sha.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto.h
 create mode 100644 drivers/crypto/ccp/ccp-dev.c
 create mode 100644 drivers/crypto/ccp/ccp-dev.h
 create mode 100644 drivers/crypto/ccp/ccp-ops.c
 create mode 100644 drivers/crypto/ccp/ccp-pci.c
 create mode 100644 include/linux/ccp.h

-- 
Tom Lendacky


[PATCH 10/10] crypto: CCP maintainer information

2013-11-12 Thread Tom Lendacky

Update the MAINTAINERS file for the AMD CCP device driver.

Signed-off-by: Tom Lendacky 
---
 MAINTAINERS |7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 051e4dc..de22604 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -525,6 +525,13 @@ F: drivers/tty/serial/altera_jtaguart.c
 F: include/linux/altera_uart.h
 F: include/linux/altera_jtaguart.h
 
+AMD CRYPTOGRAPHIC COPROCESSOR (CCP) DRIVER
+M: Tom Lendacky 
+L: linux-cry...@vger.kernel.org
+S: Supported
+F: drivers/crypto/ccp/
+F: include/linux/ccp.h
+
 AMD FAM15H PROCESSOR POWER MONITORING DRIVER
 M: Andreas Herrmann 
 L: lm-sens...@lm-sensors.org



[PATCH 07/10] crypto: CCP XTS-AES crypto API support

2013-11-12 Thread Tom Lendacky

These routines provide crypto API support for the XTS-AES mode of AES
on the AMD CCP.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c |  285 +++
 1 file changed, 285 insertions(+)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-xts.c

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
new file mode 100644
index 000..d100b48
--- /dev/null
+++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
@@ -0,0 +1,285 @@
+/*
+ * AMD Cryptographic Coprocessor (CCP) AES XTS crypto API support
+ *
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ccp-crypto.h"
+
+
+struct ccp_aes_xts_def {
+   const char *name;
+   const char *drv_name;
+};
+
+static struct ccp_aes_xts_def aes_xts_algs[] = {
+   {
+   .name   = "xts(aes)",
+   .drv_name   = "xts-aes-ccp",
+   },
+};
+
+struct ccp_unit_size_map {
+   unsigned int size;
+   u32 value;
+};
+
+static struct ccp_unit_size_map unit_size_map[] = {
+   {
+   .size   = 4096,
+   .value  = CCP_XTS_AES_UNIT_SIZE_4096,
+   },
+   {
+   .size   = 2048,
+   .value  = CCP_XTS_AES_UNIT_SIZE_2048,
+   },
+   {
+   .size   = 1024,
+   .value  = CCP_XTS_AES_UNIT_SIZE_1024,
+   },
+   {
+   .size   = 512,
+   .value  = CCP_XTS_AES_UNIT_SIZE_512,
+   },
+   {
+   .size   = 256,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 128,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 64,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 32,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 16,
+   .value  = CCP_XTS_AES_UNIT_SIZE_16,
+   },
+   {
+   .size   = 1,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+};
+
+static int ccp_aes_xts_complete(struct crypto_async_request *async_req, int ret)
+{
+   struct ablkcipher_request *req = ablkcipher_request_cast(async_req);
+   struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req);
+
+   if (ret)
+   return ret;
+
+   memcpy(req->info, rctx->iv, AES_BLOCK_SIZE);
+
+   return 0;
+}
+
+static int ccp_aes_xts_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+ unsigned int key_len)
+{
+   struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ablkcipher_tfm(tfm));
+
+   /* Only support 128-bit AES key with a 128-bit Tweak key,
+* otherwise use the fallback
+*/
+   switch (key_len) {
+   case AES_KEYSIZE_128 * 2:
+   memcpy(ctx->u.aes.key, key, key_len);
+   break;
+   }
+   ctx->u.aes.key_len = key_len / 2;
+   sg_init_one(&ctx->u.aes.key_sg, ctx->u.aes.key, key_len);
+
+   return crypto_ablkcipher_setkey(ctx->u.aes.tfm_ablkcipher, key,
+   key_len);
+}
+
+static int ccp_aes_xts_crypt(struct ablkcipher_request *req,
+unsigned int encrypt)
+{
+   struct crypto_tfm *tfm =
+   crypto_ablkcipher_tfm(crypto_ablkcipher_reqtfm(req));
+   struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+   struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req);
+   unsigned int unit;
+   int ret;
+
+   if (!ctx->u.aes.key_len) {
+   pr_err("AES key not set\n");
+   return -EINVAL;
+   }
+
+   if (req->nbytes & (AES_BLOCK_SIZE - 1)) {
+   pr_err("AES request size is not a multiple of the block size\n");
+   return -EINVAL;
+   }
+
+   if (!req->info) {
+   pr_err("AES IV not supplied\n");
+   return -EINVAL;
+   }
+
+   for (unit = 0; unit < ARRAY_SIZE(unit_size_map); unit++)
+   if (!(req->nbytes & (unit_size_map[unit].size - 1)))
+   break;
+
+   if ((unit_size_map[unit].value == CCP_XTS_AES_UNIT_SIZE__LAST) ||
+   (ctx->u.aes.key_len != AES_KEYSIZE_128)) {
+   /* Use the fallback to process the request for any
+* unsupported unit sizes or key sizes
+*/
+   ablkcipher_request_set_tfm(req, ctx->u.aes.tfm_ablkcipher);
+   ret = (encrypt) ? crypto_ablkcipher_encrypt(req) :
+ crypto_ablkcipher_decrypt(req);
+
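
As a rough illustration of the unit-size selection in ccp_aes_xts_crypt()
above: the table is walked from the largest size down, and the first size
that evenly divides the request length wins. The sketch below is a
user-space model only. The enum values and the names pick_unit() and
UNIT_* are hypothetical stand-ins for the CCP_XTS_AES_UNIT_SIZE_*
constants, which are defined elsewhere in the series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CCP_XTS_AES_UNIT_SIZE_* values. */
enum unit_value {
	UNIT_16, UNIT_512, UNIT_1024, UNIT_2048, UNIT_4096,
	UNIT_UNSUPPORTED	/* plays the role of ..._UNIT_SIZE__LAST */
};

struct unit_size_map {
	unsigned int size;	/* power of two, in bytes */
	enum unit_value value;
};

/* Largest first; the final entry (size 1) guarantees the search ends. */
static const struct unit_size_map unit_map[] = {
	{ 4096, UNIT_4096 },
	{ 2048, UNIT_2048 },
	{ 1024, UNIT_1024 },
	{  512, UNIT_512 },
	{  256, UNIT_UNSUPPORTED },
	{  128, UNIT_UNSUPPORTED },
	{   64, UNIT_UNSUPPORTED },
	{   32, UNIT_UNSUPPORTED },
	{   16, UNIT_16 },
	{    1, UNIT_UNSUPPORTED },
};

/* Return the value of the largest unit size that divides nbytes. */
static enum unit_value pick_unit(unsigned int nbytes)
{
	size_t i;

	for (i = 0; i < sizeof(unit_map) / sizeof(unit_map[0]); i++)
		if (!(nbytes & (unit_map[i].size - 1)))
			return unit_map[i].value;

	return UNIT_UNSUPPORTED;
}
```

A request whose best match is UNIT_UNSUPPORTED in this model corresponds
to the case where the driver hands the request to the fallback tfm.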

[PATCH 01/10] crypto: authenc - Find proper IV address in ablkcipher callback

2013-11-12 Thread Tom Lendacky

When performing an asynchronous ablkcipher operation the authenc
completion callback routine is invoked, but it does not locate and use
the proper IV.

The callback routine, crypto_authenc_encrypt_done, is updated to use
the same method of calculating the address of the IV as is done in
crypto_authenc_encrypt function which sets up the callback.

Signed-off-by: Tom Lendacky 
---
 crypto/authenc.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/crypto/authenc.c b/crypto/authenc.c
index ffce19d..528b00b 100644
--- a/crypto/authenc.c
+++ b/crypto/authenc.c
@@ -368,9 +368,10 @@ static void crypto_authenc_encrypt_done(struct crypto_async_request *req,
if (!err) {
struct crypto_aead *authenc = crypto_aead_reqtfm(areq);
struct crypto_authenc_ctx *ctx = crypto_aead_ctx(authenc);
-   struct ablkcipher_request *abreq = aead_request_ctx(areq);
-   u8 *iv = (u8 *)(abreq + 1) +
-crypto_ablkcipher_reqsize(ctx->enc);
+   struct authenc_request_ctx *areq_ctx = aead_request_ctx(areq);
+   struct ablkcipher_request *abreq = (void *)(areq_ctx->tail
+   + ctx->reqoff);
+   u8 *iv = (u8 *)abreq - crypto_ablkcipher_ivsize(ctx->enc);
 
err = crypto_authenc_genicv(areq, iv, 0);
}
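
The address arithmetic in the hunk above can be modelled in user space as
follows. This is only a sketch under the assumption that the IV is stored
immediately before the cipher request inside the context's tail area,
which is what the new code implies. The struct and function names
(req_ctx_model, cipher_req_addr, iv_addr) are hypothetical and the real
layout in crypto/authenc.c has more to it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of the request context tail:
 *
 *   tail: [ ... | IV (ivsize bytes) | cipher request ... ]
 *                                   ^ tail + reqoff
 */
struct req_ctx_model {
	unsigned char tail[64];
};

/* The cipher request starts reqoff bytes into the tail area. */
static unsigned char *cipher_req_addr(struct req_ctx_model *c, size_t reqoff)
{
	return c->tail + reqoff;
}

/* The IV occupies the ivsize bytes just before the cipher request. */
static unsigned char *iv_addr(struct req_ctx_model *c, size_t reqoff,
			      size_t ivsize)
{
	return cipher_req_addr(c, reqoff) - ivsize;
}
```

Computing the IV address the same way in both the setup path and the
completion callback is the point of the patch. The model just makes the
"request minus ivsize" relationship explicit.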



[PATCH 02/10] crypto: scatterwalk - Set the chain pointer indication bit

2013-11-12 Thread Tom Lendacky

The scatterwalk_crypto_chain function invokes the scatterwalk_sg_chain
function to chain two scatterlists, but the chain pointer indication
bit is not set.  When the resulting scatterlist is used, for example,
by sg_nents to count the number of scatterlist entries, a segfault occurs
because sg_nents does not follow the chain pointer to the chained scatterlist.

Update scatterwalk_sg_chain to set the chain pointer indication bit as is
done by the sg_chain function.

Signed-off-by: Tom Lendacky 
---
 include/crypto/scatterwalk.h |1 +
 1 file changed, 1 insertion(+)

diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h
index 13621cc..64ebede 100644
--- a/include/crypto/scatterwalk.h
+++ b/include/crypto/scatterwalk.h
@@ -36,6 +36,7 @@ static inline void scatterwalk_sg_chain(struct scatterlist *sg1, int num,
 {
    sg_set_page(&sg1[num - 1], (void *)sg2, 0, 0);
    sg1[num - 1].page_link &= ~0x02;
+   sg1[num - 1].page_link |= 0x01;
 }
 
 static inline struct scatterlist *scatterwalk_sg_next(struct scatterlist *sg)
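
To show why the chain bit matters, here is a small user-space model of
the two low bits of page_link: 0x01 marks a chain pointer and 0x02 marks
the end of the list, matching the constants in the hunk above. Everything
else (sg_model, chain_lists, next_entry, count_entries) is a hypothetical
simplification and not the kernel's scatterlist API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SG_CHAIN 0x01UL	/* entry holds a pointer to the next list */
#define SG_END   0x02UL	/* entry is the last one in the list */

struct sg_model {
	uintptr_t page_link;
};

/* Turn the last slot of sg1 into a chain pointer to sg2. */
static void chain_lists(struct sg_model *sg1, int num, struct sg_model *sg2)
{
	sg1[num - 1].page_link = ((uintptr_t)sg2 | SG_CHAIN) & ~SG_END;
}

/* Advance one entry, following a chain pointer if one is found. */
static struct sg_model *next_entry(struct sg_model *sg)
{
	if (sg->page_link & SG_END)
		return NULL;

	sg++;
	if (sg->page_link & SG_CHAIN)
		sg = (struct sg_model *)(sg->page_link & ~(SG_CHAIN | SG_END));

	return sg;
}

/* Count data entries across chained lists. */
static int count_entries(struct sg_model *sg)
{
	int n = 0;

	for (; sg; sg = next_entry(sg))
		n++;

	return n;
}
```

Without SG_CHAIN set, a walker like next_entry() would treat the chain
slot as an ordinary entry and run past the end of the first array, which
is the failure mode the commit message describes.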



[PATCH] I2C: busses: Do not print error message in syslog if no ACK received

2013-11-12 Thread Andreas Werner

When using the i2c-eg20t driver and calling i2cdetect or probing the
bus, the driver prints a lot of error messages if no ACK is received.

i2cdetect normally prints a table of all the available devices. If there
is no device at an address, the corresponding table entry is left empty.
Currently, with the i2c-eg20t driver, the table is not readable because
the error messages are interleaved with it.

Error message: pch_i2c_getack return -71

This patch prevents the driver from printing these messages to syslog
unless debug is enabled.

Tested on Intel Atom E6xx and Eg20t Chipset.

Signed-off-by: Andreas Werner 
---
 drivers/i2c/busses/i2c-eg20t.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-eg20t.c b/drivers/i2c/busses/i2c-eg20t.c
index 0f37529..b10c651 100644
--- a/drivers/i2c/busses/i2c-eg20t.c
+++ b/drivers/i2c/busses/i2c-eg20t.c
@@ -322,7 +322,7 @@ static s32 pch_i2c_getack(struct i2c_algo_pch_data *adap)
reg_val = ioread32(p + PCH_I2CSR) & PCH_GETACK;
 
if (reg_val != 0) {
-   pch_err(adap, "return%d\n", -EPROTO);
+   pch_dbg(adap, "return%d\n", -EPROTO);
return -EPROTO;
}
 
-- 
1.8.4


[PATCH 04/10] crypto: crypto API interface to the CCP device driver

2013-11-12 Thread Tom Lendacky

These routines provide the support for the interface between the crypto API
and the AMD CCP. This includes ensuring that requests associated with a
given tfm on the same cpu are processed in the order received.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-main.c |  432 ++
 drivers/crypto/ccp/ccp-crypto.h  |  191 +++
 2 files changed, 623 insertions(+)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-main.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto.h

diff --git a/drivers/crypto/ccp/ccp-crypto-main.c b/drivers/crypto/ccp/ccp-crypto-main.c
new file mode 100644
index 000..2636f04
--- /dev/null
+++ b/drivers/crypto/ccp/ccp-crypto-main.c
@@ -0,0 +1,432 @@
+/*
+ * AMD Cryptographic Coprocessor (CCP) crypto API support
+ *
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ccp-crypto.h"
+
+MODULE_AUTHOR("Tom Lendacky ");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.0.0");
+MODULE_DESCRIPTION("AMD Cryptographic Coprocessor crypto API support");
+
+
+/* List heads for the supported algorithms */
+static LIST_HEAD(hash_algs);
+static LIST_HEAD(cipher_algs);
+
+/* For any tfm, requests for that tfm on the same CPU must be returned
+ * in the order received.  With multiple queues available, the CCP can
+ * process more than one cmd at a time.  Therefore we must maintain
+ * a cmd list to ensure the proper ordering of requests on a given tfm/cpu
+ * combination.
+ */
+struct ccp_crypto_cpu_queue {
+   struct list_head cmds;
+   struct list_head *backlog;
+   unsigned int cmd_count;
+};
+#define CCP_CRYPTO_MAX_QLEN    50
+
+struct ccp_crypto_percpu_queue {
+   struct ccp_crypto_cpu_queue __percpu *cpu_queue;
+};
+static struct ccp_crypto_percpu_queue req_queue;
+
+struct ccp_crypto_cmd {
+   struct list_head entry;
+
+   struct ccp_cmd *cmd;
+
+   /* Save the crypto_tfm and crypto_async_request addresses
+* separately to avoid any reference to a possibly invalid
+* crypto_async_request structure after invoking the request
+* callback
+*/
+   struct crypto_async_request *req;
+   struct crypto_tfm *tfm;
+
+   /* Used for held command processing to determine state */
+   int ret;
+
+   int cpu;
+};
+
+struct ccp_crypto_cpu {
+   struct work_struct work;
+   struct completion completion;
+   struct ccp_crypto_cmd *crypto_cmd;
+   int err;
+};
+
+
+static inline bool ccp_crypto_success(int err)
+{
+   if (err && (err != -EINPROGRESS) && (err != -EBUSY))
+   return false;
+
+   return true;
+}
+
+/*
+ * ccp_crypto_cmd_complete must be called while running on the appropriate
+ * cpu and the caller must have done a get_cpu to disable preemption
+ */
+static struct ccp_crypto_cmd *ccp_crypto_cmd_complete(
+   struct ccp_crypto_cmd *crypto_cmd, struct ccp_crypto_cmd **backlog)
+{
+   struct ccp_crypto_cpu_queue *cpu_queue;
+   struct ccp_crypto_cmd *held = NULL, *tmp;
+
+   *backlog = NULL;
+
+   cpu_queue = this_cpu_ptr(req_queue.cpu_queue);
+
+   /* Held cmds will be after the current cmd in the queue so start
+* searching for a cmd with a matching tfm for submission.
+*/
+   tmp = crypto_cmd;
+   list_for_each_entry_continue(tmp, &cpu_queue->cmds, entry) {
+   if (crypto_cmd->tfm != tmp->tfm)
+   continue;
+   held = tmp;
+   break;
+   }
+
+   /* Process the backlog:
+*   Because cmds can be executed from any point in the cmd list
+*   special precautions have to be taken when handling the backlog.
+*/
+   if (cpu_queue->backlog != &cpu_queue->cmds) {
+   /* Skip over this cmd if it is the next backlog cmd */
+   if (cpu_queue->backlog == &crypto_cmd->entry)
+   cpu_queue->backlog = crypto_cmd->entry.next;
+
+   *backlog = container_of(cpu_queue->backlog,
+   struct ccp_crypto_cmd, entry);
+   cpu_queue->backlog = cpu_queue->backlog->next;
+
+   /* Skip over this cmd if it is now the next backlog cmd */
+   if (cpu_queue->backlog == &crypto_cmd->entry)
+   cpu_queue->backlog = crypto_cmd->entry.next;
+   }
+
+   /* Remove the cmd entry from the list of cmds */
+   cpu_queue->cmd_count--;
+   list_del(&crypto_cmd->entry);
+
+   return held;
+}
+
+static void ccp_crypto_complete_on_cpu(struct work_struct *work)
+{
+   struct ccp_crypto_cpu *cpu_work =
+   container_of(work, struct ccp_crypto_cpu, work);
+   struct ccp_crypto_cmd
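
The "held command" search in ccp_crypto_cmd_complete() above can be
pictured with a much simpler model: when a command completes, look
forward in the queue for the next command that belongs to the same tfm.
The sketch below uses a plain singly linked list and an integer in place
of the crypto_tfm pointer. cmd_model and next_held() are hypothetical
names, and the backlog handling is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

struct cmd_model {
	int tfm;			/* stands in for the crypto_tfm pointer */
	struct cmd_model *next;		/* next command queued on this CPU */
};

/*
 * Return the first command queued after 'done' for the same tfm,
 * or NULL if no later command shares that tfm.
 */
static struct cmd_model *next_held(const struct cmd_model *done)
{
	struct cmd_model *c;

	for (c = done->next; c; c = c->next)
		if (c->tfm == done->tfm)
			return c;

	return NULL;
}
```

Searching only forward from the completed command mirrors the comment in
the patch that held cmds always sit after the current cmd in the queue.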

Re: [PATCH net 2/2] macvtap: limit head length of skb allocated

2013-11-12 Thread Greg Rose

On Tue, 12 Nov 2013 18:02:57 +0800
Jason Wang  wrote:

> We currently use hdr_len as a hint of head length which is advertised
> by guest. But when the guest advertises a very big value, it can lead
> to a 64K+ kmalloc() allocation which has a very high possibility of
> failure when host memory is fragmented or under heavy stress. The
> huge hdr_len also reduces the effect of zerocopy or even disables it
> if a gso skb is linearized in guest.
> 
> To solve these issues, this patch introduces an upper limit
> (PAGE_SIZE) of the head, which guarantees an order 0 allocation each
> time.
> 
> Cc: Stefan Hajnoczi 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Jason Wang 
> ---
> The patch was needed for stable.
> ---
>  drivers/net/macvtap.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index 9dccb1e..7ee6f9d 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -523,6 +523,11 @@ static inline struct sk_buff
> *macvtap_alloc_skb(struct sock *sk, size_t prepad, int noblock, int
> *err) {
>   struct sk_buff *skb;
> + int good_linear = SKB_MAX_HEAD(prepad);
> +
> + /* Don't use huge linear part */
> + if (linear > good_linear)
> + linear = good_linear;
>  
>   /* Under a page?  Don't bother with paged skb. */
>   if (prepad + len < PAGE_SIZE || !linear)

I see no problem with this or the tuntap patch except that in both
cases kernel coding style would prefer that you align the local
variable declarations in a reverse pyramid, longest at the beginning,
shortest at the end.

- Greg
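
For readers skimming the quoted hunk, the clamp amounts to the following.
This is a user-space sketch only: MODEL_PAGE_SIZE and the subtraction are
simplified, hypothetical stand-ins for what SKB_MAX_HEAD(prepad) computes
in the kernel (the real macro also accounts for skb overhead), and it
assumes prepad is smaller than a page.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Cap the requested linear length so the head fits in one page. */
static size_t clamp_linear(size_t prepad, size_t linear)
{
	/* Simplified stand-in for SKB_MAX_HEAD(prepad). */
	size_t good_linear = MODEL_PAGE_SIZE - prepad;

	/* Don't use a huge linear part */
	if (linear > good_linear)
		linear = good_linear;

	return linear;
}
```

With a cap like this, a guest-advertised header length of 64K no longer
translates into a multi-page allocation for the skb head.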

Re: [PATCH] uprobes: Add uprobe_task->dup_work/dup_addr

2013-11-12 Thread Srikar Dronamraju

> Yes, and it is always equal to regs->ip when pre_ssout() is called,
> 
> > and do the necessary fixups after single stepping out of line.
> 
> Exactly. So it is write-only (and meaningless) to the generic uprobe
> code. We can (and perhaps should) move it into autask->saved_vaddr,
> arch_uprobe_pre_xol() can initialize it.
> 
> > The casual reading of this commit message, one can get an impression
> > that vaddr is never needed.
> 
> See above. The changelog doesn't say we can simply remove it, it says
> "move it".


Okay, moving to arch_uprobe_task is fine. I probably got confused by 
"First of all it is not really needed," 
> 
> > Your change still retains it.
> 
> 
> OK. How about dup_xol_work/dup_xol_vaddr ?
> 

Yes fine with me.

-- 
Thanks and Regards
Srikar Dronamraju


Re: [RFC][PATCH v5 01/14] sched: add a new arch_sd_local_flags for sched_domain init

2013-11-12 Thread Dietmar Eggemann

On 06/11/13 14:08, Peter Zijlstra wrote:
> On Wed, Nov 06, 2013 at 02:53:44PM +0100, Martin Schwidefsky wrote:
>> On Tue, 5 Nov 2013 23:27:52 +0100
>> Peter Zijlstra  wrote:
>>
>>> On Tue, Nov 05, 2013 at 03:57:23PM +0100, Vincent Guittot wrote:
>>>> Your proposal looks fine for me. It's clearly better to move in one
>>>> place the configuration of sched_domain fields. Have you already got
>>>> an idea about how to let architecture override the topology?
>>>
>>> Maybe something like the below -- completely untested (my s390 compiler
>>> is on a machine that's currently powered off).
>>
>> In principle I do not see a reason why this should not work, but there
>> are a few more things to take care of. E.g. struct sd_data is defined
>> in kernel/sched/core.c, cpu_cpu_mask as well. These need to be moved
>> to a header where arch/s390/kernel/smp.c can pick it up.
>>
>> I do have the feeling that the sched_domain_topology should be left
>> where they are, or do we really want to expose more of the scheduler
>> internals?
> 
> Ah, its a trade off; in that previous patch I removed the entire
> sched_domain initializers the archs used to 'have' to fill out. That
> exposed far too much behavioural stuff the archs really shouldn't
> bother with.
> 
> In return we now provide a (hopefully) simpler interface that allows
> archs to communicate their topology to the scheduler -- without getting
> mixed up in the behavioural aspects (too much).
> 
> Maybe s390 wasn't the best example to pick, as the book domain really
> isn't that exciting. Arguably I should have taken Power7+ and the
> ASYM_PACKING SMT thing.
> 

We actually don't have to expose sched_domain_topology or any internal
scheduler data structures.

We still can get rid of the SD_XXX_INIT stuff and do the sched_domain
initialization for all levels in one function sd_init().

Moreover, we could introduce an arch-specific general function replacing
arch-specific functions for particular flags and levels like
arch_sd_sibling_asym_packing() or Vincent's arch_sd_local_flags().
This arch-specific general function exposes the level and the
sched_domain pointer to the arch, which could then fine-tune the
sched_domain at each individual level.

Below is a patch based on your idea of transforming sd_numa_init()
into sd_init(). The main difference is that I don't try to distinguish
based on power-management-related flags inside sd_init() but rather on
the new sd level data.

Dietmar

8<

From 3df278ad50690a7878c9cc6b18e226805e1f4bd1 Mon Sep 17 00:00:00 2001
From: Dietmar Eggemann 
Date: Tue, 12 Nov 2013 12:37:36 +
Subject: [PATCH] sched: rework sched_domain setup code

This patch removes the sched_domain initializer macros
SD_[SIBLING|MC|BOOK|CPU]_INIT in core.c and in archs and replaces them
with calls to the new function sd_init().  The function sd_init
incorporates the already existing function sd_numa_init().

It introduces preprocessor constants (SD_LVL_[INV|SMT|MC|BOOK|CPU|NUMA])
and replaces 'sched_domain_init_f init' with 'int level' data member in
struct sched_domain_topology_level.

The new data member is used to distinguish the sched_domain level in
sd_init() and is also passed as an argument to the arch specific
function to tweak the sched_domain described below.

To make it still possible for archs to tweak the individual
sched_domain level, a new weak function arch_sd_customize(int level,
struct sched_domain *sd, int cpu) is introduced.
By exposing the sched_domain level and the pointer to the sched_domain
data structure, the archs can tweak individual data members, like the
min or max interval or the flags.  This function also replaces the
existing function arch_sd_sibling_asym_packing() which is specialized
in setting the SD_ASYM_PACKING flag for the SMT sched_domain level.
The parameter cpu is currently not used but could be used in the
future to set up sched_domain structures in one sched_domain level
differently for different cpus.

Initialization of a sched_domain is done in three steps. First, at the
beginning of sd_init(), the sched_domain data members are set which
have the same value for all or at least most of the sched_domain
levels.  Second, sched_domain data members are set for each
sched_domain level individually in sd_init().  Third,
arch_sd_customize() is called in sd_init().
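
The three initialization steps can be sketched in user-space C as below.
Only the SD_LVL_* names and the idea of a weak per-arch hook come from
the description above. The struct layout, the flag bits and the *_model
names are hypothetical, and the weak attribute assumes a GCC-compatible
compiler.

```c
#include <assert.h>

enum {
	SD_LVL_INV, SD_LVL_SMT, SD_LVL_MC,
	SD_LVL_BOOK, SD_LVL_CPU, SD_LVL_NUMA
};

/* Hypothetical flag bits, not the kernel's SD_* values. */
#define FLAG_LOAD_BALANCE	0x1u
#define FLAG_SHARE_CPUPOWER	0x2u
#define FLAG_SHARE_PKG_RES	0x4u

struct sd_model {
	int level;
	unsigned int min_interval;
	unsigned int flags;
};

/* Weak default hook: an architecture may provide its own version. */
__attribute__((weak))
void arch_sd_customize_model(int level, struct sd_model *sd, int cpu)
{
	(void)level;
	(void)sd;
	(void)cpu;
}

static void sd_init_model(struct sd_model *sd, int level, int cpu)
{
	/* Step 1: values shared by all (or most) levels. */
	sd->level = level;
	sd->min_interval = 1;
	sd->flags = FLAG_LOAD_BALANCE;

	/* Step 2: per-level settings. */
	switch (level) {
	case SD_LVL_SMT:
		sd->flags |= FLAG_SHARE_CPUPOWER | FLAG_SHARE_PKG_RES;
		break;
	case SD_LVL_MC:
		sd->flags |= FLAG_SHARE_PKG_RES;
		break;
	default:
		break;
	}

	/* Step 3: let the architecture tweak the result. */
	arch_sd_customize_model(level, sd, cpu);
}
```

An architecture wanting, say, an asymmetric-packing flag on the SMT
level would override the hook and test for level == SD_LVL_SMT there,
instead of needing a dedicated per-flag function.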

One exception is SD_NODE_INIT which this patch removes from
arch/metag/include/asm/topology.h. I don't know how it's been used so
this patch does not provide a metag specific arch_sd_customize()
implementation.

This patch has been tested on ARM TC2 (5 CPUs, sched_domain level MC
and CPU) and compile-tested for x86_64, powerpc (chroma_defconfig) and
mips (ip27_defconfig).

It is against v3.12.

Re: [RFC][PATCH v5 00/14] sched: packing tasks

2013-11-12 Thread Catalin Marinas

On Mon, Nov 11, 2013 at 04:36:30PM +, Peter Zijlstra wrote:
> On Mon, Nov 11, 2013 at 11:33:45AM +, Catalin Marinas wrote:
> 
> tl;dr :-) Still trying to wrap my head around how to do that weird
> topology Vincent raised..

Long email, I know, but topology discussion is a good start ;).

To summarise the rest, I don't see full task packing as useful but
rather getting packing as a result of other decisions (like trying to
estimate the cost of task placement and refining the algorithm from
there). There are ARM SoCs where maximising idle time does not always
mean maximising the energy saving even if the cores can be power-gated
individually (unless you have small workload that doesn't increase the
P-state on the packing CPU).

> > Question for Peter/Ingo: do you want the scheduler to decide on which
> > C-state a CPU should be in or we still leave this to a cpuidle
> > layer/driver?
> 
> I think we can leave most of that in a driver; right along with how to
> prod the hardware to actually get into that state.
> 
> I think the most important parts are what is now 'generic' code; stuff
> that guestimates the idle-time and so forth.
> 
> I think the scheduler simply wants to say: we expect to go idle for X
> ns, we want a guaranteed wakeup latency of Y ns -- go do your thing.

Sounds good (and I think the Linaro guys started looking into this).

> I think you also raised the point in that we do want some feedback as to
> the cost of waking up particular cores to better make decisions on which
> to wake. That is indeed so.

It depends on how we end up implementing the energy awareness in the
scheduler but too simple topology (just which CPUs can be power-gated)
is not that useful.

In a very simplistic and ideal world (note the 'ideal' part), we could
estimate the energy cost of a CPU for a period T:

E = sum(P(Cx) * Tx) + sum(wake-up-energy) + sum(P(Ty) * Ty)

  P(Cx): power in C-state x
  wake-up-energy: the cost of waking up from various C-states
  P(Ty): power of running task y (which also depends on the P-state)
  sum(Tx) + sum(Ty) = T

Assuming that we have such information and can predict (based on past
usage) what the task loads will be, together with other
performance/latency constraints, an 'ideal' scheduler would always
choose the correct C/P states and task placements for optimal energy.
However, the reality is different and even so it would be an NP problem.
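
To make the formula concrete, here is a tiny user-space sketch that just
evaluates E for a given set of idle intervals, wake-ups and task runs.
The types and names (interval, estimate_energy) and the units are
hypothetical. It says nothing about how a scheduler would obtain or
predict these numbers.

```c
#include <assert.h>
#include <stddef.h>

/* One term of a sum: a power level held for some time. */
struct interval {
	double power;	/* P(Cx) or P(Ty), arbitrary units */
	double time;	/* Tx or Ty, arbitrary units */
};

/* E = sum(P(Cx) * Tx) + sum(wake-up-energy) + sum(P(Ty) * Ty) */
static double estimate_energy(const struct interval *idle, size_t n_idle,
			      const double *wakeup, size_t n_wakeup,
			      const struct interval *run, size_t n_run)
{
	double e = 0.0;
	size_t i;

	for (i = 0; i < n_idle; i++)
		e += idle[i].power * idle[i].time;

	for (i = 0; i < n_wakeup; i++)
		e += wakeup[i];

	for (i = 0; i < n_run; i++)
		e += run[i].power * run[i].time;

	return e;
}
```

For instance, one idle interval at power 0.5 for 4 time units, one
wake-up costing 0.25, and one task at power 2 for 6 time units give
E = 2 + 0.25 + 12 = 14.25 over T = 10.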

But we can try to come up with some "guestimates" based on parameters
provided by the SoC (via DT or ACPI tables or just some low-level
driver/arch code). The scheduler does its best according to these
parameters at certain times (task wake-up, idle balance) while the SoC
can still tune the behaviour.

If we roughly estimate the energy cost of a run-queue and the energy
cost of individual tasks on that run-queue (based on their load and
P-state), we could estimate the cost of moving or waking the
task on another CPU (where the task's cost may change depending on
asymmetric configurations or different P-state). We don't even need to
be precise in the energy costs but just some relative numbers so that
the scheduler can favour one CPU or another. If we ignore P-state costs
and only consider C-states and symmetric configurations, we probably get
a behaviour similar to Vincent's task packing patches.

The information we have currently for C-states is target residency and
exit latency. From these I think we can only infer the wake-up energy
cost, not how much we save by placing a CPU into that state. So if we
want the scheduler to decide whether to pack or spread (from an energy
cost perspective), we need additional information in the topology.

Alternatively we could have a power driver which dynamically returns
some estimates every time the scheduler asks for them, with a power
driver for each SoC (which is already the case for ARM SoCs).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: kernel BUG at kernel/kallsyms.c:222!

2013-11-12 Thread Jonathan Austin


On 12/11/13 03:22, Ming Lei wrote:

On Tue, Nov 12, 2013 at 3:32 AM, Russell King - ARM Linux
 wrote:

On Mon, Nov 11, 2013 at 05:15:29PM +, Jonathan Austin wrote:

I've tested the patch below and it solves the ARM side of things - so
gives you an option other than a complete revert. Happy to put this in to
RMK's patch system if you'd prefer not to have to revert and he's happy
with the patch.


I think this is the right solution because it then means that this symbol
has the same meaning whether on MMU or !MMU - and getting rid of these
kinds of gratuitous variances is the only way that !MMU is going to
become less fragile.


The patch only fixes the problem on ARM; other !MMU && !ARM archs
should be affected too.

Also, CONFIG_PAGE_OFFSET is not defined for some archs, such as
64-bit archs.

For now, I suggest filtering only on ARM, as in the attached patch, if we
plan to merge Jonathan's patch. Otherwise a more complicated approach has
to be worked out to do the filtering (for example, define a read-only
symbol in the kernel to store PAGE_OFFSET, and let scripts/kallsyms use it
for filtering).


I'm happy with that approach, though allowing only ARM seems a bit
conservative - is it the only architecture we actually expect to work?

Jonny




Thanks,






[PATCH 09/10] crypto: CCP device driver build files

2013-11-12 Thread Tom Lendacky

These files provide the ability to configure and build the
AMD CCP device driver and crypto API support.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/Kconfig  |   12 
 drivers/crypto/Makefile |1 +
 drivers/crypto/ccp/Kconfig  |   23 +++
 drivers/crypto/ccp/Makefile |   10 ++
 4 files changed, 46 insertions(+)
 create mode 100644 drivers/crypto/ccp/Kconfig
 create mode 100644 drivers/crypto/ccp/Makefile

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index f4fd837..4954d75 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -399,4 +399,16 @@ config CRYPTO_DEV_ATMEL_SHA
  To compile this driver as a module, choose M here: the module
  will be called atmel-sha.
 
+config CRYPTO_DEV_CCP
+   bool "Support for AMD Cryptographic Coprocessor"
+   depends on X86
+   default n
+   help
+ The AMD Cryptographic Coprocessor provides hardware support
+ for encryption, hashing and related operations.
+
+if CRYPTO_DEV_CCP
+   source "drivers/crypto/ccp/Kconfig"
+endif
+
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index b4946dd..8a6c86a 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -22,3 +22,4 @@ obj-$(CONFIG_CRYPTO_DEV_NX) += nx/
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_AES) += atmel-aes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
+obj-$(CONFIG_CRYPTO_DEV_CCP) += ccp/
diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
new file mode 100644
index 000..335ed5c
--- /dev/null
+++ b/drivers/crypto/ccp/Kconfig
@@ -0,0 +1,23 @@
+config CRYPTO_DEV_CCP_DD
+   tristate "Cryptographic Coprocessor device driver"
+   depends on CRYPTO_DEV_CCP
+   default m
+   help
+ Provides the interface to use the AMD Cryptographic Coprocessor
+ which can be used to accelerate or offload encryption operations
+ such as SHA, AES and more. If you choose 'M' here, this module
+ will be called ccp.
+
+config CRYPTO_DEV_CCP_CRYPTO
+   tristate "Encryption and hashing acceleration support"
+   depends on CRYPTO_DEV_CCP_DD
+   default m
+   select CRYPTO_ALGAPI
+   select CRYPTO_HASH
+   select CRYPTO_BLKCIPHER
+   select CRYPTO_AUTHENC
+   help
+ Support for using the cryptographic API with the AMD Cryptographic
+ Coprocessor. This module supports acceleration and offload of SHA
+ and AES algorithms.  If you choose 'M' here, this module will be
+ called ccp_crypto.
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
new file mode 100644
index 000..d3505a0
--- /dev/null
+++ b/drivers/crypto/ccp/Makefile
@@ -0,0 +1,10 @@
+obj-$(CONFIG_CRYPTO_DEV_CCP_DD) += ccp.o
+ccp-objs := ccp-dev.o ccp-ops.o
+ccp-objs += ccp-pci.o
+
+obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
+ccp-crypto-objs := ccp-crypto-main.o \
+  ccp-crypto-aes.o \
+  ccp-crypto-aes-cmac.o \
+  ccp-crypto-aes-xts.o \
+  ccp-crypto-sha.o



[PATCH 05/10] crypto: CCP AES crypto API support

2013-11-12 Thread Tom Lendacky

These routines provide crypto API support for AES on the AMD CCP.

Support for AES modes: ECB, CBC, OFB, CFB and CTR

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-aes.c |  375 +++
 1 file changed, 375 insertions(+)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes.c

diff --git a/drivers/crypto/ccp/ccp-crypto-aes.c b/drivers/crypto/ccp/ccp-crypto-aes.c
new file mode 100644
index 000..f302a5b7
--- /dev/null
+++ b/drivers/crypto/ccp/ccp-crypto-aes.c
@@ -0,0 +1,375 @@
+/*
+ * AMD Cryptographic Coprocessor (CCP) AES crypto API support
+ *
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ccp-crypto.h"
+
+
+static int ccp_aes_complete(struct crypto_async_request *async_req, int ret)
+{
+   struct ablkcipher_request *req = ablkcipher_request_cast(async_req);
+   struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+   struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req);
+
+   if (ret)
+   return ret;
+
+   if (ctx->u.aes.mode != CCP_AES_MODE_ECB)
+   memcpy(req->info, rctx->iv, AES_BLOCK_SIZE);
+
+   return 0;
+}
+
+static int ccp_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+ unsigned int key_len)
+{
+   struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ablkcipher_tfm(tfm));
+   struct ccp_crypto_ablkcipher_alg *alg =
+   ccp_crypto_ablkcipher_alg(crypto_ablkcipher_tfm(tfm));
+
+   switch (key_len) {
+   case AES_KEYSIZE_128:
+   ctx->u.aes.type = CCP_AES_TYPE_128;
+   break;
+   case AES_KEYSIZE_192:
+   ctx->u.aes.type = CCP_AES_TYPE_192;
+   break;
+   case AES_KEYSIZE_256:
+   ctx->u.aes.type = CCP_AES_TYPE_256;
+   break;
+   default:
+   crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   return -EINVAL;
+   }
+   ctx->u.aes.mode = alg->mode;
+   ctx->u.aes.key_len = key_len;
+
+   memcpy(ctx->u.aes.key, key, key_len);
+   sg_init_one(&ctx->u.aes.key_sg, ctx->u.aes.key, key_len);
+
+   return 0;
+}
+
+static int ccp_aes_crypt(struct ablkcipher_request *req, bool encrypt)
+{
+   struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+   struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req);
+   struct scatterlist *iv_sg = NULL;
+   unsigned int iv_len = 0;
+   int ret;
+
+   if (!ctx->u.aes.key_len) {
+   pr_err("AES key not set\n");
+   return -EINVAL;
+   }
+
+   if (((ctx->u.aes.mode == CCP_AES_MODE_ECB) ||
+(ctx->u.aes.mode == CCP_AES_MODE_CBC) ||
+(ctx->u.aes.mode == CCP_AES_MODE_CFB)) &&
+   (req->nbytes & (AES_BLOCK_SIZE - 1))) {
+   pr_err("AES request size is not a multiple of the block size\n");
+   return -EINVAL;
+   }
+
+   if (ctx->u.aes.mode != CCP_AES_MODE_ECB) {
+   if (!req->info) {
+   pr_err("AES IV not supplied");
+   return -EINVAL;
+   }
+
+   memcpy(rctx->iv, req->info, AES_BLOCK_SIZE);
+   iv_sg = &rctx->iv_sg;
+   iv_len = AES_BLOCK_SIZE;
+   sg_init_one(iv_sg, rctx->iv, iv_len);
+   }
+
+   memset(&rctx->cmd, 0, sizeof(rctx->cmd));
+   INIT_LIST_HEAD(&rctx->cmd.entry);
+   rctx->cmd.engine = CCP_ENGINE_AES;
+   rctx->cmd.u.aes.type = ctx->u.aes.type;
+   rctx->cmd.u.aes.mode = ctx->u.aes.mode;
+   rctx->cmd.u.aes.action =
+   (encrypt) ? CCP_AES_ACTION_ENCRYPT : CCP_AES_ACTION_DECRYPT;
+   rctx->cmd.u.aes.key = &ctx->u.aes.key_sg;
+   rctx->cmd.u.aes.key_len = ctx->u.aes.key_len;
+   rctx->cmd.u.aes.iv = iv_sg;
+   rctx->cmd.u.aes.iv_len = iv_len;
+   rctx->cmd.u.aes.src = req->src;
+   rctx->cmd.u.aes.src_len = req->nbytes;
+   rctx->cmd.u.aes.dst = req->dst;
+
+   ret = ccp_crypto_enqueue_request(&req->base, &rctx->cmd);
+
+   return ret;
+}
+
+static int ccp_aes_encrypt(struct ablkcipher_request *req)
+{
+   return ccp_aes_crypt(req, true);
+}
+
+static int ccp_aes_decrypt(struct ablkcipher_request *req)
+{
+   return ccp_aes_crypt(req, false);
+}
+
+static int ccp_aes_cra_init(struct crypto_tfm *tfm)
+{
+   struct ccp_ctx *ctx = crypto_tfm_ctx(tfm);
+
+   ctx->complete = ccp_aes_complete;
+   ctx->u.aes.key_len = 0;
+
+   tfm->crt_ablkcipher.reqsize = sizeof(struct ccp_aes_req_ctx);
+
+   return 0;
+}
+
+static void ccp_aes_cra_exit(struct crypto_tfm *tfm)
+{
+}
+
+static int ccp_aes_rfc3686_complete(struct crypto_async_request
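
The key-size switch in ccp_aes_setkey and the block-size check in
ccp_aes_crypt above can be mirrored in a small user-space sketch. The
SKETCH_* names are invented for the example; only the key lengths
(16/24/32 bytes) and the 16-byte AES block size are standard values:

```c
#include <stdbool.h>

#define SKETCH_AES_BLOCK_SIZE 16	/* AES block size in bytes */

/* Hypothetical stand-ins for the CCP_AES_TYPE_* values. */
enum sketch_aes_type {
	SKETCH_AES_128,
	SKETCH_AES_192,
	SKETCH_AES_256,
	SKETCH_AES_INVALID,
};

/* Map a key length in bytes to an AES type, rejecting anything else. */
static enum sketch_aes_type sketch_aes_type_for_key(unsigned int key_len)
{
	switch (key_len) {
	case 16:
		return SKETCH_AES_128;
	case 24:
		return SKETCH_AES_192;
	case 32:
		return SKETCH_AES_256;
	default:
		return SKETCH_AES_INVALID;
	}
}

/* True when nbytes is a multiple of the block size (a power of two). */
static bool sketch_block_aligned(unsigned int nbytes)
{
	return (nbytes & (SKETCH_AES_BLOCK_SIZE - 1)) == 0;
}
```

The mask test works only because the block size is a power of two; in
the patch it is applied to the ECB, CBC and CFB modes.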

[PATCH 02/10] crypto: scatterwalk - Set the chain pointer indication bit

2013-11-12 Thread Tom Lendacky

The scatterwalk_crypto_chain function invokes the scatterwalk_sg_chain
function to chain two scatterlists, but the chain pointer indication
bit is not set.  When the resulting scatterlist is used, for example,
by sg_nents to count the number of scatterlist entries, a segfault occurs
because sg_nents does not follow the chain pointer to the chained scatterlist.

Update scatterwalk_sg_chain to set the chain pointer indication bit as is
done by the sg_chain function.

Signed-off-by: Tom Lendacky 
---
 include/crypto/scatterwalk.h |1 +
 1 file changed, 1 insertion(+)

diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h
index 13621cc..64ebede 100644
--- a/include/crypto/scatterwalk.h
+++ b/include/crypto/scatterwalk.h
@@ -36,6 +36,7 @@ static inline void scatterwalk_sg_chain(struct scatterlist *sg1, int num,
 {
sg_set_page(&sg1[num - 1], (void *)sg2, 0, 0);
sg1[num - 1].page_link &= ~0x02;
+   sg1[num - 1].page_link |= 0x01;
 }
 
 static inline struct scatterlist *scatterwalk_sg_next(struct scatterlist *sg)
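
To illustrate why the missing bit matters, here is a user-space sketch
with a mock scatterlist. It assumes the usual page_link convention (bit 0
marks a chain pointer, bit 1 marks the last entry); the mock_* names are
invented and this is not the kernel implementation:

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_SG_CHAIN	((uintptr_t)0x01)	/* entry points to next list */
#define MOCK_SG_END	((uintptr_t)0x02)	/* entry is the last one */

struct mock_sg {
	uintptr_t page_link;
};

/* Turn the last slot of sg1 into a chain pointer to sg2. */
static void mock_sg_chain(struct mock_sg *sg1, size_t num, struct mock_sg *sg2)
{
	uintptr_t link = (uintptr_t)sg2;	/* low two bits are zero */

	link &= ~MOCK_SG_END;
	link |= MOCK_SG_CHAIN;			/* the bit the patch adds */
	sg1[num - 1].page_link = link;
}

/* Count data entries, following chain pointers, until the END bit. */
static size_t mock_sg_nents(struct mock_sg *sg)
{
	size_t n = 0;

	for (;;) {
		if (sg->page_link & MOCK_SG_CHAIN) {
			sg = (struct mock_sg *)(sg->page_link & ~(uintptr_t)0x03);
			continue;
		}
		n++;
		if (sg->page_link & MOCK_SG_END)
			break;
		sg++;
	}
	return n;
}

/* Two data entries plus a chain slot, chained to two more data entries. */
static size_t mock_demo_chained_count(void)
{
	struct mock_sg a[3] = {{0}}, b[2] = {{0}};

	b[1].page_link |= MOCK_SG_END;
	mock_sg_chain(a, 3, b);
	return mock_sg_nents(a);
}
```

In this mock, leaving out the MOCK_SG_CHAIN bit would make the walker
treat the chain slot as data and step past the end of the first array,
which is the kind of failure the commit message describes.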



[PATCH 01/10] crypto: authenc - Find proper IV address in ablkcipher callback

2013-11-12 Thread Tom Lendacky

When performing an asynchronous ablkcipher operation the authenc
completion callback routine is invoked, but it does not locate and use
the proper IV.

The callback routine, crypto_authenc_encrypt_done, is updated to use
the same method of calculating the address of the IV as is done in the
crypto_authenc_encrypt function, which sets up the callback.

Signed-off-by: Tom Lendacky 
---
 crypto/authenc.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/crypto/authenc.c b/crypto/authenc.c
index ffce19d..528b00b 100644
--- a/crypto/authenc.c
+++ b/crypto/authenc.c
@@ -368,9 +368,10 @@ static void crypto_authenc_encrypt_done(struct crypto_async_request *req,
if (!err) {
struct crypto_aead *authenc = crypto_aead_reqtfm(areq);
struct crypto_authenc_ctx *ctx = crypto_aead_ctx(authenc);
-   struct ablkcipher_request *abreq = aead_request_ctx(areq);
-   u8 *iv = (u8 *)(abreq + 1) +
-crypto_ablkcipher_reqsize(ctx->enc);
+   struct authenc_request_ctx *areq_ctx = aead_request_ctx(areq);
+   struct ablkcipher_request *abreq = (void *)(areq_ctx->tail
+   + ctx->reqoff);
+   u8 *iv = (u8 *)abreq - crypto_ablkcipher_ivsize(ctx->enc);
 
err = crypto_authenc_genicv(areq, iv, 0);
}
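
The address arithmetic in the added lines can be sketched on a plain byte
buffer. This assumes, as the patched code implies, that the IV is stored
immediately before the cipher request inside the context tail; the
buffer size and the names are made up for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative request-context tail; 128 bytes is an arbitrary size. */
static uint8_t sketch_tail[128];

/* reqoff must be at least ivsize for the result to stay inside tail. */
static uint8_t *sketch_iv_addr(uint8_t *tail, size_t reqoff, size_t ivsize)
{
	uint8_t *abreq = tail + reqoff;	/* start of the cipher request */

	return abreq - ivsize;		/* IV sits directly before it */
}
```

With reqoff = 48 and a 16-byte IV, the IV starts at offset 32 of the
tail, independent of how large the cipher request itself is.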



[PATCH 10/10] crypto: CCP maintainer information

2013-11-12 Thread Tom Lendacky

Update the MAINTAINERS file for the AMD CCP device driver.

Signed-off-by: Tom Lendacky 
---
 MAINTAINERS |7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 051e4dc..de22604 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -525,6 +525,13 @@ F: drivers/tty/serial/altera_jtaguart.c
 F: include/linux/altera_uart.h
 F: include/linux/altera_jtaguart.h
 
+AMD CRYPTOGRAPHIC COPROCESSOR (CCP) DRIVER
+M: Tom Lendacky 
+L: linux-cry...@vger.kernel.org
+S: Supported
+F: drivers/crypto/ccp/
+F: include/linux/ccp.h
+
 AMD FAM15H PROCESSOR POWER MONITORING DRIVER
 M: Andreas Herrmann 
 L: lm-sens...@lm-sensors.org



[PATCH 06/10] crypto: CCP AES CMAC mode crypto API support

2013-11-12 Thread Tom Lendacky

These routines provide crypto API support for the CMAC mode of AES
on the AMD CCP.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-aes-cmac.c |  355 ++
 1 file changed, 355 insertions(+)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-cmac.c

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-cmac.c b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
new file mode 100644
index 000..5b9cd98
--- /dev/null
+++ b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
@@ -0,0 +1,355 @@
+/*
+ * AMD Cryptographic Coprocessor (CCP) AES CMAC crypto API support
+ *
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ccp-crypto.h"
+
+
+static int ccp_aes_cmac_complete(struct crypto_async_request *async_req,
+int ret)
+{
+   struct ahash_request *req = ahash_request_cast(async_req);
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+   struct ccp_aes_cmac_req_ctx *rctx = ahash_request_ctx(req);
+   unsigned int digest_size = crypto_ahash_digestsize(tfm);
+
+   if (ret)
+   goto e_free;
+
+   if (rctx->hash_rem) {
+   /* Save remaining data to buffer */
+   scatterwalk_map_and_copy(rctx->buf, rctx->cmd.u.aes.src,
+rctx->hash_cnt, rctx->hash_rem, 0);
+   rctx->buf_count = rctx->hash_rem;
+   } else
+   rctx->buf_count = 0;
+
+   memcpy(req->result, rctx->iv, digest_size);
+
+e_free:
+   sg_free_table(&rctx->data_sg);
+
+   return ret;
+}
+
+static int ccp_do_cmac_update(struct ahash_request *req, unsigned int nbytes,
+ unsigned int final)
+{
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+   struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
+   struct ccp_aes_cmac_req_ctx *rctx = ahash_request_ctx(req);
+   struct scatterlist *sg, *cmac_key_sg = NULL;
+   unsigned int block_size =
+   crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
+   unsigned int len, need_pad, sg_count;
+   int ret;
+
+   if (!ctx->u.aes.key_len) {
+   pr_err("AES key not set\n");
+   return -EINVAL;
+   }
+
+   if (nbytes)
+   rctx->null_msg = 0;
+
+   if (!final && ((nbytes + rctx->buf_count) <= block_size)) {
+   scatterwalk_map_and_copy(rctx->buf + rctx->buf_count, req->src,
+0, nbytes, 0);
+   rctx->buf_count += nbytes;
+
+   return 0;
+   }
+
+   len = rctx->buf_count + nbytes;
+
+   rctx->final = final;
+   rctx->hash_cnt = final ? len : len & ~(block_size - 1);
+   rctx->hash_rem = final ?   0 : len &  (block_size - 1);
+   if (!final && (rctx->hash_cnt == len)) {
+   /* CCP can't do zero length final, so keep some data around */
+   rctx->hash_cnt -= block_size;
+   rctx->hash_rem = block_size;
+   }
+
+   if (final && (rctx->null_msg || (len & (block_size - 1))))
+   need_pad = 1;
+   else
+   need_pad = 0;
+
+   sg_init_one(&rctx->iv_sg, rctx->iv, sizeof(rctx->iv));
+
+   /* Build the data scatterlist table - allocate enough entries for all
+* possible data pieces (buffer, input data, padding)
+*/
+   sg_count = (nbytes) ? sg_nents(req->src) + 2 : 2;
+   ret = sg_alloc_table(&rctx->data_sg, sg_count, GFP_KERNEL);
+   if (ret)
+   return ret;
+
+   sg = NULL;
+   if (rctx->buf_count) {
+   sg_init_one(&rctx->buf_sg, rctx->buf, rctx->buf_count);
+   sg = ccp_crypto_sg_table_add(&rctx->data_sg, &rctx->buf_sg);
+   }
+
+   if (nbytes)
+   sg = ccp_crypto_sg_table_add(&rctx->data_sg, req->src);
+
+   if (need_pad) {
+   int pad_length = block_size - (len & (block_size - 1));
+
+   rctx->hash_cnt += pad_length;
+
+   memset(rctx->pad, 0, sizeof(rctx->pad));
+   rctx->pad[0] = 0x80;
+   sg_init_one(&rctx->pad_sg, rctx->pad, pad_length);
+   sg = ccp_crypto_sg_table_add(&rctx->data_sg, &rctx->pad_sg);
+   }
+   if (sg)
+   sg_mark_end(sg);
+
+   /* Initialize the K1/K2 scatterlist */
+   if (final)
+   cmac_key_sg = (need_pad) ? &ctx->u.aes.k2_sg
+: &ctx->u.aes.k1_sg;
+
+   memset(&rctx->cmd, 0, sizeof(rctx->cmd));
+   INIT_LIST_HEAD(&rctx->cmd.entry);
+   rctx->cmd.engine = CCP_ENGINE_AES;
+   rctx->cmd.u.aes.type = ctx->u.aes.type;
+   rctx->cmd.u.aes.mode = ctx->u.aes.mode;
+   rctx->cmd.u.aes.action =
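
The length bookkeeping in ccp_do_cmac_update above (hash_cnt, hash_rem
and the padding length) can be followed more easily in isolation. The
sketch below copies that arithmetic into a stand-alone helper; the struct
and function names are invented. As in the patch, it assumes block_size
is a power of two and that a non-final call only gets here with
len > block_size:

```c
/* Hypothetical result of splitting len bytes for one CMAC operation. */
struct sketch_cmac_split {
	unsigned int hash_cnt;	/* bytes handed to the engine now */
	unsigned int hash_rem;	/* bytes kept back for the next call */
	unsigned int pad_len;	/* padding bytes appended (0 if none) */
};

static struct sketch_cmac_split sketch_cmac_split(unsigned int len,
						  unsigned int block_size,
						  int final, int null_msg)
{
	struct sketch_cmac_split s;
	unsigned int partial = len & (block_size - 1);

	s.hash_cnt = final ? len : len & ~(block_size - 1);
	s.hash_rem = final ? 0 : partial;
	s.pad_len = 0;

	/* Keep one full block back so the final call is never empty. */
	if (!final && s.hash_cnt == len) {
		s.hash_cnt -= block_size;
		s.hash_rem = block_size;
	}

	/* Pad on final if the message is empty or ends mid-block. */
	if (final && (null_msg || partial)) {
		s.pad_len = block_size - partial;
		s.hash_cnt += s.pad_len;
	}

	return s;
}
```

For a 16-byte block: 40 bytes non-final gives 32 hashed and 8 kept;
32 bytes non-final gives 16 hashed and 16 kept; 40 bytes final gives
8 bytes of padding and 48 hashed.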

[PATCH 04/10] crypto: crypto API interface to the CCP device driver

2013-11-12 Thread Tom Lendacky

These routines provide the support for the interface between the crypto API
and the AMD CCP. This includes ensuring that requests associated with a
given tfm on the same cpu are processed in the order received.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-main.c |  432 ++
 drivers/crypto/ccp/ccp-crypto.h  |  191 +++
 2 files changed, 623 insertions(+)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-main.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto.h

diff --git a/drivers/crypto/ccp/ccp-crypto-main.c b/drivers/crypto/ccp/ccp-crypto-main.c
new file mode 100644
index 000..2636f04
--- /dev/null
+++ b/drivers/crypto/ccp/ccp-crypto-main.c
@@ -0,0 +1,432 @@
+/*
+ * AMD Cryptographic Coprocessor (CCP) crypto API support
+ *
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ccp-crypto.h"
+
+MODULE_AUTHOR("Tom Lendacky ");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.0.0");
+MODULE_DESCRIPTION("AMD Cryptographic Coprocessor crypto API support");
+
+
+/* List heads for the supported algorithms */
+static LIST_HEAD(hash_algs);
+static LIST_HEAD(cipher_algs);
+
+/* For any tfm, requests for that tfm on the same CPU must be returned
+ * in the order received.  With multiple queues available, the CCP can
+ * process more than one cmd at a time.  Therefore we must maintain
+ * a cmd list to insure the proper ordering of requests on a given tfm/cpu
+ * combination.
+ */
+struct ccp_crypto_cpu_queue {
+   struct list_head cmds;
+   struct list_head *backlog;
+   unsigned int cmd_count;
+};
+#define CCP_CRYPTO_MAX_QLEN 50
+
+struct ccp_crypto_percpu_queue {
+   struct ccp_crypto_cpu_queue __percpu *cpu_queue;
+};
+static struct ccp_crypto_percpu_queue req_queue;
+
+struct ccp_crypto_cmd {
+   struct list_head entry;
+
+   struct ccp_cmd *cmd;
+
+   /* Save the crypto_tfm and crypto_async_request addresses
+* separately to avoid any reference to a possibly invalid
+* crypto_async_request structure after invoking the request
+* callback
+*/
+   struct crypto_async_request *req;
+   struct crypto_tfm *tfm;
+
+   /* Used for held command processing to determine state */
+   int ret;
+
+   int cpu;
+};
+
+struct ccp_crypto_cpu {
+   struct work_struct work;
+   struct completion completion;
+   struct ccp_crypto_cmd *crypto_cmd;
+   int err;
+};
+
+
+static inline bool ccp_crypto_success(int err)
+{
+   if (err && (err != -EINPROGRESS) && (err != -EBUSY))
+   return false;
+
+   return true;
+}
+
+/*
+ * ccp_crypto_cmd_complete must be called while running on the appropriate
+ * cpu and the caller must have done a get_cpu to disable preemption
+ */
+static struct ccp_crypto_cmd *ccp_crypto_cmd_complete(
+   struct ccp_crypto_cmd *crypto_cmd, struct ccp_crypto_cmd **backlog)
+{
+   struct ccp_crypto_cpu_queue *cpu_queue;
+   struct ccp_crypto_cmd *held = NULL, *tmp;
+
+   *backlog = NULL;
+
+   cpu_queue = this_cpu_ptr(req_queue.cpu_queue);
+
+   /* Held cmds will be after the current cmd in the queue so start
+* searching for a cmd with a matching tfm for submission.
+*/
+   tmp = crypto_cmd;
+   list_for_each_entry_continue(tmp, &cpu_queue->cmds, entry) {
+   if (crypto_cmd->tfm != tmp->tfm)
+   continue;
+   held = tmp;
+   break;
+   }
+
+   /* Process the backlog:
+*   Because cmds can be executed from any point in the cmd list
+*   special precautions have to be taken when handling the backlog.
+*/
+   if (cpu_queue->backlog != &cpu_queue->cmds) {
+   /* Skip over this cmd if it is the next backlog cmd */
+   if (cpu_queue->backlog == &crypto_cmd->entry)
+   cpu_queue->backlog = crypto_cmd->entry.next;
+
+   *backlog = container_of(cpu_queue->backlog,
+   struct ccp_crypto_cmd, entry);
+   cpu_queue->backlog = cpu_queue->backlog->next;
+
+   /* Skip over this cmd if it is now the next backlog cmd */
+   if (cpu_queue->backlog == &crypto_cmd->entry)
+   cpu_queue->backlog = crypto_cmd->entry.next;
+   }
+
+   /* Remove the cmd entry from the list of cmds */
+   cpu_queue->cmd_count--;
+   list_del(&crypto_cmd->entry);
+
+   return held;
+}
+
+static void ccp_crypto_complete_on_cpu(struct work_struct *work)
+{
+   struct ccp_crypto_cpu *cpu_work =
+   container_of(work, struct ccp_crypto_cpu, work);
+   struct ccp_crypto_cmd
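
The "held command" search in ccp_crypto_cmd_complete above walks the
queue from the completing command and stops at the next command that
uses the same tfm. A simplified sketch of that search over a plain array
(the types and names are hypothetical; the driver uses a linked list and
tfm pointers):

```c
#include <stddef.h>

/* Hypothetical queued command; tfm_id stands in for the tfm pointer. */
struct sketch_cmd {
	int tfm_id;
};

/*
 * Return the index of the first command after 'cur' that shares its
 * tfm, or n if no later command is being held for that tfm.
 */
static size_t sketch_next_held(const struct sketch_cmd *q, size_t n,
			       size_t cur)
{
	for (size_t i = cur + 1; i < n; i++)
		if (q[i].tfm_id == q[cur].tfm_id)
			return i;
	return n;
}
```

With a queue of tfm ids {1, 2, 1, 2}, completing entry 0 releases
entry 2, completing entry 1 releases entry 3, and completing entry 2
releases nothing.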

[PATCH 08/10] crypto: CCP SHA crypto API support

2013-11-12 Thread Tom Lendacky

These routines provide crypto API support for SHA1, SHA224 and SHA256
on the AMD CCP.  HMAC support for these SHA modes is also provided.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-sha.c |  497 +++
 1 file changed, 497 insertions(+)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-sha.c

diff --git a/drivers/crypto/ccp/ccp-crypto-sha.c b/drivers/crypto/ccp/ccp-crypto-sha.c
new file mode 100644
index 000..44ff00a
--- /dev/null
+++ b/drivers/crypto/ccp/ccp-crypto-sha.c
@@ -0,0 +1,497 @@
+/*
+ * AMD Cryptographic Coprocessor (CCP) SHA crypto API support
+ *
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ccp-crypto.h"
+
+
+struct ccp_sha_result {
+   struct completion completion;
+   int err;
+};
+
+static void ccp_sync_hash_complete(struct crypto_async_request *req, int err)
+{
+   struct ccp_sha_result *result = req->data;
+
+   if (err == -EINPROGRESS)
+   return;
+
+   result->err = err;
+   complete(&result->completion);
+}
+
+static int ccp_sync_hash(struct crypto_ahash *tfm, u8 *buf,
+struct scatterlist *sg, unsigned int len)
+{
+   struct ccp_sha_result result;
+   struct ahash_request *req;
+   int ret;
+
+   init_completion(&result.completion);
+
+   req = ahash_request_alloc(tfm, GFP_KERNEL);
+   if (!req)
+   return -ENOMEM;
+
+   ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+  ccp_sync_hash_complete, &result);
+   ahash_request_set_crypt(req, sg, buf, len);
+
+   ret = crypto_ahash_digest(req);
+   if ((ret == -EINPROGRESS) || (ret == -EBUSY)) {
+   ret = wait_for_completion_interruptible(&result.completion);
+   if (!ret)
+   ret = result.err;
+   }
+
+   ahash_request_free(req);
+
+   return ret;
+}
+
+static int ccp_sha_finish_hmac(struct crypto_async_request *async_req)
+{
+   struct ahash_request *req = ahash_request_cast(async_req);
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+   struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
+   struct scatterlist sg[2];
+   unsigned int block_size =
+   crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
+   unsigned int digest_size = crypto_ahash_digestsize(tfm);
+
+   sg_init_table(sg, ARRAY_SIZE(sg));
+   sg_set_buf(&sg[0], ctx->u.sha.opad, block_size);
+   sg_set_buf(&sg[1], req->result, digest_size);
+
+   return ccp_sync_hash(ctx->u.sha.hmac_tfm, req->result, sg,
+block_size + digest_size);
+}
+
+static int ccp_sha_complete(struct crypto_async_request *async_req, int ret)
+{
+   struct ahash_request *req = ahash_request_cast(async_req);
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+   struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
+   struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
+   unsigned int digest_size = crypto_ahash_digestsize(tfm);
+
+   if (ret)
+   goto e_free;
+
+   if (rctx->hash_rem) {
+   /* Save remaining data to buffer */
+   scatterwalk_map_and_copy(rctx->buf, rctx->cmd.u.sha.src,
+rctx->hash_cnt, rctx->hash_rem, 0);
+   rctx->buf_count = rctx->hash_rem;
+   } else
+   rctx->buf_count = 0;
+
+   memcpy(req->result, rctx->ctx, digest_size);
+
+   /* If we're doing an HMAC, we need to perform that on the final op */
+   if (rctx->final && ctx->u.sha.key_len)
+   ret = ccp_sha_finish_hmac(async_req);
+
+e_free:
+   sg_free_table(&rctx->data_sg);
+
+   return ret;
+}
+
+static int ccp_do_sha_update(struct ahash_request *req, unsigned int nbytes,
+unsigned int final)
+{
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+   struct ccp_ctx *ctx = crypto_ahash_ctx(tfm);
+   struct ccp_sha_req_ctx *rctx = ahash_request_ctx(req);
+   struct scatterlist *sg;
+   unsigned int block_size =
+   crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
+   unsigned int len, sg_count;
+   int ret;
+
+   if (!final && ((nbytes + rctx->buf_count) <= block_size)) {
+   scatterwalk_map_and_copy(rctx->buf + rctx->buf_count, req->src,
+0, nbytes, 0);
+   rctx->buf_count += nbytes;
+
+   return 0;
+   }
+
+   len = rctx->buf_count + nbytes;
+
+   rctx->final = final;
+   rctx->hash_cnt = final ? len : len & ~(block_size - 1);
+   rctx->hash_rem = final ?   0 : len &

[PATCH 07/10] crypto: CCP XTS-AES crypto API support

2013-11-12 Thread Tom Lendacky

These routines provide crypto API support for the XTS-AES mode of AES
on the AMD CCP.

Signed-off-by: Tom Lendacky 
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c |  285 +++
 1 file changed, 285 insertions(+)
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-xts.c

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
new file mode 100644
index 000..d100b48
--- /dev/null
+++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
@@ -0,0 +1,285 @@
+/*
+ * AMD Cryptographic Coprocessor (CCP) AES XTS crypto API support
+ *
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ccp-crypto.h"
+
+
+struct ccp_aes_xts_def {
+   const char *name;
+   const char *drv_name;
+};
+
+static struct ccp_aes_xts_def aes_xts_algs[] = {
+   {
+   .name   = "xts(aes)",
+   .drv_name   = "xts-aes-ccp",
+   },
+};
+
+struct ccp_unit_size_map {
+   unsigned int size;
+   u32 value;
+};
+
+static struct ccp_unit_size_map unit_size_map[] = {
+   {
+   .size   = 4096,
+   .value  = CCP_XTS_AES_UNIT_SIZE_4096,
+   },
+   {
+   .size   = 2048,
+   .value  = CCP_XTS_AES_UNIT_SIZE_2048,
+   },
+   {
+   .size   = 1024,
+   .value  = CCP_XTS_AES_UNIT_SIZE_1024,
+   },
+   {
+   .size   = 512,
+   .value  = CCP_XTS_AES_UNIT_SIZE_512,
+   },
+   {
+   .size   = 256,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 128,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 64,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 32,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+   {
+   .size   = 16,
+   .value  = CCP_XTS_AES_UNIT_SIZE_16,
+   },
+   {
+   .size   = 1,
+   .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
+   },
+};
+
+static int ccp_aes_xts_complete(struct crypto_async_request *async_req, int ret)
+{
+   struct ablkcipher_request *req = ablkcipher_request_cast(async_req);
+   struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req);
+
+   if (ret)
+   return ret;
+
+   memcpy(req->info, rctx->iv, AES_BLOCK_SIZE);
+
+   return 0;
+}
+
+static int ccp_aes_xts_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
+ unsigned int key_len)
+{
+   struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ablkcipher_tfm(tfm));
+
+   /* Only support 128-bit AES key with a 128-bit Tweak key,
+* otherwise use the fallback
+*/
+   switch (key_len) {
+   case AES_KEYSIZE_128 * 2:
+   memcpy(ctx->u.aes.key, key, key_len);
+   break;
+   }
+   ctx->u.aes.key_len = key_len / 2;
+   sg_init_one(&ctx->u.aes.key_sg, ctx->u.aes.key, key_len);
+
+   return crypto_ablkcipher_setkey(ctx->u.aes.tfm_ablkcipher, key,
+   key_len);
+}
+
+static int ccp_aes_xts_crypt(struct ablkcipher_request *req,
+unsigned int encrypt)
+{
+   struct crypto_tfm *tfm =
+   crypto_ablkcipher_tfm(crypto_ablkcipher_reqtfm(req));
+   struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+   struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req);
+   unsigned int unit;
+   int ret;
+
+   if (!ctx->u.aes.key_len) {
+   pr_err("AES key not set\n");
+   return -EINVAL;
+   }
+
+   if (req->nbytes & (AES_BLOCK_SIZE - 1)) {
+   pr_err("AES request size is not a multiple of the block size\n");
+   return -EINVAL;
+   }
+
+   if (!req->info) {
+   pr_err("AES IV not supplied");
+   return -EINVAL;
+   }
+
+   for (unit = 0; unit < ARRAY_SIZE(unit_size_map); unit++)
+   if (!(req->nbytes & (unit_size_map[unit].size - 1)))
+   break;
+
+   if ((unit_size_map[unit].value == CCP_XTS_AES_UNIT_SIZE__LAST) ||
+   (ctx->u.aes.key_len != AES_KEYSIZE_128)) {
+   /* Use the fallback to process the request for any
+* unsupported unit sizes or key sizes
+*/
+   ablkcipher_request_set_tfm(req, ctx->u.aes.tfm_ablkcipher);
+   ret = (encrypt) ? crypto_ablkcipher_encrypt(req) :
+ crypto_ablkcipher_decrypt(req);
+
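
The unit-size lookup in ccp_aes_xts_crypt above picks the first table
entry whose size evenly divides the request length, which, with the
table sorted largest first, is the largest power of two dividing it.
A stand-alone sketch of that lookup follows; the SKETCH_* enum values
are placeholders for the CCP_XTS_AES_UNIT_SIZE_* constants, whose real
values are not shown in this excerpt:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the CCP_XTS_AES_UNIT_SIZE_* values. */
enum sketch_unit {
	SKETCH_UNIT_16,
	SKETCH_UNIT_512,
	SKETCH_UNIT_1024,
	SKETCH_UNIT_2048,
	SKETCH_UNIT_4096,
	SKETCH_UNIT_UNSUPPORTED,	/* plays the role of ..._SIZE__LAST */
};

struct sketch_unit_map {
	unsigned int size;
	enum sketch_unit value;
};

/* Largest size first, same order as unit_size_map[] in the patch. */
static const struct sketch_unit_map sketch_map[] = {
	{ 4096, SKETCH_UNIT_4096 },
	{ 2048, SKETCH_UNIT_2048 },
	{ 1024, SKETCH_UNIT_1024 },
	{  512, SKETCH_UNIT_512 },
	{  256, SKETCH_UNIT_UNSUPPORTED },
	{  128, SKETCH_UNIT_UNSUPPORTED },
	{   64, SKETCH_UNIT_UNSUPPORTED },
	{   32, SKETCH_UNIT_UNSUPPORTED },
	{   16, SKETCH_UNIT_16 },
	{    1, SKETCH_UNIT_UNSUPPORTED },
};

/* Return the unit for nbytes, or UNSUPPORTED to signal the fallback. */
static enum sketch_unit sketch_xts_unit(unsigned int nbytes)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_map) / sizeof(sketch_map[0]); i++)
		if (!(nbytes & (sketch_map[i].size - 1)))
			return sketch_map[i].value;

	/* Not reached: the size-1 entry matches every length. */
	return SKETCH_UNIT_UNSUPPORTED;
}
```

The final size-1 entry acts as a sentinel, so the loop always stops
inside the table. For example, 1536 bytes maps to the 512-byte unit,
while 768 bytes first matches the 256 entry and would take the fallback.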

[PATCH 00/10] AMD Cryptographic Coprocessor support

2013-11-12 Thread Tom Lendacky

The following series implements support for the AMD Cryptographic
Coprocessor (CCP).  The AMD CCP provides hardware encryption, hashing
and other related operations.

This patch series is based on the 3.12 kernel.

---

Tom Lendacky (10):
  crypto: authenc - Find proper IV address in ablkcipher callback
  crypto: scatterwalk - Set the chain pointer indication bit
  crypto: CCP device driver and interface support
  crypto: crypto API interface to the CCP device driver
  crypto: CCP AES crypto API support
  crypto: CCP AES CMAC mode crypto API support
  crypto: CCP XTS-AES crypto API support
  crypto: CCP SHA crypto API support
  crytpo: CCP device driver build files
  crypto: CCP maintainer information


 MAINTAINERS  |7 
 crypto/authenc.c |7 
 drivers/crypto/Kconfig   |   12 
 drivers/crypto/Makefile  |1 
 drivers/crypto/ccp/Kconfig   |   23 
 drivers/crypto/ccp/Makefile  |   10 
 drivers/crypto/ccp/ccp-crypto-aes-cmac.c |  355 +
 drivers/crypto/ccp/ccp-crypto-aes-xts.c  |  285 
 drivers/crypto/ccp/ccp-crypto-aes.c  |  375 ++
 drivers/crypto/ccp/ccp-crypto-main.c |  432 ++
 drivers/crypto/ccp/ccp-crypto-sha.c  |  497 +++
 drivers/crypto/ccp/ccp-crypto.h  |  191 +++
 drivers/crypto/ccp/ccp-dev.c |  582 +
 drivers/crypto/ccp/ccp-dev.h |  272 
 drivers/crypto/ccp/ccp-ops.c | 2020 ++
 drivers/crypto/ccp/ccp-pci.c |  360 +
 include/crypto/scatterwalk.h |1 
 include/linux/ccp.h  |  525 
 18 files changed, 5952 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/ccp/Kconfig
 create mode 100644 drivers/crypto/ccp/Makefile
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-cmac.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes-xts.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-aes.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-main.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto-sha.c
 create mode 100644 drivers/crypto/ccp/ccp-crypto.h
 create mode 100644 drivers/crypto/ccp/ccp-dev.c
 create mode 100644 drivers/crypto/ccp/ccp-dev.h
 create mode 100644 drivers/crypto/ccp/ccp-ops.c
 create mode 100644 drivers/crypto/ccp/ccp-pci.c
 create mode 100644 include/linux/ccp.h

-- 
Tom Lendacky

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Results] [RFC PATCH v4 00/40] mm: Memory Power Management

2013-11-12 Thread Dave Hansen

On 11/12/2013 12:02 AM, Srivatsa S. Bhat wrote:
> I performed experiments on an IBM POWER 7 machine and got actual power-savings
> numbers (upto 2.6% of total system power) from this patchset. I presented them
> at the Kernel Summit but forgot to post them on LKML. So here they are:

"upto"?  What was it, actually?  Essentially what you've told us here is
that you have a patch that tries to do some memory power management and
that it accomplishes that.  But, to what degree?

Was your baseline against a kernel also booted with numa=fake=1, or was
it a kernel booted normally?

1. What is the theoretical power savings from memory?
2. How much of the theoretical numbers can your patch reach?
3. What is the performance impact?  Does it hurt ebizzy?

You also said before:
> On page 40, the paper shows the power-consumption breakdown for an IBM p670
> machine, which shows that as much as 40% of the system energy is consumed by
> the memory sub-system in a mid-range server.

2.6% seems pretty awful for such an invasive patch set if you were
expecting 40%.

Re: [Fwd: Re: [PATCH v2 2/2] x86: add prefetching to do_csum]

2013-11-12 Thread Joe Perches

On Tue, 2013-11-12 at 12:12 -0500, Neil Horman wrote:
> On Mon, Nov 11, 2013 at 05:42:22PM -0800, Joe Perches wrote:
> > Hi again Neil.
> > 
> > Forwarding on to netdev with a concern as to how often
> > do_csum is used via csum_partial for very short headers
> > and what impact any prefetch would have there.
> > 
> > Also, what changed in your test environment?
> > 
> > Why are the new values 5+% higher cycles/byte than the
> > previous values?
> > 
> > And here is the new table reformatted:
> > 
> > len     set     iterations  Readahead cachelines vs cycles/byte
> >                             1       2       3       4       6       10      20
> > 1500B   64MB    100         1.4342  1.4300  1.4350  1.4350  1.4396  1.4315  1.4555
> > 1500B   128MB   100         1.4312  1.4346  1.4271  1.4284  1.4376  1.4318  1.4431
> > 1500B   256MB   100         1.4309  1.4254  1.4316  1.4308  1.4418  1.4304  1.4367
> > 1500B   512MB   100         1.4534  1.4516  1.4523  1.4563  1.4554  1.4644  1.4590
> > 9000B   64MB    100         0.8921  0.8924  0.8932  0.8949  0.8952  0.8939  0.8985
> > 9000B   128MB   100         0.8841  0.8856  0.8845  0.8854  0.8861  0.8879  0.8861
> > 9000B   256MB   100         0.8806  0.8821  0.8813  0.8833  0.8814  0.8827  0.8895
> > 9000B   512MB   100         0.8838  0.8852  0.8841  0.8865  0.8846  0.8901  0.8865
> > 64KB    64MB    100         0.8132  0.8136  0.8132  0.8150  0.8147  0.8149  0.8147
> > 64KB    128MB   100         0.8013  0.8014  0.8013  0.8020  0.8041  0.8015  0.8033
> > 64KB    256MB   100         0.7956  0.7959  0.7956  0.7976  0.7981  0.7967  0.7973
> > 64KB    512MB   100         0.7934  0.7932  0.7937  0.7951  0.7954  0.7943  0.7948
> > 
> 
> 
> There we go, that's better:
> len     set     iterations  Readahead cachelines vs cycles/byte
>                             1       2       3       4       5       10      20
> 1500B   64MB    100         1.3638  1.3288  1.3464  1.3505  1.3586  1.3527  1.3408
> 1500B   128MB   100         1.3394  1.3357  1.3625  1.3456  1.3536  1.3400  1.3410
> 1500B   256MB   100         1.3773  1.3362  1.3419  1.3548  1.3543  1.3442  1.4163
> 1500B   512MB   100         1.3442  1.3390  1.3434  1.3505  1.3767  1.3513  1.3820
> 9000B   64MB    100         0.8505  0.8492  0.8521  0.8593  0.8566  0.8577  0.8547
> 9000B   128MB   100         0.8507  0.8507  0.8523  0.8627  0.8593  0.8670  0.8570
> 9000B   256MB   100         0.8516  0.8515  0.8568  0.8546  0.8549  0.8609  0.8596
> 9000B   512MB   100         0.8517  0.8526  0.8552  0.8675  0.8547  0.8526  0.8621
> 64KB    64MB    100         0.7679  0.7689  0.7688  0.7716  0.7714  0.7722  0.7716
> 64KB    128MB   100         0.7683  0.7687  0.7710  0.7690  0.7717  0.7694  0.7703
> 64KB    256MB   100         0.7680  0.7703  0.7688  0.7689  0.7726  0.7717  0.7713
> 64KB    512MB   100         0.7692  0.7690  0.7701  0.7705  0.7698  0.7693  0.7735
> 
> 
> So, the numbers are correct now that I returned my hardware to its previous
> interrupt affinity state, but the trend seems to be the same (namely that
> there isn't a clear one).  We seem to find peak performance around a readahead
> of 2 cachelines, but it's very small (about 3%), and it's inconsistent (larger
> set sizes fall to either side of that stride).  So I don't see it as a clear
> win.  I still think we should probably scrap the readahead for now, just take
> the perf bits, and revisit this when we can use the vector instructions or the
> independent carry chain instructions to improve this more consistently.
> 
> Thoughts?

Perhaps a single prefetch, not of the first addr but of
the addr after PREFETCH_STRIDE would work best but only
if length is > PREFETCH_STRIDE.

I'd try:

	if (len > PREFETCH_STRIDE)
		prefetch(buf + PREFETCH_STRIDE);
	while (count64) {
		etc...
	}

I still don't know how much that impacts very short lengths.

Can you please add a 20 byte length to your tests?
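
A minimal userspace sketch of the single-prefetch idea above, not the
kernel's do_csum(). The names csum_sketch, fold64_sketch and
SKETCH_PREFETCH_STRIDE are hypothetical, the 64-byte stride is an assumed
value (the kernel's PREFETCH_STRIDE is architecture-specific), and the
GCC/Clang builtin __builtin_prefetch stands in for the kernel's prefetch().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed stride for this sketch only. */
#define SKETCH_PREFETCH_STRIDE 64

/* Fold a 64-bit ones' complement sum down to 16 bits. */
static uint16_t fold64_sketch(uint64_t sum)
{
	sum = (sum & 0xffffffffu) + (sum >> 32);	/* at most 33 bits */
	sum = (sum & 0xffffffffu) + (sum >> 32);	/* at most 32 bits */
	sum = (sum & 0xffffu) + (sum >> 16);		/* at most 17 bits */
	sum = (sum & 0xffffu) + (sum >> 16);		/* at most 16 bits */
	return (uint16_t)sum;
}

/* Ones' complement sum of buf, taken 64 bits at a time. */
static uint16_t csum_sketch(const unsigned char *buf, size_t len)
{
	uint64_t sum = 0;
	size_t count64 = len / 8;
	size_t tail = len % 8;

	assert(buf != NULL);

	/* One prefetch, and only when the buffer extends past the stride. */
	if (len > SKETCH_PREFETCH_STRIDE)
		__builtin_prefetch(buf + SKETCH_PREFETCH_STRIDE);

	while (count64--) {
		uint64_t word;

		memcpy(&word, buf, sizeof(word));	/* alignment-safe load */
		sum += word;
		if (sum < word)				/* end-around carry */
			sum++;
		buf += 8;
	}

	if (tail) {
		uint64_t word = 0;			/* zero-pad the remainder */

		memcpy(&word, buf, tail);
		sum += word;
		if (sum < word)
			sum++;
	}

	return fold64_sketch(sum);
}
```

With this shape a 20-byte buffer never reaches the prefetch, so very short
lengths pay only for the extra length compare.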


Re: XFS leadership and a new co-maintainer candidate

2013-11-12 Thread Christoph Hellwig

On Fri, Nov 08, 2013 at 02:46:06PM -0600, Ben Myers wrote:
> That really didn't happen Christoph.  It's not in my tree or in a pull request.

I'll take my back room complaint back then, but I still think that this
is not a useful way to discuss something like this.

> Linus, let me know what you want to do.  I do think we're doing a fair job
> over here, and (geez) I'm just trying to add Mark as my backup since Alex is
> too busy.  I know the RH people want more control, and that's understandable,
> but they really don't need to replace me to get their code in.  Ouch.

I'd really like to see more diversity in XFS maintainers.  The SGI focus
has definitely been an issue again and again because it seems when one
SGI person is too busy the others usually are as well. As mentioned
before there's also been historically a way too high turnover, with the
associated transition pains.

By making sure we have a broader base for the maintainers, and a more
open infrastructure we'll all win.  Note that we already had that sort
of infrastructure on kernel.org, but gave up on it because many people
perceived the effort to re-gain the kernel.org accounts as too high.

I would also really like to get a clarification on "I know the RH people
want more control, and that's understandable, but they really don't need
to replace me to get their code in".  What specific people are you
worried about and what code?  What makes "the RH people" less worthy
to get their code in than "the SGI people"?

Re: [PATCH 02/14] sched: add extended scheduling interface.

2013-11-12 Thread Steven Rostedt

On Thu,  7 Nov 2013 14:43:36 +0100
Juri Lelli  wrote:

  
> +static int
> +do_sched_setscheduler2(pid_t pid, int policy,
> +  struct sched_param2 __user *param2)
> +{
> + struct sched_param2 lparam2;
> + struct task_struct *p;
> + int retval;
> +
> + if (!param2 || pid < 0)
> + return -EINVAL;
> +
> + memset(&lparam2, 0, sizeof(struct sched_param2));
> + if (copy_from_user(&lparam2, param2, sizeof(struct sched_param2)))
> + return -EFAULT;

Why the memset() before the copy_from_user()? We are copying
sizeof(sched_param2) anyway, and should overwrite anything that was on
the stack. I'm not aware of any possible leak from copying from
userspace. I could understand it if we were copying to userspace.

do_sched_setscheduler() doesn't do that either.

> +
> + rcu_read_lock();
> + retval = -ESRCH;
> + p = find_process_by_pid(pid);
> + if (p != NULL)
> + retval = sched_setscheduler2(p, policy, &lparam2);
> + rcu_read_unlock();
> +
> + return retval;
> +}
> +
>  /**
>   * sys_sched_setscheduler - set/change the scheduler policy and RT priority
>   * @pid: the pid in question.
> @@ -3514,6 +3553,21 @@ SYSCALL_DEFINE3(sched_setscheduler, pid_t, pid, int, policy,
>  }
>  
>  /**
> + * sys_sched_setscheduler2 - same as above, but with extended sched_param
> + * @pid: the pid in question.
> + * @policy: new policy (could use extended sched_param).
> + * @param: structure containg the extended parameters.
> + */
> +SYSCALL_DEFINE3(sched_setscheduler2, pid_t, pid, int, policy,
> + struct sched_param2 __user *, param2)
> +{
> + if (policy < 0)
> + return -EINVAL;
> +
> + return do_sched_setscheduler2(pid, policy, param2);
> +}
> +
> +/**
>   * sys_sched_setparam - set/change the RT priority of a thread
>   * @pid: the pid in question.
>   * @param: structure containing the new RT priority.
> @@ -3526,6 +3580,17 @@ SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
>  }
>  
>  /**
> + * sys_sched_setparam2 - same as above, but with extended sched_param
> + * @pid: the pid in question.
> + * @param2: structure containing the extended parameters.
> + */
> +SYSCALL_DEFINE2(sched_setparam2, pid_t, pid,
> + struct sched_param2 __user *, param2)
> +{
> + return do_sched_setscheduler2(pid, -1, param2);
> +}
> +
> +/**
>   * sys_sched_getscheduler - get the policy (scheduling class) of a thread
>   * @pid: the pid in question.
>   *
> @@ -3595,6 +3660,45 @@ out_unlock:
>   return retval;
>  }
>  
> +/**
> + * sys_sched_getparam2 - same as above, but with extended sched_param
> + * @pid: the pid in question.
> + * @param2: structure containing the extended parameters.
> + */
> +SYSCALL_DEFINE2(sched_getparam2, pid_t, pid,
> + struct sched_param2 __user *, param2)
> +{
> + struct sched_param2 lp;
> + struct task_struct *p;
> + int retval;
> +
> + if (!param2 || pid < 0)
> + return -EINVAL;
> +
> + rcu_read_lock();
> + p = find_process_by_pid(pid);
> + retval = -ESRCH;
> + if (!p)
> + goto out_unlock;
> +
> + retval = security_task_getscheduler(p);
> + if (retval)
> + goto out_unlock;
> +
> + lp.sched_priority = p->rt_priority;
> + rcu_read_unlock();
> +

OK, now we are missing the memset(). This does leak info, as lp never
was set to zero, it just contains anything on the stack, and the only
value you updated was sched_priority. We just copied to user memory
from the kernel stack.

-- Steve



> + retval = copy_to_user(param2, &lp,
> + sizeof(struct sched_param2)) ? -EFAULT : 0;
> +
> + return retval;
> +
> +out_unlock:
> + rcu_read_unlock();
> + return retval;
> +
> +}
> +
>  long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
>  {
>   cpumask_var_t cpus_allowed, new_mask;
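
The stack-leak concern raised above for sched_getparam2() can be sketched in
userspace as follows. This is an illustration under assumptions, not the
patch's code: struct sched_param2_sketch copies the field layout quoted in
the thread, copy_out_sketch() is a hypothetical stand-in for copy_to_user(),
and getparam2_sketch() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field layout as quoted in the thread; renamed to mark it as a sketch. */
struct sched_param2_sketch {
	int sched_priority;
	unsigned int sched_flags;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint64_t unused[12];
};

/* Stand-in for copy_to_user(): returns the number of bytes NOT copied. */
static unsigned long copy_out_sketch(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

static int getparam2_sketch(int rt_priority, struct sched_param2_sketch *out)
{
	struct sched_param2_sketch lp;

	assert(out != NULL);

	/*
	 * Zero the whole structure first.  Without this, every field other
	 * than sched_priority would carry whatever was on the stack.
	 */
	memset(&lp, 0, sizeof(lp));
	lp.sched_priority = rt_priority;

	return copy_out_sketch(out, &lp, sizeof(lp)) ? -EFAULT : 0;
}
```

Zeroing before filling in the single field means the full
sizeof(struct sched_param2_sketch) copy exposes only zeros and the priority.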


Re: [Xen-devel] [PATCH 2/2] swiotlb-xen: xen_swiotlb_map_page: do not error out if dma_capable fails

2013-11-12 Thread Stefano Stabellini

Russell gave a great explanation of the issue so I am just going to
limit myself to answering to:

On Tue, 12 Nov 2013, Konrad Rzeszutek Wilk wrote:
> > Considering that we know that the swiotlb buffer has a low address,
> > skip the check.
> 
> I am not following that sentence. Could you please explain to me
> how the SWIOTLB buffer low address guarantees that we don't need
> the check?

xen_swiotlb_fixup makes sure that the swiotlb buffer is lower than 4GB,
probably lower than 3GB, by passing dma_bits to
xen_create_contiguous_region.
This meets the requirements of most devices out there.
In fact we are not even running this check under the same conditions in
swiotlb_map_sg_attrs.
I admit that it is possible to come up with a scenario where the check
would be useful, but it is far easier to come up with scenarios where
it is not only unneeded but even harmful.

Alternatively (without Rob's "of: set dma_mask to point to
coherent_dma_mask") Linux 3.13 is going to fail to get the network
running on Midway. It is going to avoid fs mounting failures just
because we don't do the same check in swiotlb_map_sg_attrs.

FYI given that Rob's patch is probably going upstream soon anyway, I
don't feel so strongly about this.

[RELEASE] Userspace RCU 0.8.1

2013-11-12 Thread Mathieu Desnoyers

liburcu is a LGPLv2.1 userspace RCU (read-copy-update) library. This
data synchronization library provides read-side access which scales
linearly with the number of cores. It does so by allowing multiple
copies of a given data structure to live at the same time, and by
monitoring the data structure accesses to detect grace periods after
which memory reclamation is possible.

liburcu-cds provides efficient data structures based on RCU and
lock-free algorithms. Those structures include hash tables, queues,
stacks, and doubly-linked lists.

Changelog:
2013-11-12 Userspace RCU 0.8.1
* tls-compat: fix comment typo
* Keep ABI compatible with already compiled LGPL applications
* Fix: tls-compat multi-lib conflict
* Use cross compiler for doc examples
* gcc warning fixes: -Wsign-compare and -Wextra
* Fix: urcu-qsbr: reversed logic on RCU_DEBUG
* Fix: urcu-bp segfault in glibc pthread_kill()
* Fix urcu-bp: don't move registry
* Fix: compat futex duplicated lock and completion
* Fix: i386 compat code duplicated mutex instances
* Fix: urcu-bp: Bulletproof RCU arena resize bug
* Fix: test_mutex.c uninitialized mutex

Project website: http://urcu.so
Download link: http://urcu.so/files/
Git repository: git://git.urcu.so/urcu.git

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

[RELEASE] Userspace RCU 0.7.9

2013-11-12 Thread Mathieu Desnoyers

liburcu is a LGPLv2.1 userspace RCU (read-copy-update) library. This
data synchronization library provides read-side access which scales
linearly with the number of cores. It does so by allowing multiple
copies of a given data structure to live at the same time, and by
monitoring the data structure accesses to detect grace periods after
which memory reclamation is possible.

liburcu-cds provides efficient data structures based on RCU and
lock-free algorithms. Those structures include hash tables, queues,
stacks, and doubly-linked lists.

Changelog:
2013-11-12 Userspace RCU 0.7.9
* tls-compat: fix comment typo
* Keep ABI compatible with already compiled LGPL applications
* Fix: tls-compat multi-lib conflict
* gcc warning fixes: -Wsign-compare and -Wextra
* Fix: urcu-qsbr: reversed logic on RCU_DEBUG
* Fix: urcu-bp segfault in glibc pthread_kill()
* Fix urcu-bp: don't move registry
* Fix: compat futex duplicated lock and completion
* Fix: i386 compat code duplicated mutex instances
* Fix: urcu-bp: Bulletproof RCU arena resize bug
* Fix: test_mutex.c uninitialized mutex

Project website: http://urcu.so
Download link: http://urcu.so/files/
Git repository: git://git.urcu.so/urcu.git

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Re: [PATCH 02/14] sched: add extended scheduling interface.

2013-11-12 Thread Steven Rostedt

On Thu,  7 Nov 2013 14:43:36 +0100
Juri Lelli  wrote:

> + * This is reflected by the actual fields of the sched_param2 structure:
> + *
> + *  @sched_priority task's priority (might still be useful)
> + *  @sched_deadline representative of the task's deadline
> + *  @sched_runtime  representative of the task's runtime
> + *  @sched_period   representative of the task's period
> + *  @sched_flagsfor customizing the scheduler behaviour
> + *
> + * Given this task model, there are a multiplicity of scheduling algorithms
> + * and policies, that can be used to ensure all the tasks will make their
> + * timing constraints.
> + *
> + * @__unused padding to allow future expansion without ABI issues
> + */
> +struct sched_param2 {
> + int sched_priority;
> + unsigned int sched_flags;

I'm just thinking, if we are creating a new structure, and this
structure already contains u64 elements, why not make sched_flags u64
too? We are now just limiting the total number of possible flags to 32.
I'm not sure how many flags will be needed in the future, maybe 32 is
good enough, but just something to think about.

Of course you can argue that the int sched_flags matches the int
sched_priority leaving out any holes in the structure, which is a
legitimate argument.

> + u64 sched_runtime;
> + u64 sched_deadline;
> + u64 sched_period;
> +
> + u64 __unused[12];

And in the future, we could use one of these __unused[12] as a
sched_flags2;

I'm not saying we should make it u64, just wanted to make sure we are
fine with it as 32 for now.

-- Steve

> +};
> +
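
The "no holes" argument above can be checked with a small sketch. It assumes
a 4-byte int, as on common Linux ABIs; param2_flags32 mirrors the quoted
layout, param2_flags64 is the hypothetical u64-flags variant, and both struct
names and helper functions are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted: two 32-bit fields followed by 64-bit fields. */
struct param2_flags32 {
	int sched_priority;
	unsigned int sched_flags;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint64_t unused[12];
};

/* Hypothetical variant with 64-bit flags. */
struct param2_flags64 {
	int sched_priority;
	uint64_t sched_flags;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint64_t unused[12];
};

/* Compiler-inserted padding = sizeof(struct) - sum of member sizes. */
static size_t padding_flags32(void)
{
	return sizeof(struct param2_flags32) -
	       (sizeof(int) + sizeof(unsigned int) + 15 * sizeof(uint64_t));
}

static size_t padding_flags64(void)
{
	return sizeof(struct param2_flags64) -
	       (sizeof(int) + 16 * sizeof(uint64_t));
}
```

On ABIs that align u64 to 8 bytes, padding_flags64() reports a 4-byte hole
after sched_priority, while padding_flags32() reports none.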

[PATCH] staging: imx-drm: Fix modular build of DRM_IMX_IPUV3

2013-11-12 Thread Josh Boyer

commit b8d181e408af (staging: drm/imx: add drm plane support) added a file
to the make target for DRM_IMX_IPUV3 but didn't adjust the objs required
to actually build that as a module.  Kbuild got confused and this led to
link errors like:

ERROR: "ipu_plane_disable" [drivers/staging/imx-drm/ipuv3-crtc.ko] undefined!
ERROR: "ipu_plane_enable" [drivers/staging/imx-drm/ipuv3-crtc.ko] undefined!

Additionally, it added a call to imx_drm_crtc_id which also fails with a
link error as above.  To fix this, we adjust the make target with the proper
objs, which will change the name of the resulting .ko.  We also add an
EXPORT_SYMBOL_GPL for imx_drm_crtc_id.

Signed-off-by: Josh Boyer 
---
 drivers/staging/imx-drm/Makefile   | 4 +++-
 drivers/staging/imx-drm/imx-drm-core.c | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/imx-drm/Makefile b/drivers/staging/imx-drm/Makefile
index 2c3a9e1..8742432 100644
--- a/drivers/staging/imx-drm/Makefile
+++ b/drivers/staging/imx-drm/Makefile
@@ -8,4 +8,6 @@ obj-$(CONFIG_DRM_IMX_TVE) += imx-tve.o
 obj-$(CONFIG_DRM_IMX_LDB) += imx-ldb.o
 obj-$(CONFIG_DRM_IMX_FB_HELPER) += imx-fbdev.o
 obj-$(CONFIG_DRM_IMX_IPUV3_CORE) += ipu-v3/
-obj-$(CONFIG_DRM_IMX_IPUV3)	+= ipuv3-crtc.o ipuv3-plane.o
+
+imx-ipuv3-crtc-objs  := ipuv3-crtc.o ipuv3-plane.o
+obj-$(CONFIG_DRM_IMX_IPUV3)	+= imx-ipuv3-crtc.o
diff --git a/drivers/staging/imx-drm/imx-drm-core.c b/drivers/staging/imx-drm/imx-drm-core.c
index 4483d47..2b366d8 100644
--- a/drivers/staging/imx-drm/imx-drm-core.c
+++ b/drivers/staging/imx-drm/imx-drm-core.c
@@ -72,6 +72,7 @@ int imx_drm_crtc_id(struct imx_drm_crtc *crtc)
 {
return crtc->pipe;
 }
+EXPORT_SYMBOL_GPL(imx_drm_crtc_id);
 
 static void imx_drm_driver_lastclose(struct drm_device *drm)
 {
-- 
1.8.3.1


Re: [PATCH 01/14] sched: add sched_class->task_dead.

2013-11-12 Thread Steven Rostedt

On Thu,  7 Nov 2013 14:43:35 +0100
Juri Lelli  wrote:

> From: Dario Faggioli 
> 
> Add a new function to the scheduling class interface. It is called
> at the end of a context switch, if the prev task is in TASK_DEAD state.
> 
> It might be useful for the scheduling classes that want to be notified
> when one of their task dies, e.g. to perform some cleanup actions.

Nit.  s/task/tasks/

-- Steve

> 
> Signed-off-by: Dario Faggioli 
> Signed-off-by: Juri Lelli 
> ---
>  kernel/sched/core.c  |3 +++
>  kernel/sched/sched.h |1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 5ac63c9..850a02c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1890,6 +1890,9 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)
>   if (mm)
>   mmdrop(mm);
>   if (unlikely(prev_state == TASK_DEAD)) {
> + if (prev->sched_class->task_dead)
> + prev->sched_class->task_dead(prev);
> +
>   /*
>* Remove function-return probe instances associated with this
>* task and put them back on the free list.
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index b3c5653..64eda5c 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -992,6 +992,7 @@ struct sched_class {
>   void (*set_curr_task) (struct rq *rq);
>   void (*task_tick) (struct rq *rq, struct task_struct *p, int queued);
>   void (*task_fork) (struct task_struct *p);
> + void (*task_dead) (struct task_struct *p);
>  
>   void (*switched_from) (struct rq *this_rq, struct task_struct *task);
>   void (*switched_to) (struct rq *this_rq, struct task_struct *task);


Re: [PATCH 07/11] fuse: restructure fuse_readpage()

2013-11-12 Thread Miklos Szeredi

On Thu, Oct 10, 2013 at 05:11:25PM +0400, Maxim Patlasov wrote:
> Move the code filling and sending read request to a separate function. Future
> patches will use it for .write_begin -- partial modification of a page
> requires reading the page from the storage very similarly to what
> fuse_readpage does.
> 
> Signed-off-by: Maxim Patlasov 
> ---
>  fs/fuse/file.c |   55 +--
>  1 file changed, 37 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index b4d4189..77eb849 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -700,21 +700,14 @@ static void fuse_short_read(struct fuse_req *req, struct inode *inode,
>   }
>  }
>  
> -static int fuse_readpage(struct file *file, struct page *page)
> +static int __fuse_readpage(struct file *file, struct page *page, size_t count,
> +int *err, struct fuse_req **req_pp, u64 *attr_ver_p)

Signature of this helper looks really ugly.  A quick look tells me that neither
caller actually needs 'req'.  And fuse_get_attr_version() can be moved to the
one caller that needs it.  And negative err can be returned.  And then all those
ugly pointer args are gone and the whole thing is much simpler.

Thanks,
Miklos



>  {
>   struct fuse_io_priv io = { .async = 0, .file = file };
>   struct inode *inode = page->mapping->host;
>   struct fuse_conn *fc = get_fuse_conn(inode);
>   struct fuse_req *req;
>   size_t num_read;
> - loff_t pos = page_offset(page);
> - size_t count = PAGE_CACHE_SIZE;
> - u64 attr_ver;
> - int err;
> -
> - err = -EIO;
> - if (is_bad_inode(inode))
> - goto out;
>  
>   /*
>* Page writeback can extend beyond the lifetime of the
> @@ -724,20 +717,45 @@ static int fuse_readpage(struct file *file, struct page *page)
>   fuse_wait_on_page_writeback(inode, page->index);
>  
>   req = fuse_get_req(fc, 1);
> - err = PTR_ERR(req);
> + *err = PTR_ERR(req);
>   if (IS_ERR(req))
> - goto out;
> + return 0;
>  
> - attr_ver = fuse_get_attr_version(fc);
> + if (attr_ver_p)
> + *attr_ver_p = fuse_get_attr_version(fc);
>  
>   req->out.page_zeroing = 1;
>   req->out.argpages = 1;
>   req->num_pages = 1;
>   req->pages[0] = page;
>   req->page_descs[0].length = count;
> - num_read = fuse_send_read(req, &io, pos, count, NULL);
> - err = req->out.h.error;
>  
> + num_read = fuse_send_read(req, &io, page_offset(page), count, NULL);
> + *err = req->out.h.error;
> +
> + if (*err)
> + fuse_put_request(fc, req);
> + else
> + *req_pp = req;
> +
> + return num_read;
> +}
> +
> +static int fuse_readpage(struct file *file, struct page *page)
> +{
> + struct inode *inode = page->mapping->host;
> + struct fuse_conn *fc = get_fuse_conn(inode);
> + struct fuse_req *req = NULL;
> + size_t num_read;
> + size_t count = PAGE_CACHE_SIZE;
> + u64 attr_ver = 0;
> + int err;
> +
> + err = -EIO;
> + if (is_bad_inode(inode))
> + goto out;
> +
> + num_read = __fuse_readpage(file, page, count, &err, &req, &attr_ver);
>   if (!err) {
>   /*
>* Short read means EOF.  If file size is larger, truncate it
> @@ -747,10 +765,11 @@ static int fuse_readpage(struct file *file, struct page *page)
>  
>   SetPageUptodate(page);
>   }
> -
> - fuse_put_request(fc, req);
> - fuse_invalidate_attr(inode); /* atime changed */
> - out:
> + if (req) {
> + fuse_put_request(fc, req);
> + fuse_invalidate_attr(inode); /* atime changed */
> + }
> +out:
>   unlock_page(page);
>   return err;
>  }
> 

Re: [PATCH v5 4/4] MCS Lock: Barrier corrections

2013-11-12 Thread George Spelvin

> On Mon, Nov 11, 2013 at 09:17:52PM +, Tim Chen wrote:
>> An alternate implementation is
>> 	while (!ACCESS_ONCE(node->locked))
>> 		arch_mutex_cpu_relax();
>> 	smp_load_acquire(&node->locked);
>> 
>> Leaving the smp_load_acquire at the end to provide appropriate barrier.
>> Will that be acceptable?

Will Deacon  wrote:
> It still doesn't solve my problem though: I want a way to avoid that busy
> loop by some architecture-specific manner. The arch_mutex_cpu_relax() hook
> is a start, but there is no corresponding hook on the unlock side to issue a
> wakeup. Given a sensible relax implementation, I don't have an issue with
> putting a load-acquire in a loop, since it shouldn't be aggressively spinning
> anymore.

So you want something like this?

/*
 * This is a spin-wait with acquire semantics.  That is, accesses after
 * this are not allowed to be reordered before the load that meets
 * the specified condition.  This requires that it end with either a
 * load-acquire or a full smp_mb().  The optimal way to do this is likely
 * to be architecture-dependent.  E.g. x86 MONITOR/MWAIT instructions.
 */
#ifndef smp_load_acquire_until
#define smp_load_acquire_until(addr, cond) \
	while (!(smp_load_acquire(addr) cond)) { \
		do { \
			arch_mutex_cpu_relax(); \
		} while (!(ACCESS_ONCE(*(addr)) cond)); \
	}
#endif

smp_load_acquire_until(&node->locked, != 0);

Alternative implementations:

#define smp_load_acquire_until(addr, cond) { \
	while (!(ACCESS_ONCE(*(addr)) cond)) \
		arch_mutex_cpu_relax(); \
	smp_mb(); }

#define smp_load_acquire_until(addr, cond) \
	if (!(smp_load_acquire(addr) cond)) { \
		do { \
			arch_mutex_cpu_relax(); \
		} while (!(ACCESS_ONCE(*(addr)) cond)); \
		smp_mb(); \
	}

Re: [PATCH v2] sched: Check sched_domain before computing group power.

2013-11-12 Thread Srikar Dronamraju

> 
> Hurm.. can you provide the actual topology of the machine that triggers
> this? My brain hurts trying to think through the weird cases of this
> code.
> 

Hope this helps. Please do let me know if you were looking for pdf output.

Machine (251GB)
  NUMANode P#0 (63GB)
    Socket P#0
      L3 (20MB)
        8 x L2 (256KB), 8 x L1d (32KB), 8 x L1i (32KB)
        Core P#0 - Core P#7
          PU P#0  - PU P#7
          PU P#32 - PU P#39
  NUMANode P#1 (63GB)
    Socket P#1
      L3 (20MB)
        8 x L2 (256KB), 8 x L1d (32KB), 8 x L1i (32KB)
        Core P#0 - Core P#7
          PU P#8  - PU P#15
          PU P#40 - PU P#47
  NUMANode P#2 (63GB)
    Socket P#2
      L3 (20MB)
        8 x L2 (256KB), 8 x L1d (32KB), 8 x L1i (32KB)
        Core P#0 - Core P#7
          PU P#16 - PU P#23
          PU P#48 - PU P#55
  NUMANode P#3 (62GB)
    Socket P#3
      L3 (20MB)
        8 x L2 (256KB), 8 x L1d (32KB), 8 x L1i (32KB)
        Core P#0 - Core P#7
          PU P#24 - PU P#31
          PU P#56 - PU P#63
  PCI 1000:005b
    sda sdb sdc sdd sde sdf sdg sdh
  PCI 19a2:0710
    eth0
  PCI 19a2:0710
    eth1
  PCI 19a2:0710
    eth2
  PCI 19a2:0710
    eth3
  PCI 102b:0534
  PCI 8086:1d02
  PCI 1000:0073

Host: kong.in.ibm.com
Indexes: physical
Date: Tuesday 12 November 2013 10:38:18 PM IST

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Fwd: Re: [PATCH v2 2/2] x86: add prefetching to do_csum]

2013-11-12 Thread Neil Horman

On Mon, Nov 11, 2013 at 05:42:22PM -0800, Joe Perches wrote:
> Hi again Neil.
> 
> Forwarding on to netdev with a concern as to how often
> do_csum is used via csum_partial for very short headers
> and what impact any prefetch would have there.
> 
> Also, what changed in your test environment?
> 
> Why are the new values 5+% higher cycles/byte than the
> previous values?
> 
> And here is the new table reformatted:
> 
> len    set    iterations  Readahead cachelines vs cycles/byte
>                           1       2       3       4       6       10      20
> 1500B  64MB   100         1.4342  1.4300  1.4350  1.4350  1.4396  1.4315  1.4555
> 1500B  128MB  100         1.4312  1.4346  1.4271  1.4284  1.4376  1.4318  1.4431
> 1500B  256MB  100         1.4309  1.4254  1.4316  1.4308  1.4418  1.4304  1.4367
> 1500B  512MB  100         1.4534  1.4516  1.4523  1.4563  1.4554  1.4644  1.4590
> 9000B  64MB   100         0.8921  0.8924  0.8932  0.8949  0.8952  0.8939  0.8985
> 9000B  128MB  100         0.8841  0.8856  0.8845  0.8854  0.8861  0.8879  0.8861
> 9000B  256MB  100         0.8806  0.8821  0.8813  0.8833  0.8814  0.8827  0.8895
> 9000B  512MB  100         0.8838  0.8852  0.8841  0.8865  0.8846  0.8901  0.8865
> 64KB   64MB   100         0.8132  0.8136  0.8132  0.8150  0.8147  0.8149  0.8147
> 64KB   128MB  100         0.8013  0.8014  0.8013  0.8020  0.8041  0.8015  0.8033
> 64KB   256MB  100         0.7956  0.7959  0.7956  0.7976  0.7981  0.7967  0.7973
> 64KB   512MB  100         0.7934  0.7932  0.7937  0.7951  0.7954  0.7943  0.7948
> 


There we go, that's better:
len    set    iterations  Readahead cachelines vs cycles/byte
                          1       2       3       4       5       10      20
1500B  64MB   100         1.3638  1.3288  1.3464  1.3505  1.3586  1.3527  1.3408
1500B  128MB  100         1.3394  1.3357  1.3625  1.3456  1.3536  1.3400  1.3410
1500B  256MB  100         1.3773  1.3362  1.3419  1.3548  1.3543  1.3442  1.4163
1500B  512MB  100         1.3442  1.3390  1.3434  1.3505  1.3767  1.3513  1.3820
9000B  64MB   100         0.8505  0.8492  0.8521  0.8593  0.8566  0.8577  0.8547
9000B  128MB  100         0.8507  0.8507  0.8523  0.8627  0.8593  0.8670  0.8570
9000B  256MB  100         0.8516  0.8515  0.8568  0.8546  0.8549  0.8609  0.8596
9000B  512MB  100         0.8517  0.8526  0.8552  0.8675  0.8547  0.8526  0.8621
64KB   64MB   100         0.7679  0.7689  0.7688  0.7716  0.7714  0.7722  0.7716
64KB   128MB  100         0.7683  0.7687  0.7710  0.7690  0.7717  0.7694  0.7703
64KB   256MB  100         0.7680  0.7703  0.7688  0.7689  0.7726  0.7717  0.7713
64KB   512MB  100         0.7692  0.7690  0.7701  0.7705  0.7698  0.7693  0.7735


So, the numbers are correct now that I returned my hardware to its previous
interrupt affinity state, but the trend seems to be the same (namely that there
isn't a clear one).  We seem to find peak performance around a readahead of 2
cachelines, but it's very small (about 3%), and it's inconsistent (larger set
sizes fall to either side of that stride).  So I don't see it as a clear win.  I
still think we should probably scrap the readahead for now, just take the perf
bits, and revisit this when we can use the vector instructions or the
independent carry chain instructions to improve this more consistently.
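
For readers following the thread, here is a minimal user-space sketch of what
"readahead of N cachelines" means in a checksum loop. It is not the kernel's
do_csum and not the patch under discussion: the function name csum_prefetch,
the 64-byte CACHELINE_BYTES value and the byte-at-a-time summation are all
assumptions made for illustration, and __builtin_prefetch is a GCC/Clang
builtin.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64 /* assumed cache line size */

/*
 * Folded 16-bit one's-complement sum of buf (big-endian word order, an odd
 * trailing byte is padded with zero).  While summing, request the cache
 * line `readahead` lines ahead of the current offset.  The result is the
 * folded sum, not its complement.
 */
uint16_t csum_prefetch(const unsigned char *buf, size_t len,
                       unsigned int readahead)
{
    const size_t ahead = (size_t)readahead * CACHELINE_BYTES;
    uint64_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        /* Prefetch once per 64-byte stride, and only inside the buffer. */
        if (ahead != 0 && i % CACHELINE_BYTES == 0 && ahead < len - i)
            __builtin_prefetch(buf + i + ahead);
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    }
    if (i < len) /* odd trailing byte */
        sum += (uint32_t)buf[i] << 8;

    while (sum >> 16) /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

The prefetch only changes timing, never the result, so any readahead value
must return the same sum; that makes it easy to benchmark different strides
against each other as in the tables above.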

Thoughts?
Neil




Re: [PATCH v2] staging: zsmalloc: Ensure handle is never 0 on success

2013-11-12 Thread Olav Haugan

Hi Greg,

On 11/11/2013 4:19 PM, Greg KH wrote:
> On Thu, Nov 07, 2013 at 05:58:03PM -0800, Olav Haugan wrote:
>> zsmalloc encodes a handle using the pfn and an object
>> index. On hardware platforms with physical memory starting
>> at 0x0 the pfn can be 0. This causes the encoded handle to be
>> 0 and is incorrectly interpreted as an allocation failure.
> 
> Please list the known hardware platforms that have this issue, so that
> people have a chance to know if this patch is relevant for them or not.
> 
> For example, should I include this in the stable releases because it
> affects systems that are shipping?  Or is it just in "future" chips and
> it doesn't need to go there or not?
> 
> Please make it easy for me to do this type of determination, I already
> asked you this question before, why didn't you include the information
> here as well (hint, that is why I asked you...)

I don't think it would be best to mention specific hardware
platforms in the commit text. If I saw this patch listing specific
hardware platforms, I would make the wrong decision (I would look at
the list and decide that I am not running on those platforms, so I don't
need this patch). The problem could happen on any hardware platform; it
just depends on how the memory map of the platform is configured. Hence,
I re-worded the commit text to make it clear that this will happen when
you have memory starting at 0x0.

If I list out specific hardware platforms, it would be only a sample (I
do not know all hardware platforms and their memory maps). However,
having said that, there are products already shipping with physical
addresses starting at 0.
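
To make the failure mode concrete, here is a small stand-alone sketch of a
(pfn, object index) handle encoding. It is an illustration only, not the
zsmalloc code and not necessarily what the patch does: OBJ_INDEX_BITS,
encode_handle and decode_handle are hypothetical names, and storing
obj_idx + 1 is just one possible way to keep a successful handle from
being 0.

```c
#define OBJ_INDEX_BITS 12 /* hypothetical width of the index field */
#define OBJ_INDEX_MASK ((1UL << OBJ_INDEX_BITS) - 1)

/*
 * Pack pfn and obj_idx into one word.  The index is stored as obj_idx + 1,
 * so pfn == 0 with obj_idx == 0 no longer encodes to 0.
 * Requires obj_idx < OBJ_INDEX_MASK.
 */
unsigned long encode_handle(unsigned long pfn, unsigned long obj_idx)
{
    return (pfn << OBJ_INDEX_BITS) | ((obj_idx + 1) & OBJ_INDEX_MASK);
}

/* Reverse of encode_handle(): undo the +1 bias on the index. */
void decode_handle(unsigned long handle, unsigned long *pfn,
                   unsigned long *obj_idx)
{
    *pfn = handle >> OBJ_INDEX_BITS;
    *obj_idx = (handle & OBJ_INDEX_MASK) - 1;
}
```

Without the +1 bias, encode_handle(0, 0) would be 0, which a caller that
treats 0 as "allocation failed" cannot tell apart from an error.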

Thanks,

Olav Haugan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

Re: [PATCH 4/5 v2] input: tc3589x-keypad: support probing from device tree

2013-11-12 Thread Linus Walleij

On Tue, Nov 12, 2013 at 4:30 PM, Sebastian Reichel  wrote:
> On Tue, Nov 12, 2013 at 03:13:38PM +0100, Linus Walleij wrote:

>> + plat->no_autorepeat = of_property_read_bool(np, "linux,no-autorepeat");
>> + plat->enable_wakeup = of_property_read_bool(np, "linux,wakeup");
>
> There is currently discussion going on for the property name of
> autorepeat:
>
> https://lkml.org/lkml/2013/11/11/680

So this binding is documented for GPIO keys in:
Documentation/devicetree/bindings/input/gpio-matrix-keypad.txt

This is probably the most used binding as GPIO matrixes are
not uncommon. But I don't know, yes it is a mess. (As usual.)

Yours,
Linus Walleij

Re: [PATCH v6 07/11] VFS hot tracking: Add a /proc interface to control memory usage

2013-11-12 Thread Dave Hansen

On 11/11/2013 02:45 PM, Zhi Yong Wu wrote:
> On Tue, Nov 12, 2013 at 6:15 AM, Dave Hansen  wrote:
>> In general, why do you have to control the number of these statically?
> It gives the user or admin one optional chance to control the amount
> of memory consumed by VFS hot tracking. And you can choose not to use
> it.

The on/off knob seems to me to be something better left to a mount
option, not a global tunable.

>> Shouldn't you just define a shrinker and let memory pressure determine
>> how many of these we allow to exist?
> How about if the user and admin hope to control the amount of the
> memory consumed by VFS hot tracking? e.g. If the host has several
> hundred of G or T memory, but the user or admin hope that the memory
> size consumed by VFS hot tracking is under several G, In the case,
> maybe a shrinker of VFS hot tracking will never be invoked by system
> memory module, so this interface will make sense.

If the shrinker is not invoked, that means that there is lots of memory
free.  In the case that there is lots of memory free, are you arguing
that a user would rather see memory go *unused* than be put to use for
this hot tracking data?

If this were true, why don't we have similar knobs for the dentry, inode
and page caches?

Re: [Xen-devel] [PATCH 1/2] swiotlb-xen: add missing xen_dma_map_page call

2013-11-12 Thread Stefano Stabellini

On Tue, 12 Nov 2013, Konrad Rzeszutek Wilk wrote:
> On Tue, Nov 12, 2013 at 02:11:59PM +, Stefano Stabellini wrote:
> > swiotlb-xen is missing a xen_dma_map_page call in
> > xen_swiotlb_map_sg_attrs, in the slow path.
> 
> s/slow/bounce buffer/ I believe?

right


> > Signed-off-by: Stefano Stabellini 
> > ---
> >  drivers/xen/swiotlb-xen.c |5 +
> >  1 files changed, 5 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index a224bc7..1eac073 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -555,6 +555,11 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct 
> > scatterlist *sgl,
> > sg_dma_len(sgl) = 0;
> > return 0;
> > }
> > +   xen_dma_map_page(hwdev, pfn_to_page(map >> PAGE_SHIFT),
> > +   map & ~PAGE_MASK,
> > +   sg->length,
> > +   dir,
> > +   attrs);
> > sg->dma_address = xen_phys_to_bus(map);
> > } else {
> > /* we are not interested in the dma_addr returned by
> > -- 
> > 1.7.2.5
> > 
> > 
> > ___
> > Xen-devel mailing list
> > xen-de...@lists.xen.org
> > http://lists.xen.org/xen-devel
> 

Re: [PATCH v2] sched: Check sched_domain before computing group power.

2013-11-12 Thread Peter Zijlstra

On Tue, Nov 12, 2013 at 10:11:26PM +0530, Srikar Dronamraju wrote:
> After Commit-id 863bffc80898 (sched/fair: Fix group power_orig
> computation), we might end up computing group power before the
> sched_domain for a cpu is updated.
> 
> Update with cpu_power, if rq->sd is not yet updated.
> 
> Signed-off-by: Srikar Dronamraju 
> ---
> Changelog since v1: Fix divide by zero errors that can result because
> power/power_orig was set to 0.

Hurm.. can you provide the actual topology of the machine that triggers
this? My brain hurts trying to think through the weird cases of this
code.

[tip:x86/kaslr] x86, kaslr: Use char array to gain sizeof sanity

2013-11-12 Thread tip-bot for Kees Cook

Commit-ID:  327f7d72454aecdc7a4a1c847a291a3f224b730f
Gitweb: http://git.kernel.org/tip/327f7d72454aecdc7a4a1c847a291a3f224b730f
Author: Kees Cook 
AuthorDate: Tue, 12 Nov 2013 08:56:07 -0800
Committer:  H. Peter Anvin 
CommitDate: Tue, 12 Nov 2013 08:58:35 -0800

x86, kaslr: Use char array to gain sizeof sanity

The build_str needs to be char [] not char * for the sizeof() to report
the string length.

Reported-by: Mathias Krause 
Signed-off-by: Kees Cook 
Link: http://lkml.kernel.org/r/20131112165607.ga5...@www.outflux.net
Signed-off-by: H. Peter Anvin 
---
 arch/x86/boot/compressed/aslr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 38a07cc..84be175 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -13,7 +13,7 @@
 #include 
 
 /* Simplified build-specific string for starting entropy. */
-static const char *build_str = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
+static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
 #define I8254_PORT_CONTROL 0x43
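
The difference the commit relies on can be seen in a few lines of plain C.
This is a stand-alone sketch, not the aslr.c code; the string contents and
the helper names are invented for the example.

```c
#include <stddef.h>
#include <string.h>

static const char *str_ptr = "3.12.0-example";  /* pointer to a literal */
static const char str_arr[] = "3.12.0-example"; /* array holding the text */

/* sizeof on the pointer yields the pointer size, whatever the string is. */
size_t ptr_size(void)
{
    return sizeof(str_ptr);
}

/* sizeof on the array yields the number of bytes, including the NUL. */
size_t arr_size(void)
{
    return sizeof(str_arr);
}

/* strlen counts the characters before the terminating NUL. */
size_t arr_strlen(void)
{
    return strlen(str_arr);
}
```

Note that sizeof on the array is the string length plus one, because the
terminating NUL is part of the array.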

Re: [PATCH] ipvs: Remove unused variable ret from sync_thread_master()

2013-11-12 Thread Peter Zijlstra

On Tue, Nov 12, 2013 at 05:21:36PM +0100, Oleg Nesterov wrote:
> On 11/12, Peter Zijlstra wrote:
> >
> > On Tue, Nov 12, 2013 at 02:21:39PM -, David Laight wrote:
> > > Shame there isn't a process flag to indicate that the process
> > > will sleep uninterruptibly and that it doesn't matter.
> > > So don't count to the load average and don't emit a warning
> > > if it has been sleeping for a long time.
> >
> > A process flag wouldn't work, because the task could block waiting for
> > actual work to complete in other sleeps.
> >
> > However, we could do something like the below; which would allow us
> > writing things like:
> >
> > (void)___wait_event(*sk_sleep(sk),
> > sock_writeable(sk) || kthread_should_stop(),
> > TASK_UNINTERRUPTIBLE | TASK_IDLE, 0, 0,
> > schedule());
> >
> > Marking the one wait-for-more-work as TASK_IDLE such that it doesn't
> > contribute to the load avg.
> 
> Agreed, I thought about additional bit too.
> 
> >  static const char * const task_state_array[] = {
> > -   "R (running)",  /*   0 */
> > -   "S (sleeping)", /*   1 */
> > -   "D (disk sleep)",   /*   2 */
> > -   "T (stopped)",  /*   4 */
> > -   "t (tracing stop)", /*   8 */
> > -   "Z (zombie)",   /*  16 */
> > -   "X (dead)", /*  32 */
> > -   "x (dead)", /*  64 */
> > -   "K (wakekill)", /* 128 */
> > -   "W (waking)",   /* 256 */
> > -   "P (parked)",   /* 512 */
> > +   "R (running)",  /*0 */
> > +   "S (sleeping)", /*1 */
> > +   "D (disk sleep)",   /*2 */
> > +   "T (stopped)",  /*4 */
> > +   "t (tracing stop)", /*8 */
> > +   "Z (zombie)",   /*   16 */
> > +   "X (dead)", /*   32 */
> > +   "x (dead)", /*   64 */
> > +   "K (wakekill)", /*  128 */
> > +   "W (waking)",   /*  256 */
> > +   "P (parked)",   /*  512 */
> > +   "I (idle)", /* 1024 */
> >  };
> 
> but I am not sure about what /proc/ should report in this case...

We have to put in something...

BUILD_BUG_ON(1 + ilog2(TASK_STATE_MAX) != ARRAY_SIZE(task_state_array));

However, since we always set it together with TASK_UNINTERRUPTIBLE,
userspace shouldn't actually ever see the I thing.

> >  #define task_contributes_to_load(task) \
> > ((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
> > -(task->flags & PF_FROZEN) == 0)
> > +(task->flags & PF_FROZEN) == 0 && \
> > +(task->state & TASK_IDLE) == 0)
> 
> perhaps
> 
>   (task->state & (TASK_UNINTERRUPTIBLE | TASK_IDLE)) == 
> TASK_UNINTERRUPTIBLE
> 
> can save an insn.

Fair enough.
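
A quick user-space check that the two forms agree. This is a sketch that
leaves out the PF_FROZEN test, since that one looks at task->flags rather
than task->state; the bit values are taken from the state table quoted
above (2 for "D", 1024 for the proposed "I"), and the function names are
made up.

```c
#include <stdbool.h>

#define TASK_UNINTERRUPTIBLE 2    /* "D (disk sleep)" in the table above */
#define TASK_IDLE            1024 /* proposed "I (idle)" bit */

/* Original form: two separate tests on the state word. */
bool contributes_two_tests(unsigned long state)
{
    return (state & TASK_UNINTERRUPTIBLE) != 0 &&
           (state & TASK_IDLE) == 0;
}

/* Suggested form: mask both bits, then compare once. */
bool contributes_one_test(unsigned long state)
{
    return (state & (TASK_UNINTERRUPTIBLE | TASK_IDLE)) ==
           TASK_UNINTERRUPTIBLE;
}
```

Both functions are true exactly when the UNINTERRUPTIBLE bit is set and the
IDLE bit is clear, regardless of any other bits in the state word.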

> I am also wondering if it makes any sense to turn PF_FROZEN into
> TASK_FROZEN, something like (incomplete, probably racy) patch below.
> Note that it actually adds the new state, not the qualifier.
> 
> --- x/include/linux/freezer.h
> +++ x/include/linux/freezer.h
> @@ -23,7 +23,7 @@ extern unsigned int freeze_timeout_msecs
>   */
>  static inline bool frozen(struct task_struct *p)
>  {
> - return p->flags & PF_FROZEN;
> + return p->state & TASK_FROZEN;

Do we want == there? Does it make sense to allow it to be set with other
state flags?

>  }
>  
>  extern bool freezing_slow_path(struct task_struct *p);
> --- x/kernel/freezer.c
> +++ x/kernel/freezer.c
> @@ -57,16 +57,13 @@ bool __refrigerator(bool check_kthr_stop
>   pr_debug("%s entered refrigerator\n", current->comm);
>  
>   for (;;) {
> - set_current_state(TASK_UNINTERRUPTIBLE);
> -
>   spin_lock_irq(&freezer_lock);
> - current->flags |= PF_FROZEN;
> - if (!freezing(current) ||
> - (check_kthr_stop && kthread_should_stop()))
> - current->flags &= ~PF_FROZEN;
> + if (freezing(current) &&
> + !(check_kthr_stop && kthread_should_stop()))
> + set_current_state(TASK_FROZEN);
>   spin_unlock_irq(&freezer_lock);
>  
> - if (!(current->flags & PF_FROZEN))
> + if (!(current->state & TASK_FROZEN))
>   break;
>   was_frozen = true;
>   schedule();
> @@ -148,8 +145,7 @@ void __thaw_task(struct task_struct *p)
>* refrigerator.
>*/
>   spin_lock_irqsave(&freezer_lock, flags);
> - if (frozen(p))
> - wake_up_process(p);
> + try_to_wake_up(p, TASK_FROZEN, 0);
>   spin_unlock_irqrestore(&freezer_lock, flags);
>  }

Should work I suppose... I'm not entirely sure why that's a PF to begin
with.

Re: Crypto Update for 3.13

2013-11-12 Thread Borislav Petkov

On Wed, Nov 13, 2013 at 12:41:52AM +0800, Herbert Xu wrote:
> Hi Linus:
> 
> Here is the crypto update for 3.13:
> 
> * Made x86 ablk_helper generic for ARM.
> * Phase out chainiv in favour of eseqiv (affects IPsec).
> * Fixed aes-cbc IV corruption on s390.
> * Added constant-time crypto_memneq which replaces memcmp.
> 
> * Fixed aes-ctr in omap-aes.
> * Added OMAP3 ROM RNG support.
> * Add PRNG support for MSM SoC's
> * Add and use Job Ring API in caam.
> 
> * Misc fixes.

Maybe add this one to that:

http://marc.info/?l=linux-kernel&m=138078878205385&w=2

?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

[PATCH v7 2/4] perf stat: add event unit and scale support

2013-11-12 Thread Stephane Eranian

This patch adds perf stat support for handling event units and
scales as exported by the kernel.

The kernel can export a PMU event's actual unit and scaling factor
via sysfs:
$ ls -1 /sys/devices/power/events/energy-*
/sys/devices/power/events/energy-cores
/sys/devices/power/events/energy-cores.scale
/sys/devices/power/events/energy-cores.unit
/sys/devices/power/events/energy-pkg
/sys/devices/power/events/energy-pkg.scale
/sys/devices/power/events/energy-pkg.unit
$ cat /sys/devices/power/events/energy-cores.scale
2.3283064365386962890625e-10
$ cat /sys/devices/power/events/energy-cores.unit
Joules
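
As a rough illustration of how a tool could apply such a scale string to a
raw count, here is a stand-alone sketch. It is not the perf implementation:
parse_scale and scaled_count are hypothetical helpers, and falling back to
1.0 on a parse failure is an assumption of this example.

```c
#include <stdlib.h>

/*
 * Parse the text of a .scale file into a double.  If no number can be
 * parsed, fall back to 1.0 so the raw count passes through unchanged.
 */
double parse_scale(const char *text)
{
    char *end;
    double scale = strtod(text, &end);

    return (end == text) ? 1.0 : scale;
}

/* Apply the parsed scale to a raw counter value. */
double scaled_count(unsigned long long raw, const char *scale_text)
{
    return (double)raw * parse_scale(scale_text);
}
```

With the scale shown above, which is 2^-32, a raw count of 2^32 comes out
as 1 Joule.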

This patch modifies the pmu event alias code to check
for the presence of the .unit and .scale files to load
the corresponding values. They are then used by perf stat
transparently:

 # perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
 #          time     counts   unit events
     1.000214717       3.07 Joules power/energy-pkg/   [100.00%]
     1.000214717       0.53 Joules power/energy-cores/
     1.000214717   12965028        cycles              [100.00%]
     2.000749289       3.01 Joules power/energy-pkg/
     2.000749289       0.52 Joules power/energy-cores/
     2.000749289   15817043        cycles

When the event does not have an explicit unit exported by
the kernel, nothing is printed. In csv output mode, there
will be an empty field.

Special thanks to Jiri for providing the supporting code
in the parser to trigger reading of the scale and unit files.

Signed-off-by: Stephane Eranian 
Signed-off-by: Jiri Olsa 
---
 tools/perf/builtin-stat.c  |  114 -
 tools/perf/util/evsel.c|2 +
 tools/perf/util/evsel.h|3 +
 tools/perf/util/parse-events.c |   28 +---
 tools/perf/util/pmu.c  |  138 ++--
 tools/perf/util/pmu.h  |3 +-
 6 files changed, 244 insertions(+), 44 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 0fc1c94..66d20c9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -138,6 +138,7 @@ static const char   *post_cmd   
= NULL;
 static boolsync_run= false;
 static unsigned intinterval= 0;
 static unsigned intinitial_delay   = 0;
+static unsigned intunit_width  = 4; /* 
strlen("unit") */
 static boolforever = false;
 static struct timespec ref_time;
 static struct cpu_map  *aggr_map;
@@ -462,17 +463,17 @@ static void print_interval(void)
if (num_print_interval == 0 && !csv_output) {
switch (aggr_mode) {
case AGGR_SOCKET:
-   fprintf(output, "#   time socket cpus   
  counts events\n");
+   fprintf(output, "#   time socket cpus   
  counts %*s events\n", unit_width, "unit");
break;
case AGGR_CORE:
-   fprintf(output, "#   time core cpus 
counts events\n");
+   fprintf(output, "#   time core cpus 
counts %*s events\n", unit_width, "unit");
break;
case AGGR_NONE:
-   fprintf(output, "#   time CPU 
counts events\n");
+   fprintf(output, "#   time CPU
counts %*s events\n", unit_width, "unit");
break;
case AGGR_GLOBAL:
default:
-   fprintf(output, "#   time counts 
events\n");
+   fprintf(output, "#   time counts 
%*s events\n", unit_width, "unit");
}
}
 
@@ -517,6 +518,7 @@ static int __run_perf_stat(int argc, const char **argv)
unsigned long long t0, t1;
struct perf_evsel *counter;
struct timespec ts;
+   size_t l;
int status = 0;
const bool forks = (argc > 0);
 
@@ -566,6 +568,10 @@ static int __run_perf_stat(int argc, const char **argv)
return -1;
}
counter->supported = true;
+
+   l = strlen(counter->unit);
+   if (l > unit_width)
+   unit_width = l;
}
 
if (perf_evlist__apply_filters(evsel_list)) {
@@ -705,14 +711,25 @@ static void aggr_printout(struct perf_evsel *evsel, int 
id, int nr)
 static void nsec_printout(int cpu, int nr, struct perf_evsel *evsel, double 
avg)
 {
double msecs = avg / 1e6;
-   const char *fmt = csv_output ? "%.6f%s%s" : "%18.6f%s%-25s";
+   const char *fmt_v, *fmt_n;
char

Re: [PATCH RFC 2/6] arm64: Kprobes with single stepping support

2013-11-12 Thread Steven Rostedt

On Tue, 12 Nov 2013 16:25:26 +0530
Sandeepa Prabhu  wrote:


> >
> > BTW, I'm currently trying a general housecleaning of __kprobes
> > annotations. It may also have impact on your patch.
> > https://lkml.org/lkml/2013/11/8/187
> Hmm, we can help testing your patchset on arm64 platforms. Also have
> many doubts on the changes you are working [blacklisting probes etc]
> 
> Basically I had tried placing kprobe on memcpy() and the model hung
> (insmod never returned back!). Fast-model I have does not have option
> of any debug so no clue what happened!.
> memcpy() is low-level call being used internally within kprobes, so
> probably we cannot handle probe on that routine, but then how to make
> sure all such API are rejected by kprobe sub-system ?

Working on ports of ftrace, I found that many of the functions in lib/
are used by several locations that just can't be traced, due to how
low level they are. I just simply blacklisted the entire lib/
directory (See the top of lib/Makefile)

I wonder if there's an easy way to blacklist entire directories from
being used by kprobes too. Or at least do it on a file-by-file basis.

-- Steve

[PATCH v7 0/4] perf/x86: add Intel RAPL PMU support

2013-11-12 Thread Stephane Eranian

This patch adds a new uncore PMU to expose the Intel
RAPL (Running Average Power Limit) energy consumption counters.
Up to 3 counters, each counting a single RAPL event are exposed.

The RAPL counters are available on Intel SandyBridge, IvyBridge,
Haswell. The server processors add a 3rd counter to measure
DRAM power consumption.

The following events are available and exposed in sysfs:
- power/energy-cores: power consumption of all cores on socket
- power/energy-pkg  : power consumption of all cores + LLC cache
- power/energy-dram : power consumption of DRAM (servers only)

The RAPL PMU is uncore by nature and is implemented such
that it only works in system-wide mode. Measuring only
one CPU per socket is sufficient. 

The counters all count in the same unit. The perf_events API
exposes all RAPL counters as 64-bit integers counting in unit
of 1/2^32 Joules (about 0.23 nJ). User level tools must convert
the counts by multiplying them by the scaling factor exposed
in the corresponding event .scale file in sysfs to obtain a value
expressed in Joules. The reason for this approach is that the kernel
avoids doing floating point math whenever possible because it is
expensive (user floating-point state must be saved). The method
used avoids kernel floating-point and does not incur any precision
loss. Thanks to PeterZ for suggesting this approach.

To convert the raw count to Watts: W = C * 2.3 / (1e10 * time)
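
A minimal sketch of that conversion in user space, assuming the raw count
covers an interval of `seconds` seconds. The helper names are hypothetical.
Dividing by 2^32 is the exact form of the approximate 2.3 / 1e10 factor and
is equivalent to ldexp(raw, -32).

```c
#include <stdint.h>

/* Raw RAPL count (units of 2^-32 J) to Joules. */
double rapl_joules(uint64_t raw)
{
    return (double)raw / 4294967296.0; /* 2^32 */
}

/* Average power in Watts over an interval of `seconds` (must be > 0). */
double rapl_watts(uint64_t raw, double seconds)
{
    return rapl_joules(raw) / seconds;
}
```

For example, a raw count of 2^33 accumulated over 2 seconds is 2 Joules,
that is an average of 1 Watt.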

The kernel exposes both the scaling factor and the unit (Joules)
in sysfs:
$ ls -1 /sys/devices/power/events/energy-*
/sys/devices/power/events/energy-cores
/sys/devices/power/events/energy-cores.scale
/sys/devices/power/events/energy-cores.unit
/sys/devices/power/events/energy-pkg
/sys/devices/power/events/energy-pkg.scale
/sys/devices/power/events/energy-pkg.unit

$ cat /sys/devices/power/events/energy-cores.scale
2.3283064365386962890625e-10

$ cat /sys/devices/power/events/energy-cores.unit
Joules

RAPL PMU is a new standalone PMU which registers with the
perf_event core subsystem. The PMU type (attr->type) is
dynamically allocated and is available from /sys/devices/power/type.

Sampling is not supported by the RAPL PMU. There is no
privilege level filtering either.

The PMU exports a cpumask in /sys/devices/power/cpumask. It
is used by perf to ensure only one instance of each RAPL event
is measured per processor socket. Hotplug CPU is also supported.

The perf stat infrastructure is enhanced to show event
units. It also applies the scaling factor. As such, perf stat prints
RAPL events in Joules (and not increments of 0.23 nJ):

 # perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
 #          time     counts   unit events
     1.000282860       2.51 Joules power/energy-pkg/
     1.000282860       0.31 Joules power/energy-cores/
     1.000282860   37765378        cycles

The patch adds a hrtimer to poll the counters given that
they do not interrupt on overflow. Hardware counters are 32 bits
wide.
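
The reason polling is enough can be shown with a small stand-alone sketch
that extends a wrapping 32-bit counter to 64 bits. This is not the code
from the patch: struct counter_state and counter_update are invented for
illustration, and the scheme assumes the counter is read at least once per
wrap, which is what the periodic timer is there to guarantee.

```c
#include <stdint.h>

struct counter_state {
    uint32_t prev;  /* last raw 32-bit reading */
    uint64_t total; /* accumulated 64-bit count */
};

/*
 * Fold a new raw reading into the 64-bit total.  The subtraction is done
 * in 32-bit unsigned arithmetic, so it yields the right delta even when
 * the hardware counter wrapped once since the previous reading.
 */
uint64_t counter_update(struct counter_state *s, uint32_t now)
{
    uint32_t delta = now - s->prev; /* wraps modulo 2^32 */

    s->prev = now;
    s->total += delta;
    return s->total;
}
```

If two wraps happen between readings, the delta is short by 2^32 and the
error cannot be detected, hence the need for a bounded polling interval.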

In v2, we add the locking necessary to protect the rapl_pmu
struct. We also add a description at the top of the file.
We check for Intel processors only. We improved the data
layout of the rapl_pmu struct. We also lifted the restriction
on the number of instances of RAPL counters that can be active
at the same time. RAPL counters are free running, so we ought to be
able to measure events as many times as necessary in parallel
via multiple tools. There is never multiplexing among RAPL events.

In v3, we have renamed the event to be more generic power/* instead
of rapl/*. We have modified perf stat to print the event with the
unit and scaling factors.

In v4, we integrate the feedback from Jiri and rebase to 3.12-rc7+
from tip.git.

In v5, we export the full scaling factor to increase precision.
In the perf tool, we changed the way the .unit and .scale sysfs
entries are parsed. Thanks to Jiri for his contribution on this.
We also fix a couple of printf() issues with perf stat and units.
Now, we print no unit symbol when the event has no unit (was ? before).
Patch is relative to 3.12 from tip.git.

In v6, we fixed a few issues in the perf tool having to do with printing
of the unit. Works for uncore events now. There is a major restructuring
of the code in the kernel because the hrtimer needs to be per-cpu and
not shared per socket because there can be multiple sessions in parallel
and also because of hotplug CPU. The hotplug cpu code was also updated.

In v7, we rebased to 3.12+ and dropped the hotplug cpu lock because it
was not needed. Hotplug CPU is still serialized.

Thanks to all contributors to this patch series: PeterZ, Jiri, Maria,
Arnaldo, Andi, Ingo.

Supported CPUs: SandyBridge, IvyBridge, Haswell.

Signed-off-by: Stephane Eranian 

Stephane Eranian (4):
  perf: add active_entry list head to struct perf_event
  perf stat: add event unit and scale support
  perf,x86: add Intel RAPL PMU support
  perf,x86: add RAPL hrtimer support

[PATCH v7 3/4] perf,x86: add Intel RAPL PMU support

2013-11-12 Thread Stephane Eranian

This patch adds a new uncore PMU to expose the Intel
RAPL energy consumption counters. Up to 3 counters,
each counting a particular RAPL event are exposed.

The RAPL counters are available on Intel SandyBridge,
IvyBridge, Haswell. The server skus add a 3rd counter.

The following events are available and exposed in sysfs:
- power/energy-cores: power consumption of all cores on socket
- power/energy-pkg: power consumption of all cores + LLC cache
- power/energy-dram: power consumption of DRAM (servers only)

For each event, both the unit (Joules) and the scale (2^-32 J)
are exposed in sysfs for use by perf stat and other tools.
Files are:
/sys/devices/power/events/energy-*.unit
/sys/devices/power/events/energy-*.scale

The RAPL PMU is uncore by nature and is implemented such
that it only works in system-wide mode. Measuring only
one CPU per socket is sufficient. The /sys/devices/power/cpumask
file can be used by tools to figure out which CPUs to monitor
by default. For instance, on a 2-socket system, 2 CPUs
(one on each socket) will be shown.

All the counters measure in the same unit (exposed via sysfs).
The perf_events API exposes all RAPL counters as 64-bit integers
counting in unit of 1/2^32 Joules (about 0.23 nJ). User level tools
must convert the counts by multiplying them by 2^-32 to obtain
Joules. The reason for this is that the kernel avoids
doing floating point math whenever possible because it is
expensive (user floating-point state must be saved). The method
used avoids kernel floating-point usage. There is no loss of
precision. Thanks to PeterZ for suggesting this approach.

To convert the raw count to Watts:
   W = C * 2.3 / (1e10 * time)
or ldexp(C, -32) / time.
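
The step that comes before this, going from the hardware counter unit to
the 2^-32 J unit exposed to tools, can be done with an integer shift. The
sketch below is an assumption-laden illustration, not the code from this
patch: rapl_to_fixed32 is a hypothetical name, and hw_unit stands for the
exponent of the hardware energy unit (the header comment of the new file
gives 1/(2^16) Joules for SandyBridge, i.e. hw_unit = 16).

```c
#include <stdint.h>

/*
 * Convert a raw hardware count in units of 2^-hw_unit J into units of
 * 2^-32 J using only integer arithmetic.  Valid for hw_unit <= 32.
 */
uint64_t rapl_to_fixed32(uint64_t raw, unsigned int hw_unit)
{
    return raw << (32 - hw_unit);
}
```

Because the conversion is a left shift, no bits of the hardware count are
lost, which matches the "no loss of precision" remark above.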

RAPL PMU is a new standalone PMU which registers with the
perf_event core subsystem. The PMU type (attr->type) is
dynamically allocated and is available from /sys/devices/power/type.

Sampling is not supported by the RAPL PMU. There is no
privilege level filtering either.

Signed-off-by: Stephane Eranian 
Signed-off-by: Maria Dimakopoulou 
---
 arch/x86/kernel/cpu/Makefile|2 +-
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |  591 +++
 2 files changed, 592 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/kernel/cpu/perf_event_intel_rapl.c

diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 47b56a7..6359506 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -36,7 +36,7 @@ obj-$(CONFIG_CPU_SUP_AMD) += 
perf_event_amd_iommu.o
 endif
 obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_p6.o perf_event_knc.o 
perf_event_p4.o
 obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_intel_lbr.o 
perf_event_intel_ds.o perf_event_intel.o
-obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_intel_uncore.o
+obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_intel_uncore.o 
perf_event_intel_rapl.o
 endif
 
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c 
b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
new file mode 100644
index 000..cfcd386
--- /dev/null
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -0,0 +1,591 @@
+/*
+ * perf_event_intel_rapl.c: support Intel RAPL energy consumption counters
+ * Copyright (C) 2013 Google, Inc., Stephane Eranian
+ *
+ * Intel RAPL interface is specified in the IA-32 Manual Vol3b
+ * section 14.7.1 (September 2013)
+ *
+ * RAPL provides more controls than just reporting energy consumption
+ * however here we only expose the 3 energy consumption free running
+ * counters (pp0, pkg, dram).
+ *
+ * Each of those counters increments in a power unit defined by the
+ * RAPL_POWER_UNIT MSR. On SandyBridge, this unit is 1/(2^16) Joules
+ * but it can vary.
+ *
+ * Counter to rapl events mappings:
+ *
+ *  pp0 counter: consumption of all physical cores (power plane 0)
+ *   event: rapl_energy_cores
+ *perf code: 0x1
+ *
+ *  pkg counter: consumption of the whole processor package
+ *   event: rapl_energy_pkg
+ *perf code: 0x2
+ *
+ * dram counter: consumption of the dram domain (servers only)
+ *   event: rapl_energy_dram
+ *perf code: 0x3
+ *
+ * We manage those counters as free running (read-only). They may be
+ * used simultaneously by other tools, such as turbostat.
+ *
+ * The events only support system-wide mode counting. There is no
+ * sampling support because it does not make sense and is not
+ * supported by the RAPL hardware.
+ *
+ * Because we want to avoid floating-point operations in the kernel,
+ * the events are all reported in fixed point arithmetic (32.32).
+ * Tools must adjust the counts to convert them to Watts using
+ * the duration of the measurement. Tools may use a function such as
+ * ldexp(raw_count, -32);
+ */
+#include 
+#include 
+#include 
+#include 
+#include "perf_event.h"
+
+/*
+ * RAPL energy status counters
+ */
+#define RAPL_IDX_PP0_NRG_STAT  0   /* all cores */
+#define INTEL_RAPL_PP0 0x1 /*

[PATCH v7 4/4] perf,x86: add RAPL hrtimer support

2013-11-12 Thread Stephane Eranian

The RAPL PMU counters do not interrupt on overflow.
Therefore, the kernel needs to poll the counters
to avoid missing an overflow. This patch adds
the hrtimer code to do this.

The timer interval is calculated at boot time
based on the power unit used by the HW.

There is one hrtimer per CPU to handle simultaneous
use from multiple cores on the same package, as well
as CPU hotplug.
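
To make the interval calculation concrete, the following user-space
sketch mirrors the formula used in rapl_cpu_prepare() in the diff
below. The function name rapl_timer_ms() is hypothetical; the 200W
reference load and the halving of the interval come from the patch
comment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Polling interval in milliseconds for a 32-bit energy counter that
 * increments in units of 2^-hw_unit Joules.
 *
 * The counter wraps after 2^(32 - hw_unit) Joules. At a reference load
 * of 200W (200 Joules/sec) that takes 2^(32 - hw_unit) / 200 seconds;
 * the interval is half of that so an overflow cannot be missed.
 */
static uint64_t rapl_timer_ms(unsigned int hw_unit)
{
	assert(hw_unit <= 32);
	if (hw_unit < 32)
		return 1000 * (1ULL << (32 - hw_unit - 1)) / (2 * 100);
	return 2;
}
```

With the SandyBridge unit of 2^-16 Joules this gives 163840 ms, i.e.
the counters are polled roughly every 164 seconds.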

Signed-off-by: Stephane Eranian 
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   74 ++-
 1 file changed, 72 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
index cfcd386..a6f16ca 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -96,6 +96,8 @@ struct rapl_pmu {
int  n_active; /* number of active events */
struct list_head active_list;
struct pmu   *pmu; /* pointer to rapl_pmu_class */
+   ktime_t  timer_interval; /* in ktime_t unit */
+   struct hrtimer   hrtimer;
 };
 
 static struct pmu rapl_pmu_class;
@@ -158,6 +160,48 @@ static u64 rapl_event_update(struct perf_event *event)
return new_raw_count;
 }
 
+static void rapl_start_hrtimer(struct rapl_pmu *pmu)
+{
+   __hrtimer_start_range_ns(&pmu->hrtimer,
+   pmu->timer_interval, 0,
+   HRTIMER_MODE_REL_PINNED, 0);
+}
+
+static void rapl_stop_hrtimer(struct rapl_pmu *pmu)
+{
+   hrtimer_cancel(&pmu->hrtimer);
+}
+
+static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
+{
+   struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+   struct perf_event *event;
+   unsigned long flags;
+
+   if (!pmu->n_active)
+   return HRTIMER_NORESTART;
+
+   spin_lock_irqsave(&pmu->lock, flags);
+
+   list_for_each_entry(event, &pmu->active_list, active_entry) {
+   rapl_event_update(event);
+   }
+
+   spin_unlock_irqrestore(&pmu->lock, flags);
+
+   hrtimer_forward_now(hrtimer, pmu->timer_interval);
+
+   return HRTIMER_RESTART;
+}
+
+static void rapl_hrtimer_init(struct rapl_pmu *pmu)
+{
+   struct hrtimer *hr = &pmu->hrtimer;
+
+   hrtimer_init(hr, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   hr->function = rapl_hrtimer_handle;
+}
+
 static void __rapl_pmu_event_start(struct rapl_pmu *pmu,
   struct perf_event *event)
 {
@@ -171,6 +215,8 @@ static void __rapl_pmu_event_start(struct rapl_pmu *pmu,
local64_set(&event->hw.prev_count, rapl_read_counter(event));
 
pmu->n_active++;
+   if (pmu->n_active == 1)
+   rapl_start_hrtimer(pmu);
 }
 
 static void rapl_pmu_event_start(struct perf_event *event, int mode)
@@ -195,6 +241,8 @@ static void rapl_pmu_event_stop(struct perf_event *event, 
int mode)
if (!(hwc->state & PERF_HES_STOPPED)) {
WARN_ON_ONCE(pmu->n_active <= 0);
pmu->n_active--;
+   if (pmu->n_active == 0)
+   rapl_stop_hrtimer(pmu);
 
list_del(&event->active_entry);
 
@@ -423,6 +471,9 @@ static void rapl_cpu_exit(int cpu)
 */
if (target >= 0)
perf_pmu_migrate_context(pmu->pmu, cpu, target);
+
+   /* cancel overflow polling timer for CPU */
+   rapl_stop_hrtimer(pmu);
 }
 
 static void rapl_cpu_init(int cpu)
@@ -442,6 +493,7 @@ static int rapl_cpu_prepare(int cpu)
 {
struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
int phys_id = topology_physical_package_id(cpu);
+   u64 ms;
 
if (pmu)
return 0;
@@ -466,6 +518,22 @@ static int rapl_cpu_prepare(int cpu)
pmu->hw_unit = (pmu->hw_unit >> 8) & 0x1FULL;
pmu->pmu = &rapl_pmu_class;
 
+   /*
+* use reference of 200W for scaling the timeout
+* to avoid missing counter overflows.
+* 200W = 200 Joules/sec
+* divide interval by 2 to avoid lockstep (2 * 100)
+* if hw unit is 32, then we use 2 ms 1/200/2
+*/
+   if (pmu->hw_unit < 32)
+   ms = 1000 * (1ULL << (32 - pmu->hw_unit - 1)) / (2 * 100);
+   else
+   ms = 2;
+
+   pmu->timer_interval = ms_to_ktime(ms);
+
+   rapl_hrtimer_init(pmu);
+
/* set RAPL pmu for this cpu for now */
per_cpu(rapl_pmu, cpu) = pmu;
per_cpu(rapl_pmu_to_free, cpu) = NULL;
@@ -580,9 +648,11 @@ static int __init rapl_pmu_init(void)
 
pr_info("RAPL PMU detected, hw unit 2^-%d Joules,"
" API unit is 2^-32 Joules,"
-   " %d fixed counters\n",
+   " %d fixed counters"
+   " %llu ms ovfl timer\n",
pmu->hw_unit,
-   hweight32(rapl_cntr_mask));
+   hweight32(rapl_cntr_mask),
+   ktime_to_ms(pmu->timer_interval));
 
put_online_cpus();
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body

[PATCH v7 1/4] perf: add active_entry list head to struct perf_event

2013-11-12 Thread Stephane Eranian

This patch adds a new field to the struct perf_event.
It is intended to be used to chain events which are
active (enabled). It helps in the hardware layer
for PMUs which do not have actual counter restrictions, i.e.,
free running read-only counters. Active events are chained
as opposed to being tracked via the counter they use.

To save space we use a union with hlist_entry as both
are mutually exclusive (suggested by Jiri Olsa).
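
The space saving from the union can be sketched in user-space C. The
struct definitions below are simplified, hypothetical stand-ins for the
kernel types (the real struct perf_event has many more fields); the
sketch only shows that two mutually exclusive list nodes can share
storage through an anonymous union.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's list node types. */
struct list_head  { struct list_head *next, *prev; };
struct hlist_node { struct hlist_node *next, **pprev; };

/* Both nodes stored side by side. */
struct event_separate {
	struct hlist_node hlist_entry;
	struct list_head  active_entry;
	int nr_siblings;
};

/* Mutually exclusive nodes overlaid in an anonymous union (C11). */
struct event_union {
	union {
		struct hlist_node hlist_entry;
		struct list_head  active_entry;
	};
	int nr_siblings;
};

/* The union variant is smaller because the two nodes share storage. */
static_assert(sizeof(struct event_union) < sizeof(struct event_separate),
	      "union should save space");

/* Equivalent of INIT_LIST_HEAD(): an empty list points to itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}
```

Because the members overlap, only one of hlist_entry and active_entry
may be in use for a given event at any time, which is the constraint
the changelog relies on.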

Signed-off-by: Stephane Eranian 
---
 include/linux/perf_event.h |5 -
 kernel/events/core.c   |1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2e069d1..8f4a70f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -319,7 +319,10 @@ struct perf_event {
 */
struct list_headmigrate_entry;
 
-   struct hlist_node   hlist_entry;
+   union {
+   struct hlist_node   hlist_entry;
+   struct list_headactive_entry;
+   };
int nr_siblings;
int group_flags;
struct perf_event   *group_leader;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4dc078d..90dca5c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6663,6 +6663,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
INIT_LIST_HEAD(&event->event_entry);
INIT_LIST_HEAD(&event->sibling_list);
INIT_LIST_HEAD(&event->rb_entry);
+   INIT_LIST_HEAD(&event->active_entry);
 
init_waitqueue_head(&event->waitq);
init_irq_work(&event->pending, perf_pending_event);
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] FS: BTRFS: fixed coding style issues

2013-11-12 Thread Aldo Iljazi

 David Sterba wrote:

> On Mon, Nov 04, 2013 at 03:27:38PM +0200, Aldo Iljazi wrote:
> > Fixed three coding style issues. Replaced spaces with tabs.
> > 
> > Signed-off-by: Aldo Iljazi 
> > ---
> >  fs/btrfs/dev-replace.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> > index 9efb94e..b2fe609 100644
> > --- a/fs/btrfs/dev-replace.c
> > +++ b/fs/btrfs/dev-replace.c
> > @@ -377,7 +377,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
> > printk_in_rcu(KERN_INFO
> >   "btrfs: dev_replace from %s (devid %llu) to %s) 
> > started\n",
> >   src_device->missing ? "" :
> > -   rcu_str_deref(src_device->name),
> > +   rcu_str_deref(src_device->name),
> 
> What's the change here? I don't think we need to fix whitespace, this
> makes searching in patch history more boring, namely in case where the
> code looks exactly the same before and after the patch.
> 
> The style issues should be best fixed when the patch is about to be
> merged, doing it later like this is kind of not welcome, speaking for
> myself. There are lots of opportunities to do real code cleanups.
> 
> 
> david

Okay thanks.
-- 
Aldo Iljazi

Re: vmstat: On demand vmstat workers V3

2013-11-12 Thread Christoph Lameter

Hmmm... This has been sitting there for over a month. What can I do to
make progress on merging this?

Re: scripts: checkpatch.pl & Lindent (minor complaint)

2013-11-12 Thread Joe Perches

On Tue, 2013-11-12 at 11:51 -0500, Mimi Zohar wrote:
> On Tue, 2013-11-12 at 08:30 -0800, Joe Perches wrote:
[]
> > My suggestion is not to use Lindent.
> > 
> > If you want a semi-automated source-code reformatting tool,
> > use scripts/checkpatch.pl --fix
> 
> Thanks, perhaps this suggestion should be reflected in
> Documentation/CodingStyle.

I sent a patch a while ago

https://lkml.org/lkml/2013/2/11/390


[PATCH] x86, kaslr: use char array to gain sizeof sanity

2013-11-12 Thread Kees Cook

The build_str needs to be char [] not char * for the sizeof() to report
the string length.
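
The difference can be shown with a small stand-alone sketch. The
BUILD_STR value and the helper names are made up for illustration; only
the pointer versus array declarations correspond to the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUILD_STR "3.12.0 (builder@host)"	/* hypothetical build string */

static const char *build_ptr = BUILD_STR;	/* pointer to the literal */
static const char build_arr[] = BUILD_STR;	/* array holding a copy */

/* sizeof on a pointer is the pointer size, whatever the string length. */
static size_t ptr_size(void)
{
	return sizeof(build_ptr);
}

/* sizeof on an array is the string length plus the terminating NUL. */
static size_t arr_size(void)
{
	return sizeof(build_arr);
}
```

With the pointer form, code that hashes sizeof(build_str) bytes would
only cover the first 4 or 8 bytes of the string, depending on the
pointer size.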

Reported-by: Mathias Krause 
Signed-off-by: Kees Cook 
---
 arch/x86/boot/compressed/aslr.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index 8746487fa916..9777ed15fbb9 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -13,7 +13,7 @@
 #include 
 
 /* Simplified build-specific string for starting entropy. */
-static const char *build_str = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
+static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
 #define I8254_PORT_CONTROL 0x43
-- 
1.7.9.5


-- 
Kees Cook
Chrome OS Security

oom-kill && frozen()

2013-11-12 Thread Oleg Nesterov

On 11/12, Oleg Nesterov wrote:
>
> I am also wondering if it makes any sense to turn PF_FROZEN into
> TASK_FROZEN, something like (incomplete, probably racy) patch below.
> Note that it actually adds the new state, not the qualifier.

As for the current usage of PF_FROZEN... David, it seems that
oom_scan_process_thread()->__thaw_task() is dead? Probably this
was fine before, when __thaw_task() cleared the "need to freeze"
condition, iirc it was PF_FROZEN.

But today __thaw_task() can't help, no? the task will simply
schedule() in D state again.

Oleg.

Re: 3.10.16 cgroup_mutex deadlock

2013-11-12 Thread Michal Hocko

On Tue 12-11-13 09:55:30, Shawn Bohrer wrote:
> On Tue, Nov 12, 2013 at 03:31:47PM +0100, Michal Hocko wrote:
> > On Tue 12-11-13 18:17:20, Li Zefan wrote:
> > > Cc more people
> > > 
> > > On 2013/11/12 6:06, Shawn Bohrer wrote:
> > > > Hello,
> > > > 
> > > > This morning I had a machine running 3.10.16 go unresponsive but
> > > > before we killed it we were able to get the information below.  I'm
> > > > not an expert here but it looks like most of the tasks below are
> > > > blocking waiting on the cgroup_mutex.  You can see that the
> > > > resource_alloca:16502 task is holding the cgroup_mutex and that task
> > > > appears to be waiting on a lru_add_drain_all() to complete.
> > 
> > Do you have sysrq+l output as well by any chance? That would tell
> > us what the current CPUs are doing. Dumping all kworker stacks
> > might be helpful as well. We know that lru_add_drain_all waits for
> > schedule_on_each_cpu to return so it is waiting for workers to finish.
> > I would be really curious why some of lru_add_drain_cpu cannot finish
> > properly. The only reason would be that some work item(s) do not get CPU
> > or somebody is holding lru_lock.
> 
> In fact the sys-admin did manage to fire off a sysrq+l, I've put all
> of the info from the syslog below.  I've looked it over and I'm not
> sure it reveals anything.  First looking at the timestamps it appears
> we ran the sysrq+l 19.2 hours after the cgroup_mutex lockup I
> previously sent.

I would expect sysrq+w would still show those kworkers blocked on the
same cgroup mutex?

> I also have atop logs over that whole time period
> that show hundreds of zombie processes which to me indicates that over
> that 19.2 hours systemd remained wedged on the cgroup_mutex.  Looking
> at the backtraces from the sysrq+l it appears most of the CPUs were
> idle

Right so either we managed to sleep with the lru_lock held which sounds
a bit improbable - but who knows - or there is some other problem. I
would expect the latter to be true.

lru_add_drain executes per-cpu with preemption disabled, which means
that its work item cannot be preempted, so the only logical explanation
seems to be that the work item never got scheduled.

> except there are a few where ptpd is trying to step the clock
> with clock_settime.  The ptpd process also appears to get stuck for a
> bit but it looks like it recovers because it moves CPUs and the
> previous CPUs become idle.

It gets a soft lockup because it is waiting for its own IPIs, which got
preempted by the NMI trace dumper. But this is unrelated.

> The fact that ptpd is stepping the clock
> at all at this time means that timekeeping is a mess at this point and
> the system clock is way out of sync.  There are also a few of these
> NMI messages in there that I don't understand but at this point the
> machine was a sinking ship.
> 
> Nov 11 07:03:29 sydtest0 kernel: [764305.327043] Uhhuh. NMI received for unknown reason 21 on CPU 26.
> Nov 11 07:03:29 sydtest0 kernel: [764305.327043] Do you have a strange power saving mode enabled?
> Nov 11 07:03:29 sydtest0 kernel: [764305.327043] Dazed and confused, but trying to continue
> Nov 11 07:03:29 sydtest0 kernel: [764305.327143] Uhhuh. NMI received for unknown reason 31 on CPU 27.
> Nov 11 07:03:29 sydtest0 kernel: [764305.327144] Do you have a strange power saving mode enabled?
> Nov 11 07:03:29 sydtest0 kernel: [764305.327144] Dazed and confused, but trying to continue
> Nov 11 07:03:29 sydtest0 kernel: [764305.327242] Uhhuh. NMI received for unknown reason 31 on CPU 28.
> Nov 11 07:03:29 sydtest0 kernel: [764305.327242] Do you have a strange power saving mode enabled?
> Nov 11 07:03:29 sydtest0 kernel: [764305.327243] Dazed and confused, but trying to continue
> 
> Perhaps there is another task blocking somewhere holding the lru_lock, but at
> this point the machine has been rebooted so I'm not sure how we'd figure out
> what task that might be. Anyway here is the full output of sysrq+l plus
> whatever else ended up in the syslog.

OK. In case the issue happens again, it would be very helpful to get the
kworker and per-cpu stacks. Maybe Tejun can help with some waitqueue
debugging tricks.
-- 
Michal Hocko
SUSE Labs

Re: [PATCH 05/11] fuse: Trust kernel i_mtime only -v2

2013-11-12 Thread Miklos Szeredi

On Thu, Oct 10, 2013 at 05:10:56PM +0400, Maxim Patlasov wrote:
> Let the kernel maintain i_mtime locally:
>  - clear S_NOCMTIME
>  - implement i_op->update_time()
>  - flush mtime on fsync and last close
>  - update i_mtime explicitly on truncate and fallocate
> 
> Fuse inode flag FUSE_I_MTIME_DIRTY serves as indication that local i_mtime
> should be flushed to the server eventually.
> 
> Changed in v2 (thanks to Brian):
>  - renamed FUSE_I_MTIME_UPDATED to FUSE_I_MTIME_DIRTY
>  - simplified fuse_set_mtime_local()
>  - abandoned optimizations: clearing the flag on some operations (like
>direct write) is doable, but may lead to unexpected changes of
>user-visible mtime.
> 
> Signed-off-by: Maxim Patlasov 
> ---
>  fs/fuse/dir.c|  109 
> --
>  fs/fuse/file.c   |   30 +--
>  fs/fuse/fuse_i.h |6 ++-
>  fs/fuse/inode.c  |   13 +-
>  4 files changed, 138 insertions(+), 20 deletions(-)
> 
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index f022968..eda248b 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -857,8 +857,11 @@ static void fuse_fillattr(struct inode *inode, struct 
> fuse_attr *attr,
>   struct fuse_conn *fc = get_fuse_conn(inode);
>  
>   /* see the comment in fuse_change_attributes() */
> - if (fc->writeback_cache && S_ISREG(inode->i_mode))
> + if (fc->writeback_cache && S_ISREG(inode->i_mode)) {
>   attr->size = i_size_read(inode);
> + attr->mtime = inode->i_mtime.tv_sec;
> + attr->mtimensec = inode->i_mtime.tv_nsec;
> + }
>  
>   stat->dev = inode->i_sb->s_dev;
>   stat->ino = attr->ino;
> @@ -1582,6 +1585,82 @@ void fuse_release_nowrite(struct inode *inode)
>   spin_unlock(&fc->lock);
>  }
>  
> +static void fuse_setattr_fill(struct fuse_conn *fc, struct fuse_req *req,
> +   struct inode *inode,
> +   struct fuse_setattr_in *inarg_p,
> +   struct fuse_attr_out *outarg_p)
> +{
> + req->in.h.opcode = FUSE_SETATTR;
> + req->in.h.nodeid = get_node_id(inode);
> + req->in.numargs = 1;
> + req->in.args[0].size = sizeof(*inarg_p);
> + req->in.args[0].value = inarg_p;
> + req->out.numargs = 1;
> + if (fc->minor < 9)
> + req->out.args[0].size = FUSE_COMPAT_ATTR_OUT_SIZE;
> + else
> + req->out.args[0].size = sizeof(*outarg_p);
> + req->out.args[0].value = outarg_p;
> +}
> +
> +/*
> + * Flush inode->i_mtime to the server
> + */
> +int fuse_flush_mtime(struct file *file, bool nofail)
> +{
> + struct inode *inode = file->f_mapping->host;
> + struct fuse_inode *fi = get_fuse_inode(inode);
> + struct fuse_conn *fc = get_fuse_conn(inode);
> + struct fuse_req *req = NULL;
> + struct fuse_setattr_in inarg;
> + struct fuse_attr_out outarg;
> + int err;
> +
> + if (nofail) {
> + req = fuse_get_req_nofail_nopages(fc, file);
> + } else {
> + req = fuse_get_req_nopages(fc);
> + if (IS_ERR(req))
> + return PTR_ERR(req);
> + }
> +
> + memset(&inarg, 0, sizeof(inarg));
> + memset(&outarg, 0, sizeof(outarg));
> +
> + inarg.valid |= FATTR_MTIME;
> + inarg.mtime = inode->i_mtime.tv_sec;
> + inarg.mtimensec = inode->i_mtime.tv_nsec;
> +
> + fuse_setattr_fill(fc, req, inode, &inarg, &outarg);
> + fuse_request_send(fc, req);
> + err = req->out.h.error;
> + fuse_put_request(fc, req);
> +
> + if (!err)
> + clear_bit(FUSE_I_MTIME_DIRTY, &fi->state);

Doing the test and the clear separately opens a huge race window when i_mtime
modifications are bound to get lost.

> +
> + return err;
> +}
> +
> +/*
> + * S_NOCMTIME is clear, so we need to update inode->i_mtime manually.
> + */
> +static void fuse_set_mtime_local(struct iattr *iattr, struct inode *inode)
> +{
> + unsigned ivalid = iattr->ia_valid;
> + struct fuse_inode *fi = get_fuse_inode(inode);
> +
> + if ((ivalid & ATTR_MTIME) && (ivalid & ATTR_MTIME_SET)) {
> + /* client fs has just set mtime to iattr->ia_mtime */
> + inode->i_mtime = iattr->ia_mtime;
> + clear_bit(FUSE_I_MTIME_DIRTY, &fi->state);

This is protected by i_mutex, so it should be safe.

> + } else if ((ivalid & ATTR_MTIME) || (ivalid & ATTR_SIZE)) {
> + /* client fs doesn't know that we're updating i_mtime */

If so, why not tell the client fs to update mtime?

> + inode->i_mtime = current_fs_time(inode->i_sb);
> + set_bit(FUSE_I_MTIME_DIRTY, &fi->state);
> + }
> +}
> +
>  /*
>   * Set attributes, and at the same time refresh them.
>   *
> @@ -1641,17 +1720,7 @@ int fuse_do_setattr(struct inode *inode, struct iattr 
> *attr,
>   inarg.valid |= FATTR_LOCKOWNER;
>   inarg.lock_owner = fuse_lock_owner_id(fc, current->files);
>   }
> - req->in.h.opcode = FUSE_SETATTR;
> - req->in.h.nodeid =

Re: scripts: checkpatch.pl & Lindent (minor complaint)

2013-11-12 Thread Mimi Zohar

On Tue, 2013-11-12 at 08:30 -0800, Joe Perches wrote:
> On Tue, 2013-11-12 at 11:09 -0500, Mimi Zohar wrote:
> > On Tue, 2013-11-12 at 07:44 -0800, Joe Perches wrote:
> > > On Tue, 2013-11-12 at 09:42 -0500, Mimi Zohar wrote: 
> > > > scripts/Lindent and scripts/checkpatch disagree whether the fields in a
> > > > statically initialized array should be blank separated.  
> > > > 
> > > > static struct ima_rule_entry default_rules[] = {
> > > > {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags = 
> > > > IMA_FSMAGIC},
> > > > 
> > > > Lindent adds a blank before '.fsmagic', which checkpatch then complains
> > > > about (eg. commit 75834fc3).
> > > 
> > > Perhaps I don't understand what you mean.
> > 
> > > Lindent _doesn't_ add a blank and checkpatch
> > > seems to do the right thing here.
> > 
> > Sorry, my mistake.  It's the reverse.  Checkpatch complains about the
> > missing blank, which Lindent then removes.
> 
> My suggestion is not to use Lindent.
> 
> If you want a semi-automated source-code reformatting tool,
> use scripts/checkpatch.pl --fix

Thanks, perhaps this suggestion should be reflected in
Documentation/CodingStyle.

Mimi

Re: [RFC][PATCH v5 00/14] sched: packing tasks

2013-11-12 Thread Arjan van de Ven


On 11/11/2013 10:18 AM, Catalin Marinas wrote:

The ordering is based on the actual C-state, so a simple way is to wake
up the CPU in the shallowest C-state. With asymmetric configurations
(big.LITTLE) we have different costs for the same C-state, so this would
come in handy.


btw I was considering something else; in practice CPUs will be in the deepest 
state..
... at which point I was going to go with some other metrics of what is best 
from a platform level

Crypto Update for 3.13

2013-11-12 Thread Herbert Xu

Hi Linus:

Here is the crypto update for 3.13:

* Made x86 ablk_helper generic for ARM.
* Phase out chainiv in favour of eseqiv (affects IPsec).
* Fixed aes-cbc IV corruption on s390.
* Added constant-time crypto_memneq which replaces memcmp.

* Fixed aes-ctr in omap-aes.
* Added OMAP3 ROM RNG support.
* Add PRNG support for MSM SoC's
* Add and use Job Ring API in caam.

* Misc fixes.
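
For readers unfamiliar with the constant-time comparison item above,
the idea can be sketched in user-space C. The function ct_memneq() is a
hypothetical illustration, not the kernel's crypto_memneq
implementation: it always visits every byte, so its run time does not
depend on where the first mismatch is, unlike memcmp which may return
early.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 0 if the two buffers are equal and nonzero otherwise.
 * All n bytes are always examined; differences are accumulated with
 * OR instead of branching on the first mismatch. The volatile
 * qualifiers are a simple attempt to discourage the compiler from
 * short-circuiting the loop.
 */
static unsigned int ct_memneq(const void *a, const void *b, size_t n)
{
	const volatile unsigned char *pa = a;
	const volatile unsigned char *pb = b;
	unsigned char diff = 0;
	size_t i;

	assert(a != NULL && b != NULL);
	for (i = 0; i < n; i++)
		diff |= (unsigned char)(pa[i] ^ pb[i]);
	return diff;
}
```

This matters when comparing secrets such as MACs, where the time taken
by an early-exit comparison can leak how many leading bytes matched.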


Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git



Alex Porosanu (7):
  crypto: caam - fix RNG state handle instantiation descriptor
  crypto: caam - fix hash, alg and rng registration if CAAM driver not 
initialized
  crypto: caam - fix RNG4 instantiation
  crypto: caam - split RNG4 instantiation function
  crypto: caam - uninstantiate RNG state handle 0 if instantiated by caam 
driver
  crypto: caam - fix RNG4 AAI defines
  crypto: caam - enable instantiation of all RNG4 state handles

Ard Biesheuvel (2):
  crypto: create generic version of ablk_helper
  crypto: move x86 to the generic version of ablk_helper

Ben Hutchings (1):
  hwrng: via-rng - Mark device ID table as __maybe_unused

Fabio Estevam (4):
  crypto: dcp - Use devm_ioremap_resource()
  crypto: dcp - Use devm_request_irq()
  crypto: dcp - Fix the path for releasing the resources
  crypto: dcp - Check the return value from devm_ioremap_resource()

Herbert Xu (2):
  crypto: skcipher - Use eseqiv even on UP machines
  crypto: s390 - Fix aes-cbc IV corruption

James Yonan (1):
  crypto: crypto_memneq - add equality testing of memory regions w/o timing 
leaks

Joel Fernandes (1):
  crypto: omap-aes - Fix CTR mode counter length

Joni Lapilainen (1):
  crypto: omap-sham - Add missing modalias

Jussi Kivilinna (2):
  crypto: sha256_ssse3 - use correct module alias for sha224
  crypto: x86 - restore avx2_supported check

Linus Walleij (1):
  crypto: tegra - use kernel entropy instead of ad-hoc

Mathias Krause (6):
  crypto: authenc - Export key parsing helper function
  crypto: authencesn - Simplify key parsing
  crypto: ixp4xx - Simplify and harden key parsing
  crypto: picoxcell - Simplify and harden key parsing
  crypto: talitos - Simplify key parsing
  padata: make the sequence counter an atomic_t

Michael Ellerman (2):
  hwrng: pseries - Use KBUILD_MODNAME in pseries-rng.c
  hwrng: pseries - Return errors to upper levels in pseries-rng.c

Michael Opdenacker (1):
  crypto: mv_cesa: remove deprecated IRQF_DISABLED

Neil Horman (1):
  crypto: ansi_cprng - Fix off by one error in non-block size request

Oliver Neukum (1):
  crypto: sha256_ssse3 - also test for BMI2

Pali Rohár (1):
  hwrng: OMAP3 ROM Random Number Generator support

Ruchika Gupta (3):
  crypto: caam - Add Platform driver for Job Ring
  crypto: caam - Add API's to allocate/free Job Rings
  crypto: caam - Modify the interface layers to use JR API's

Sachin Kamat (7):
  crypto: mv_cesa - Staticize local symbols
  crypto: omap-aes - Staticize local symbols
  crypto: tegra-aes - Staticize tegra_aes_cra_exit
  crypto: tegra-aes - Fix NULL pointer dereference
  crypto: tegra-aes - Use devm_clk_get
  crypto: sahara - Remove redundant of_match_ptr
  crypto: mv_cesa - Remove redundant of_match_ptr

Stanimir Varbanov (2):
  ARM: DT: msm: Add Qualcomm's PRNG driver binding document
  hwrng: msm - Add PRNG support for MSM SoC's

Stephen Warren (1):
  ARM: tegra: remove tegra_chip_uid()

Yashpal Dutta (1):
  crypto: caam - map src buffer before access

kbuild test robot (1):
  crypto: ablk_helper - Replace memcpy with struct assignment

 .../devicetree/bindings/rng/qcom,prng.txt  |   17 +
 arch/arm/mach-tegra/fuse.c |   10 -
 arch/s390/crypto/aes_s390.c|   19 +-
 arch/x86/crypto/Makefile   |3 +-
 arch/x86/crypto/aesni-intel_glue.c |2 +-
 arch/x86/crypto/camellia_aesni_avx2_glue.c |2 +-
 arch/x86/crypto/camellia_aesni_avx_glue.c  |2 +-
 arch/x86/crypto/cast5_avx_glue.c   |2 +-
 arch/x86/crypto/cast6_avx_glue.c   |2 +-
 arch/x86/crypto/serpent_avx2_glue.c|2 +-
 arch/x86/crypto/serpent_avx_glue.c |2 +-
 arch/x86/crypto/serpent_sse2_glue.c|2 +-
 arch/x86/crypto/sha256_ssse3_glue.c|4 +-
 arch/x86/crypto/twofish_avx_glue.c |2 +-
 arch/x86/include/asm/simd.h|   11 +
 crypto/Kconfig |   23 +-
 crypto/Makefile|8 +-
 {arch/x86/crypto => crypto}/ablk_helper.c  |   13 +-
 crypto/ablkcipher.c|   21 +-
 crypto/ansi_cprng.c|4 +-
 crypto/asymmetric_keys/rsa.c

[PATCH v2] sched: Check sched_domain before computing group power.

2013-11-12 Thread Srikar Dronamraju

After Commit-id 863bffc80898 (sched/fair: Fix group power_orig
computation), we might end up computing group power before the
sched_domain for a cpu is updated.

Update with cpu_power, if rq->sd is not yet updated.

Signed-off-by: Srikar Dronamraju 
---
Changelog since v1: Fix divide by zero errors that can result because
power/power_orig was set to 0.

 kernel/sched/fair.c |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index df77c60..8d92853 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5354,8 +5354,16 @@ void update_group_power(struct sched_domain *sd, int cpu)
 */
 
for_each_cpu(cpu, sched_group_cpus(sdg)) {
-   struct sched_group *sg = cpu_rq(cpu)->sd->groups;
+   struct rq *rq = cpu_rq(cpu);
+   struct sched_group *sg;
 
+   if (!rq->sd) {
+   power_orig += power_of(cpu);
+   power += power_of(cpu);
+   continue;
+   }
+
+   sg = rq->sd->groups;
power_orig += sg->sgp->power_orig;
power += sg->sgp->power;
}
-- 
1.7.1

Re: [PATCH] perf trace: Add summary only option

2013-11-12 Thread Pekka Enberg


On 11/12/2013 06:31 PM, David Ahern wrote:

Per request from Pekka make --summary a summary only option meaning do not
show the individual system calls. Add another option to see all syscalls
along with the summary. In addition use 's' and 'S' as shortcuts for the
options.

Signed-off-by: David Ahern 
Cc: Pekka Enberg 
Cc: Ingo Molnar 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Adrian Hunter 
---


Thanks David!

Tested-by: Pekka Enberg 

[PATCH] perf trace: Add summary only option

2013-11-12 Thread David Ahern

Per request from Pekka make --summary a summary only option meaning do not
show the individual system calls. Add another option to see all syscalls
along with the summary. In addition use 's' and 'S' as shortcuts for the
options.

Signed-off-by: David Ahern 
Cc: Pekka Enberg 
Cc: Ingo Molnar 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Adrian Hunter 
---
 tools/perf/Documentation/perf-trace.txt | 10 --
 tools/perf/builtin-trace.c  | 16 +---
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index 7b0497f95a75..fae38d9a44a4 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -93,9 +93,15 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
 --comm::
 Show process COMM right beside its ID, on by default, disable with --no-comm.
 
+-s::
 --summary::
-   Show a summary of syscalls by thread with min, max, and average times (in
-msec) and relative stddev.
+   Show only a summary of syscalls by thread with min, max, and average times
+(in msec) and relative stddev.
+
+-S::
+--with-summary::
+   Show all syscalls followed by a summary by thread with min, max, and
+average times (in msec) and relative stddev.
 
 --tool_stats::
Show tool stats such as number of times fd->pathname was discovered thru
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c3008b1c369c..9da374cdc23a 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1155,6 +1155,7 @@ struct trace {
bool    sched;
bool    multiple_threads;
bool    summary;
+   bool    summary_only;
bool    show_comm;
bool    show_tool_stats;
double  duration_filter;
@@ -1598,7 +1599,7 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel,
   args, trace, thread);
 
if (!strcmp(sc->name, "exit_group") || !strcmp(sc->name, "exit")) {
-   if (!trace->duration_filter) {
+   if (!trace->duration_filter && !trace->summary_only) {
trace__fprintf_entry_head(trace, thread, 1, sample->time, trace->output);
fprintf(trace->output, "%-70s\n", ttrace->entry_str);
}
@@ -1651,6 +1652,9 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
} else if (trace->duration_filter)
goto out;
 
+   if (trace->summary_only)
+   goto out;
+
trace__fprintf_entry_head(trace, thread, duration, sample->time, trace->output);
 
if (ttrace->entry_pending) {
@@ -2265,8 +2269,10 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_INCR('v', "verbose", &verbose, "be more verbose"),
OPT_BOOLEAN('T', "time", &trace.full_time,
"Show full timestamp, not time relative to first start"),
-   OPT_BOOLEAN(0, "summary", &trace.summary,
-   "Show syscall summary with statistics"),
+   OPT_BOOLEAN('s', "summary", &trace.summary_only,
+   "Show only syscall summary with statistics"),
+   OPT_BOOLEAN('S', "with-summary", &trace.summary,
+   "Show all syscalls and summary with statistics"),
OPT_END()
};
int err;
@@ -2277,6 +2283,10 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
 
argc = parse_options(argc, argv, trace_options, trace_usage, 0);
 
+   /* summary_only implies summary option, but don't overwrite summary if set */
+   if (trace.summary_only)
+   trace.summary = trace.summary_only;
+
if (output_name != NULL) {
err = trace__open_output(&trace, output_name);
if (err < 0) {
-- 
1.8.3.4 (Apple Git-47)
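
Note on the option semantics in the patch above: a minimal user-space sketch, assuming a hypothetical struct trace_opts that holds only the two flags the patch touches. The helper names are made up for illustration and are not part of perf. It shows that -s turns the summary on and suppresses per-syscall output, while -S leaves per-syscall output in place.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two flags the patch touches in struct trace. */
struct trace_opts {
	bool summary;       /* print the per-thread summary at the end */
	bool summary_only;  /* suppress the individual syscall lines */
};

/* Mirrors the post-parse fixup: summary_only implies summary. */
static void trace_opts_fixup(struct trace_opts *o)
{
	if (o->summary_only)
		o->summary = true;
}

/* True when individual syscall lines should still be printed. */
static bool trace_opts_show_syscalls(const struct trace_opts *o)
{
	return !o->summary_only;
}
```

With -s both flags end up set and no syscall lines are shown; with -S only summary is set, so syscall lines and the summary are both printed.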


Re: [PATCH] ALSA: pcm: retrieve true appl_ptr where stream was unlocked

2013-11-12 Thread Takashi Iwai

At 12 Nov 2013 16:14:17 +,
Oskar Schirmer wrote:
> 
> Calculated for data transfer, the local variable appl_ptr is reused,
> changed and written back later, though the lock was not held during
> the transfer earlier. Admitted, destiny of the lock is not obvious
> at all, but this looks much like a race condition candidate, so make
> sure to have some original appl_ptr value to work on.

No, appl_ptr doesn't have to be re-read at that point.  Otherwise the
value will become inconsistent.

This is the write operation to a stream and it's exclusive.  The lock
is needed to protect the PCM state change (e.g. stopped from an irq
handler).


thanks,

Takashi

> 
> Two occurrences.
> 
> Signed-off-by: Oskar Schirmer 
> ---
>  sound/core/pcm_lib.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
> index 6e03b46..47e836a 100644
> --- a/sound/core/pcm_lib.c
> +++ b/sound/core/pcm_lib.c
> @@ -2051,6 +2051,7 @@ static snd_pcm_sframes_t snd_pcm_lib_write1(struct snd_pcm_substream *substream,
>   default:
>   break;
>   }
> + appl_ptr = runtime->control->appl_ptr;
>   appl_ptr += frames;
>   if (appl_ptr >= runtime->boundary)
>   appl_ptr -= runtime->boundary;
> @@ -2283,6 +2284,7 @@ static snd_pcm_sframes_t snd_pcm_lib_read1(struct snd_pcm_substream *substream,
>   default:
>   break;
>   }
> + appl_ptr = runtime->control->appl_ptr;
>   appl_ptr += frames;
>   if (appl_ptr >= runtime->boundary)
>   appl_ptr -= runtime->boundary;
> -- 
> 1.7.9.5
> 

Re: scripts: checkpatch.pl & Lindent (minor complaint)

2013-11-12 Thread Joe Perches

On Tue, 2013-11-12 at 11:09 -0500, Mimi Zohar wrote:
> On Tue, 2013-11-12 at 07:44 -0800, Joe Perches wrote:
> > On Tue, 2013-11-12 at 09:42 -0500, Mimi Zohar wrote: 
> > > scripts/Lindent and scripts/checkpatch disagree whether the fields in a
> > > statically initialized array should be blank separated.  
> > > 
> > > static struct ima_rule_entry default_rules[] = {
> > > {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags = 
> > > IMA_FSMAGIC},
> > > 
> > > Lindent adds a blank before '.fsmagic', which checkpatch then complains
> > > about (eg. commit 75834fc3).
> > 
> > Perhaps I don't understand what you mean.
> 
> > Lindent _doesn't_ add a blank and checkpatch
> > seems to do the right thing here.
> 
> Sorry, my mistake.  It's the reverse.  Checkpatch complains about the
> missing blank, which Lindent then removes.

My suggestion is not to use Lindent.

If you want a semi-automated source-code reformatting tool,
use scripts/checkpatch.pl --fix



Re: Fwd: [PATCH 2/8] watchdog: davinci: use davinci_wdt_device structure to hold device data

2013-11-12 Thread Santosh Shilimkar

On Tuesday 12 November 2013 11:27 AM, Guenter Roeck wrote:
> On Tue, Nov 12, 2013 at 10:37:04AM -0500, Santosh Shilimkar wrote:
>> On Wednesday 06 November 2013 06:31 AM, ivan.khoronzhuk wrote:
>>> Some SoCs, like Keystone 2, can support more than one WDT and each
>>> watchdog device has to use its own base address, clock source,
>>> wdd device, so add new davinci_wdt_device structure to hold device
>> In commit avoid struct names ;)
>> s/wdd/watchdog device
>>> data.
>>>
>>> Signed-off-by: Ivan Khoronzhuk 
>>> ---
>>>  drivers/watchdog/davinci_wdt.c |   74 
>>> ++--
>>>  1 file changed, 48 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/watchdog/davinci_wdt.c b/drivers/watchdog/davinci_wdt.c
>>> index a6eef71..1fc2093 100644
>>> --- a/drivers/watchdog/davinci_wdt.c
>>> +++ b/drivers/watchdog/davinci_wdt.c
>>
>> [...]
>>
>>> @@ -123,14 +135,21 @@ static int davinci_wdt_probe(struct platform_device *pdev)
>>> struct device *dev = &pdev->dev;
>>> struct resource  *wdt_mem;
>>> struct watchdog_device *wdd;
>>> +   struct davinci_wdt_device *davinci_wdt;
>>> +
>>> +   davinci_wdt = devm_kzalloc(dev, sizeof(*davinci_wdt), GFP_KERNEL);
>>> +   if (!davinci_wdt)
>>> +   return -ENOMEM;
>>>  
>>> -   wdt_clk = devm_clk_get(dev, NULL);
>>> -   if (WARN_ON(IS_ERR(wdt_clk)))
>>> -   return PTR_ERR(wdt_clk);
>>> +   davinci_wdt->clk = devm_clk_get(dev, NULL);
>>> +   if (WARN_ON(IS_ERR(davinci_wdt->clk)))
>>> +   return PTR_ERR(davinci_wdt->clk);
>>>  
>>> -   clk_prepare_enable(wdt_clk);
>>> +   clk_prepare_enable(davinci_wdt->clk);
>>>  
>>> -   wdd = &wdt_wdd;
>>> +   platform_set_drvdata(pdev, davinci_wdt);
>>> +
>>> +   wdd = &davinci_wdt->wdd;
>>> wdd->info   = &davinci_wdt_info;
>>> wdd->ops= &davinci_wdt_ops;
>>> wdd->min_timeout= 1;
>>> @@ -142,12 +161,13 @@ static int davinci_wdt_probe(struct platform_device *pdev)
>>>  
>>> dev_info(dev, "heartbeat %d sec\n", wdd->timeout);
>>>  
>>> +   watchdog_set_drvdata(wdd, davinci_wdt);
>>> watchdog_set_nowayout(wdd, WATCHDOG_NOWAYOUT);
>>>  
>>> wdt_mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>>> -   wdt_base = devm_ioremap_resource(dev, wdt_mem);
>>> -   if (IS_ERR(wdt_base))
>>> -   return PTR_ERR(wdt_base);
>>> +   davinci_wdt->base = devm_ioremap_resource(dev, wdt_mem);
>>> +   if (IS_ERR(davinci_wdt->base))
>>> +   return PTR_ERR(davinci_wdt->base);
>> You should free up davinci_wdt memory before returning, right ?
>>
> No, devm should take care of that.
> 
You are right. I didn't pay attention to the devm_*() usage.

Regards,
Santosh
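
Note on the devm point settled above: a minimal user-space sketch of the idea behind device-managed allocations, assuming a hypothetical struct model_dev and model_* helpers that only imitate the pattern. They are not the kernel's devres implementation. Each allocation is recorded on the device, so an early error return from probe needs no explicit free. Whoever owns the device releases everything in one place.

```c
#include <stdlib.h>

/* Hypothetical user-space model of a device that tracks its allocations. */
struct res_node {
	struct res_node *next;
	void *mem;
};

struct model_dev {
	struct res_node *resources;
	int live;   /* allocations recorded and not yet released */
};

/* Allocate zeroed memory and record it on the device (devm_kzalloc-like). */
static void *model_devm_zalloc(struct model_dev *dev, size_t size)
{
	struct res_node *node = malloc(sizeof(*node));

	if (!node)
		return NULL;
	node->mem = calloc(1, size);
	if (!node->mem) {
		free(node);
		return NULL;
	}
	node->next = dev->resources;
	dev->resources = node;
	dev->live++;
	return node->mem;
}

/* Release everything recorded on the device (failed probe or unbind). */
static void model_dev_release_all(struct model_dev *dev)
{
	while (dev->resources) {
		struct res_node *node = dev->resources;

		dev->resources = node->next;
		free(node->mem);
		free(node);
		dev->live--;
	}
}

/* A probe that can fail after allocating; it never frees anything itself. */
static int model_probe(struct model_dev *dev, int fail_late)
{
	void *priv = model_devm_zalloc(dev, 64);

	if (!priv)
		return -1;
	if (fail_late)
		return -2;   /* plain return: the owner releases managed memory */
	return 0;
}
```

In the sketch the caller of model_probe() plays the role of the driver core: on a failed probe it calls model_dev_release_all(), which is why the error paths in the quoted patch can simply return.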


Re: [PATCH 4/7] clocksource/cadence_ttc: Adjust interval in clock notifier

2013-11-12 Thread Daniel Lezcano


On 11/08/2013 10:21 PM, Soren Brinkmann wrote:

The clockevent has to be reprogrammed if the timer's input
clock frequency changes and the timer is in periodic mode, in order to
maintain the correct timer interval.

Signed-off-by: Soren Brinkmann 
---
  drivers/clocksource/cadence_ttc_timer.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/clocksource/cadence_ttc_timer.c b/drivers/clocksource/cadence_ttc_timer.c
index a92350b55d32..68a336038d8f 100644
--- a/drivers/clocksource/cadence_ttc_timer.c
+++ b/drivers/clocksource/cadence_ttc_timer.c
@@ -338,6 +338,10 @@ static int ttc_rate_change_clockevent_cb(struct notifier_block *nb,
/* update cached frequency */
ttc->freq = ndata->new_rate;

+   if (ttcce->ce.mode == CLOCK_EVT_MODE_PERIODIC)
+   ttc_set_interval(ttc, DIV_ROUND_CLOSEST(ttc->freq,
+   PRESCALE * HZ));
+


Couldn't this be racy?


/* fall through */
}
case PRE_RATE_CHANGE:
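
Note on the interval arithmetic in the hunk above: a small worked sketch, assuming example values for PRESCALE and HZ (2048 and 100 here are assumptions for illustration; the real values come from the driver and the kernel configuration). The helper reimplements only the unsigned case of DIV_ROUND_CLOSEST. It shows why a periodic timer has to be reprogrammed when the input clock changes: the tick interval is derived from the frequency.

```c
/* Assumed example values; the real PRESCALE and HZ are defined elsewhere. */
#define EXAMPLE_PRESCALE 2048UL
#define EXAMPLE_HZ       100UL

/* Unsigned-only equivalent of DIV_ROUND_CLOSEST(x, d). */
static unsigned long div_round_closest(unsigned long x, unsigned long d)
{
	return (x + d / 2) / d;
}

/* Counter ticks per jiffy for a given timer input frequency in Hz. */
static unsigned long ttc_interval_for(unsigned long freq)
{
	return div_round_closest(freq, EXAMPLE_PRESCALE * EXAMPLE_HZ);
}
```

Under these assumptions a 133000000 Hz input gives an interval of 649 counts and a 100000000 Hz input gives 488, so keeping the old interval after a rate change would make the periodic tick run at the wrong rate.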




--
  Linaro.org │ Open source software for ARM SoCs



Re: Fwd: [PATCH 2/8] watchdog: davinci: use davinci_wdt_device structure to hold device data

2013-11-12 Thread Guenter Roeck

On Tue, Nov 12, 2013 at 10:37:04AM -0500, Santosh Shilimkar wrote:
> On Wednesday 06 November 2013 06:31 AM, ivan.khoronzhuk wrote:
> > Some SoCs, like Keystone 2, can support more than one WDT and each
> > watchdog device has to use its own base address, clock source,
> > wdd device, so add new davinci_wdt_device structure to hold device
> In commit avoid struct names ;)
> s/wdd/watchdog device
> > data.
> > 
> > Signed-off-by: Ivan Khoronzhuk 
> > ---
> >  drivers/watchdog/davinci_wdt.c |   74 
> > ++--
> >  1 file changed, 48 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/watchdog/davinci_wdt.c b/drivers/watchdog/davinci_wdt.c
> > index a6eef71..1fc2093 100644
> > --- a/drivers/watchdog/davinci_wdt.c
> > +++ b/drivers/watchdog/davinci_wdt.c
> 
> [...]
> 
> > @@ -123,14 +135,21 @@ static int davinci_wdt_probe(struct platform_device *pdev)
> > struct device *dev = &pdev->dev;
> > struct resource  *wdt_mem;
> > struct watchdog_device *wdd;
> > +   struct davinci_wdt_device *davinci_wdt;
> > +
> > +   davinci_wdt = devm_kzalloc(dev, sizeof(*davinci_wdt), GFP_KERNEL);
> > +   if (!davinci_wdt)
> > +   return -ENOMEM;
> >  
> > -   wdt_clk = devm_clk_get(dev, NULL);
> > -   if (WARN_ON(IS_ERR(wdt_clk)))
> > -   return PTR_ERR(wdt_clk);
> > +   davinci_wdt->clk = devm_clk_get(dev, NULL);
> > +   if (WARN_ON(IS_ERR(davinci_wdt->clk)))
> > +   return PTR_ERR(davinci_wdt->clk);
> >  
> > -   clk_prepare_enable(wdt_clk);
> > +   clk_prepare_enable(davinci_wdt->clk);
> >  
> > -   wdd = &wdt_wdd;
> > +   platform_set_drvdata(pdev, davinci_wdt);
> > +
> > +   wdd = &davinci_wdt->wdd;
> > wdd->info   = &davinci_wdt_info;
> > wdd->ops= &davinci_wdt_ops;
> > wdd->min_timeout= 1;
> > @@ -142,12 +161,13 @@ static int davinci_wdt_probe(struct platform_device *pdev)
> >  
> > dev_info(dev, "heartbeat %d sec\n", wdd->timeout);
> >  
> > +   watchdog_set_drvdata(wdd, davinci_wdt);
> > watchdog_set_nowayout(wdd, WATCHDOG_NOWAYOUT);
> >  
> > wdt_mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > -   wdt_base = devm_ioremap_resource(dev, wdt_mem);
> > -   if (IS_ERR(wdt_base))
> > -   return PTR_ERR(wdt_base);
> > +   davinci_wdt->base = devm_ioremap_resource(dev, wdt_mem);
> > +   if (IS_ERR(davinci_wdt->base))
> > +   return PTR_ERR(davinci_wdt->base);
> You should free up davinci_wdt memory before returning, right ?
> 
No, devm should take care of that.

Guenter

> Other than that patch looks fine to me. With above fixed,
> Acked-by: Santosh Shilimkar 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-watchdog" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Re: scripts: checkpatch.pl & Lindent (minor complaint)

2013-11-12 Thread Mimi Zohar

On Tue, 2013-11-12 at 07:44 -0800, Joe Perches wrote:
> On Tue, 2013-11-12 at 09:42 -0500, Mimi Zohar wrote: 
> > scripts/Lindent and scripts/checkpatch disagree whether the fields in a
> > statically initialized array should be blank separated.  
> > 
> > static struct ima_rule_entry default_rules[] = {
> > {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags = 
> > IMA_FSMAGIC},
> > 
> > Lindent adds a blank before '.fsmagic', which checkpatch then complains
> > about (eg. commit 75834fc3).
> 
> Perhaps I don't understand what you mean.

> Lindent _doesn't_ add a blank and checkpatch
> seems to do the right thing here.

Sorry, my mistake.  It's the reverse.  Checkpatch complains about the
missing blank, which Lindent then removes.

#40: FILE: security/integrity/ima/ima_policy.c:52:
+   {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags =
IMA_FSMAGIC},

ERROR: space required after that ',' (ctx:VxV)
#40: FILE: security/integrity/ima/ima_policy.c:52:
+   {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags =
IMA_FSMAGIC},
   ^
ERROR: space required after that ',' (ctx:VxV)

Mimi


Re: [PATCH 3/7] clocksource/cadence_ttc: Store timer frequency in driver data

2013-11-12 Thread Daniel Lezcano


On 11/08/2013 10:21 PM, Soren Brinkmann wrote:

It is not allowed to call clk_get_rate() from interrupt context. To
avoid such calls the timer input frequency is stored in the driver's
data struct which makes it accessible to the driver in any context.

Signed-off-by: Soren Brinkmann 
---


Acked-by: Daniel Lezcano 


  drivers/clocksource/cadence_ttc_timer.c | 21 +
  1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/clocksource/cadence_ttc_timer.c b/drivers/clocksource/cadence_ttc_timer.c
index b2bb3a4bc205..a92350b55d32 100644
--- a/drivers/clocksource/cadence_ttc_timer.c
+++ b/drivers/clocksource/cadence_ttc_timer.c
@@ -67,11 +67,13 @@
   * struct ttc_timer - This definition defines local timer structure
   *
   * @base_addr:Base address of timer
+ * @freq:  Timer input clock frequency
   * @clk:  Associated clock source
   * @clk_rate_change_nbNotifier block for clock rate changes
   */
  struct ttc_timer {
void __iomem *base_addr;
+   unsigned long freq;
struct clk *clk;
struct notifier_block clk_rate_change_nb;
  };
@@ -196,9 +198,8 @@ static void ttc_set_mode(enum clock_event_mode mode,

switch (mode) {
case CLOCK_EVT_MODE_PERIODIC:
-   ttc_set_interval(timer,
-   DIV_ROUND_CLOSEST(clk_get_rate(ttce->ttc.clk),
-   PRESCALE * HZ));
+   ttc_set_interval(timer, DIV_ROUND_CLOSEST(ttce->ttc.freq,
+   PRESCALE * HZ));
break;
case CLOCK_EVT_MODE_ONESHOT:
case CLOCK_EVT_MODE_UNUSED:
@@ -273,6 +274,8 @@ static void __init ttc_setup_clocksource(struct clk *clk, void __iomem *base)
return;
}

+   ttccs->ttc.freq = clk_get_rate(ttccs->ttc.clk);
+
ttccs->ttc.clk_rate_change_nb.notifier_call =
ttc_rate_change_clocksource_cb;
ttccs->ttc.clk_rate_change_nb.next = NULL;
@@ -298,16 +301,14 @@ static void __init ttc_setup_clocksource(struct clk *clk, void __iomem *base)
__raw_writel(CNT_CNTRL_RESET,
 ttccs->ttc.base_addr + TTC_CNT_CNTRL_OFFSET);

-   err = clocksource_register_hz(&ttccs->cs,
-   clk_get_rate(ttccs->ttc.clk) / PRESCALE);
+   err = clocksource_register_hz(&ttccs->cs, ttccs->ttc.freq / PRESCALE);
if (WARN_ON(err)) {
kfree(ttccs);
return;
}

ttc_sched_clock_val_reg = base + TTC_COUNT_VAL_OFFSET;
-   setup_sched_clock(ttc_sched_clock_read, 16,
-   clk_get_rate(ttccs->ttc.clk) / PRESCALE);
+   setup_sched_clock(ttc_sched_clock_read, 16, ttccs->ttc.freq / PRESCALE);
  }

  static int ttc_rate_change_clockevent_cb(struct notifier_block *nb,
@@ -334,6 +335,9 @@ static int ttc_rate_change_clockevent_cb(struct notifier_block *nb,
ndata->new_rate / PRESCALE);
local_irq_restore(flags);

+   /* update cached frequency */
+   ttc->freq = ndata->new_rate;
+
/* fall through */
}
case PRE_RATE_CHANGE:
@@ -367,6 +371,7 @@ static void __init ttc_setup_clockevent(struct clk *clk,
if (clk_notifier_register(ttcce->ttc.clk,
&ttcce->ttc.clk_rate_change_nb))
pr_warn("Unable to register clock notifier.\n");
+   ttcce->ttc.freq = clk_get_rate(ttcce->ttc.clk);

ttcce->ttc.base_addr = base;
ttcce->ce.name = "ttc_clockevent";
@@ -396,7 +401,7 @@ static void __init ttc_setup_clockevent(struct clk *clk,
}

clockevents_config_and_register(&ttcce->ce,
-   clk_get_rate(ttcce->ttc.clk) / PRESCALE, 1, 0xfffe);
+   ttcce->ttc.freq / PRESCALE, 1, 0xfffe);
  }

  /**




--
  Linaro.org │ Open source software for ARM SoCs



Re: [PATCH] ipvs: Remove unused variable ret from sync_thread_master()

2013-11-12 Thread Oleg Nesterov

On 11/12, Peter Zijlstra wrote:
>
> On Tue, Nov 12, 2013 at 02:21:39PM -, David Laight wrote:
> >
> > /* Tell scheduler we are going to sleep... */
> > if (signal_pending(current))
> > /* We don't want waking immediately (again) */
> > sleep_state = TASK_UNINTERRUPTIBLE;
> > else
> > sleep_state = TASK_INTERRUPTIBLE;
> > set_current_state(sleep_state);
>
> If this is for kernel threads, I think you can wipe the pending state;

Yes, unless this kthread does allow_signal(), signal_pending() can't be
true.

Oleg.


Re: [PATCH 11/12] mtd: nand: davinci: don't request AEMIF address range

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 12:12 PM, Khoronzhuk, Ivan wrote:
> The AEMIF driver registers are used to setup timings for each chip
> select. The same registers range is used to setup NAND settings.
> The AEMIF and NAND drivers do not use the same registers in this range.
> 
> In case with AEMIF driver, the memory address range is requested
> already by AEMIF, so we cannot request it twice, just ioremap.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
Acked-by: Santosh Shilimkar 


Re: [RFC][PATCH 0/9] encrypted keys & key control op

2013-11-12 Thread David Howells

Mimi Zohar  wrote:

> > > I'm sure there is/was a good reason for add_key() to do both.
> > 
> > Yes.  No race.
> > 
> > > > But you can't pre-search for the existence of a key and mould the
> > > > payload accordingly because that means you can race against both
> > > > add_key() and keyctl_unlink().
> > > 
> > > Would this still be the case, if you differentiated between
> > > instantiating and updating a key?
> > 
> > Yes.  Imagine, you try to add a key and it gets rejected because the key
> > already exists.  You then try to update the existing key, but that gets
> > rejected because someone unlinked the key in the meantime.  So you try and
> > add it again, but this now fails because someone added a new key.  Repeat.
> 
> A counter example would be two processes, having nothing to do with each
> other, attempt to a create a key with the same name.  Instead of each
> process getting its own key, they land up sharing the same key.
> Not only are they sharing the same key, but neither process knows that
> there is another process using the same key.  I would think this is a
> bigger problem.
> 
> Failing to create/update a key, at least to me, seems safer than having
> two apps trying to create a key with same name, but instead land up
> using the same key.

Yes.  Two keys of the same type with the same description should be able to
substitute for one another and should be able to fulfil the same role.  Safety
should not be an issue.

> > Or add_key() could immediately displace a key someone else just added,
> > leaving them with a key ID that disappeared as soon as it was returned due
> > to an add/add race.
> 
> This is a separate issue.  If a key/keyring exists, a new key/keyring,
> with the same name, should not be created replacing the existing
> key/keyring.  It should simply fail.  (Removing a key/keyring first,
> before creating a key/keyring of the same name, is different.)

If you have a key type that's not "updateable" then you'd have to unlink it
before trying to add a new one.  This would give you a gap in time where the
key does not exist.

So, no, creating a new key with the same name *should* atomically displace an
old one if it exists - if it doesn't update it instead.  Note that keys have an
"under construction" concept so that the core can create a partially formed key
and then instantiate it at its leisure whilst using it to block those that
would like to use it.

David
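
Note on the create-or-update argument above: a minimal single-threaded sketch, assuming a hypothetical fixed-size key store (struct model_keyring and model_add_key() are illustrative names, not the kernel keyring API). It models only the update path for updateable key types, not displacement of a non-updateable key. One entry point either creates or updates, so the caller never has to search first and then choose between two operations that can each lose a race.

```c
#include <string.h>

#define MAX_KEYS 8

/* Hypothetical, simplified key store; not the kernel keyring code. */
struct model_key {
	int  in_use;
	char desc[32];
	char payload[32];
};

struct model_keyring {
	struct model_key keys[MAX_KEYS];
};

/* Bounded copy that always NUL-terminates the destination. */
static void model_copy(char *dst, const char *src, size_t size)
{
	strncpy(dst, src, size - 1);
	dst[size - 1] = '\0';
}

/*
 * Create the key if it is missing, otherwise update it in place.
 * Returns the slot index, or -1 if the ring is full.
 * With concurrent callers the whole function would run under one lock.
 */
static int model_add_key(struct model_keyring *ring, const char *desc,
			 const char *payload)
{
	int i, free_slot = -1;

	for (i = 0; i < MAX_KEYS; i++) {
		struct model_key *k = &ring->keys[i];

		if (k->in_use && strcmp(k->desc, desc) == 0) {
			model_copy(k->payload, payload, sizeof(k->payload));
			return i;
		}
		if (!k->in_use && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1;

	ring->keys[free_slot].in_use = 1;
	model_copy(ring->keys[free_slot].desc, desc,
		   sizeof(ring->keys[free_slot].desc));
	model_copy(ring->keys[free_slot].payload, payload,
		   sizeof(ring->keys[free_slot].payload));
	return free_slot;
}
```

Calling model_add_key() twice with the same description returns the same slot with the newer payload. That is the behaviour the reply argues for, as opposed to failing the second call and leaving the caller to retry.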

Re: [PATCH] ipvs: Remove unused variable ret from sync_thread_master()

2013-11-12 Thread Oleg Nesterov

On 11/12, Peter Zijlstra wrote:
>
> On Tue, Nov 12, 2013 at 02:21:39PM -, David Laight wrote:
> > Shame there isn't a process flag to indicate that the process
> > will sleep uninterruptibly and that it doesn't matter.
> > So don't count to the load average and don't emit a warning
> > if it has been sleeping for a long time.
>
> A process flag wouldn't work, because the task could block waiting for
> actual work to complete in other sleeps.
>
> However, we could do something like the below; which would allow us
> writing things like:
>
>   (void)___wait_event(*sk_sleep(sk),
>   sock_writeable(sk) || kthread_should_stop(),
>   TASK_UNINTERRUPTIBLE | TASK_IDLE, 0, 0,
>   schedule());
>
> Marking the one wait-for-more-work as TASK_IDLE such that it doesn't
> contribute to the load avg.

Agreed, I thought about additional bit too.

>  static const char * const task_state_array[] = {
> - "R (running)",  /*   0 */
> - "S (sleeping)", /*   1 */
> - "D (disk sleep)",   /*   2 */
> - "T (stopped)",  /*   4 */
> - "t (tracing stop)", /*   8 */
> - "Z (zombie)",   /*  16 */
> - "X (dead)", /*  32 */
> - "x (dead)", /*  64 */
> - "K (wakekill)", /* 128 */
> - "W (waking)",   /* 256 */
> - "P (parked)",   /* 512 */
> + "R (running)",  /*0 */
> + "S (sleeping)", /*1 */
> + "D (disk sleep)",   /*2 */
> + "T (stopped)",  /*4 */
> + "t (tracing stop)", /*8 */
> + "Z (zombie)",   /*   16 */
> + "X (dead)", /*   32 */
> + "x (dead)", /*   64 */
> + "K (wakekill)", /*  128 */
> + "W (waking)",   /*  256 */
> + "P (parked)",   /*  512 */
> + "I (idle)", /* 1024 */
>  };

but I am not sure about what /proc/ should report in this case...

>  #define task_contributes_to_load(task)   \
>   ((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
> -  (task->flags & PF_FROZEN) == 0)
> +  (task->flags & PF_FROZEN) == 0 && \
> +  (task->state & TASK_IDLE) == 0)

perhaps

	(task->state & (TASK_UNINTERRUPTIBLE | TASK_IDLE)) == TASK_UNINTERRUPTIBLE

can save an insn.
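
A quick way to see that the single mask-and-compare is equivalent to the two separate tests: a user-space sketch, assuming the bit values from the state table quoted above (2 for TASK_UNINTERRUPTIBLE, 1024 for the proposed TASK_IDLE) and leaving the PF_FROZEN check out. The function names are made up for illustration.

```c
#include <stdbool.h>

/* Values from the quoted state table; TASK_IDLE is the proposed new bit. */
#define TASK_UNINTERRUPTIBLE 2u
#define TASK_IDLE            1024u

/* Two separate tests, as in the quoted macro (PF_FROZEN part omitted). */
static bool contributes_two_tests(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) != 0 &&
	       (state & TASK_IDLE) == 0;
}

/* Single mask-and-compare, as suggested above. */
static bool contributes_one_test(unsigned int state)
{
	return (state & (TASK_UNINTERRUPTIBLE | TASK_IDLE)) ==
	       TASK_UNINTERRUPTIBLE;
}

/* Compare both forms for every state value below 2048. */
static unsigned int count_mismatches(void)
{
	unsigned int state, bad = 0;

	for (state = 0; state < 2048u; state++)
		if (contributes_two_tests(state) != contributes_one_test(state))
			bad++;
	return bad;
}
```

Both forms agree on every combination of the listed state bits, so the rewrite changes only the instruction count, not the result.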

I am also wondering if it makes any sense to turn PF_FROZEN into
TASK_FROZEN, something like the (incomplete, probably racy) patch below.
Note that it actually adds the new state, not the qualifier.

Oleg.

--- x/include/linux/freezer.h
+++ x/include/linux/freezer.h
@@ -23,7 +23,7 @@ extern unsigned int freeze_timeout_msecs
  */
 static inline bool frozen(struct task_struct *p)
 {
-   return p->flags & PF_FROZEN;
+   return p->state & TASK_FROZEN;
 }
 
 extern bool freezing_slow_path(struct task_struct *p);
--- x/kernel/freezer.c
+++ x/kernel/freezer.c
@@ -57,16 +57,13 @@ bool __refrigerator(bool check_kthr_stop
pr_debug("%s entered refrigerator\n", current->comm);
 
for (;;) {
-   set_current_state(TASK_UNINTERRUPTIBLE);
-
spin_lock_irq(&freezer_lock);
-   current->flags |= PF_FROZEN;
-   if (!freezing(current) ||
-   (check_kthr_stop && kthread_should_stop()))
-   current->flags &= ~PF_FROZEN;
+   if (freezing(current) &&
+   !(check_kthr_stop && kthread_should_stop()))
+   set_current_state(TASK_FROZEN);
spin_unlock_irq(&freezer_lock);
 
-   if (!(current->flags & PF_FROZEN))
+   if (!(current->state & TASK_FROZEN))
break;
was_frozen = true;
schedule();
@@ -148,8 +145,7 @@ void __thaw_task(struct task_struct *p)
 * refrigerator.
 */
spin_lock_irqsave(&freezer_lock, flags);
-   if (frozen(p))
-   wake_up_process(p);
+   try_to_wake_up(p, TASK_FROZEN, 0);
spin_unlock_irqrestore(&freezer_lock, flags);
 }
 


Re: bcache: process get stucks when doing write IOs in writeback mode

2013-11-12 Thread Francis Moreau

Hello,

It doesn't seem my initial post reached LKML, maybe that's due to the
dmesg file I initially attached. So I'm replying to this hoping that
this is going to be fixed (since the attached file is gone).

On Mon, Nov 11, 2013 at 6:45 PM, Francis Moreau  wrote:
> Hello,
>
> [ Resending this issue to LKML to reach a wider audience since I've
> got no answer so far on bcache mailing list and it seems a pretty
> major bug in that component ]
>
> I'm using bcache on a very basic setup: no MD or LVM involved.
> /dev/sda4 (900Mo) is the backing device while /dev/sdb (120G) is the
> cache device. On top of bcache0 I'm using ext4 and I'm using it as my
> root device.
>
> I initially created the bcache0 device with the default writethrough
> mode. I haven't (yet) experienced any issues using this mode: I
> successfully installed my system (archlinux) on it.
>
> I decided to switch to writeback mode and encounter several times the
> same issue: after doing a lot of IOs (for example when installing new
> packages) one process is stuck in D state. Currently I can see this:
>
> # ps aux | grep D+
> root  1080  0.0  0.0  41796  5728 pts/0    D+   12:59   0:00 gtk-update-icon
>
> # cat /proc/1080/stack
> [] sleep_on_page+0xe/0x20
> [] wait_on_page_bit+0x7f/0x90
> [] filemap_fdatawait_range+0x11b/0x1a0
> [] filemap_write_and_wait_range+0x3f/0x70
> [] ext4_sync_file+0xba/0x390 [ext4]
> [] do_fsync+0x56/0x80
> [] SyS_fsync+0x10/0x20
> [] system_call_fastpath+0x1a/0x1f
> [] 0x
>
> From that point I'm not really sure what I should do to restore the
> system without losing or badly breaking my rootfs. Any advice is
> welcome.
>
> Please find below some additional information that might help to fix
> this issue:
>
> # mount | grep bcache
> /dev/bcache0 on / type ext4 (rw,relatime,data=ordered)
>
> # uname -r
> 3.11.6-1-ARCH
>
> # bcache-super-show /dev/sda4
> sb.magic                 ok
> sb.first_sector          8 [match]
> sb.csum                  F828E134D5AB890C [match]
> sb.version               1 [backing device]
>
> dev.label                (empty)
> dev.uuid                 62839366-e5a9-43a9-9984-fc8f2aefe9de
> dev.sectors_per_block    1
> dev.sectors_per_bucket   1024
> dev.data.first_sector    16
> dev.data.cache_mode      1 [writeback]
> dev.data.cache_state     2 [dirty]
>
> cset.uuid                50485be4-15f7-424f-a01b-4c65fdf8487d
>
> # bcache-super-show /dev/sdb
> sb.magic                 ok
> sb.first_sector          8 [match]
> sb.csum                  692BB25984E31571 [match]
> sb.version               3 [cache device]
>
> dev.label                (empty)
> dev.uuid                 a63ec68a-6a71-497e-86db-0dd71bbfb404
> dev.sectors_per_block    1
> dev.sectors_per_bucket   1024
> dev.cache.first_sector   1024
> dev.cache.cache_sectors  234439680
> dev.cache.total_sectors  234440704
> dev.cache.ordered        yes
> dev.cache.discard        yes
> dev.cache.pos            0
> dev.cache.replacement    0 [lru]
>
> cset.uuid                50485be4-15f7-424f-a01b-4c65fdf8487d
>
> I attached dmesg output which has been generated after doing
> "echo t >/proc/sysrq-trigger"
>
> Thanks
> --
> Francis
>
>
> --
> Francis



-- 
Francis

Re: [PATCH 12/12] arm: dts: keystone: add AEMIF/NAND device entry

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 12:13 PM, Khoronzhuk, Ivan wrote:
> Add AEMIF/NAND device entry.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
>  arch/arm/boot/dts/keystone.dts |   63 
> 
>  1 file changed, 63 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/keystone.dts b/arch/arm/boot/dts/keystone.dts
> index 100bdf5..998da98 100644
> --- a/arch/arm/boot/dts/keystone.dts
> +++ b/arch/arm/boot/dts/keystone.dts
> @@ -179,5 +179,68 @@
> interrupts = ;
> clocks = <>;
> };
> +
> +   aemif@021000A00 {
> +   compatible = "ti,keystone-aemif";
> +   #address-cells = <2>;
> +   #size-cells = <1>;
> +   clocks = <>;
> +   clock-names = "aemif";
> +   clock-ranges;
> +
> +   reg = <0x2100A00 0x0100>;
> +   ranges = <0 0 0x3000 0x1000
> + 1 0 0x21000A00 0x0100>;
> +
> +   nand:cs0 {
> +   compatible = "ti,davinci-cs";
> +   #address-cells = <2>;
> +   #size-cells = <1>;
> +   clock-ranges;
> +   ranges;
> +
> +   /* all timings in nanoseconds */
> +   ti,davinci-cs-ta = <12>;
> +   ti,davinci-cs-rhold = <6>;
> +   ti,davinci-cs-rstrobe = <23>;
> +   ti,davinci-cs-rsetup = <9>;
> +   ti,davinci-cs-whold = <8>;
> +   ti,davinci-cs-wstrobe = <23>;
> +   ti,davinci-cs-wsetup = <8>;
> +
> +   nand@0,0 {
> +   compatible = "ti,keystone-nand";
> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +   reg = <0 0 0x400
> +  1 0 0x100>;
> +
> +   ti,davinci-chipselect = <0>;
> +   ti,davinci-mask-ale = <0x2000>;
> +   ti,davinci-mask-cle = <0x4000>;
> +   ti,davinci-mask-chipsel = <0>;
> +   nand-ecc-mode = "hw";
> +   ti,davinci-ecc-bits = <4>;
> +   nand-on-flash-bbt;
> +
> +   partition@0 {
> +   label = "u-boot";
> +   reg = <0x0 0x10>;
> +   read-only;
> +   };
> +
> +   partition@10 {
> +   label = "params";
> +   reg = <0x10 0x8>;
> +   read-only;
> +   };
> +
> +   partition@18 {
> +   label = "ubifs";
> +   reg = <0x18 0x7E8>;
> +   };
Let's now create a board-specific k2hk-evm.dts, rename keystone.dts
to keystone.dtsi, and include that in the board files. That way you can add the
board-specific stuff in board files and common SoC stuff in keystone.dtsi.

You can update that once the bindings are blessed by DT folks.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[GIT PULL] mpc85xx_edac changes for 3.13

2013-11-12 Thread Johannes Thumshirn

Hi Linus,

Please pull these changes for mpc85xx_edac. They have been around on linux-edac
for a while.

Thanks.

--
The following changes since commit 10d0c9705e80bbd3d587c5fad24599aabaca6688:

  Merge tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux (2013-11-12 16:52:17 +0900)

are available in the git repository at:


  git://github.com/morbidrsa/linux.git tags/mpc85xx-edac-for-3.13

for you to fetch changes up to 43b5acb650587ec6495c2e3b8c2e30311e9dc8cf:

  edac/85xx: Remove mpc85xx_pci_err_remove (2013-11-12 17:06:16 +0100)


mpc85xx_edac for 3.13


Chunhe Lan (1):
  edac/85xx: Add PCIe error interrupt edac support

Johannes Thumshirn (2):
  MAINTAINERS: Add edac-mpc85xx driver to MAINTAINERS
  edac/85xx: Remove mpc85xx_pci_err_remove

 MAINTAINERS |   7 +++
 drivers/edac/mpc85xx_edac.c | 120 
 drivers/edac/mpc85xx_edac.h |   7 +++
 3 files changed, 101 insertions(+), 33 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 051e4dc..3391279 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3129,6 +3129,13 @@ W:   bluesmoke.sourceforge.net
 S: Maintained
 F: drivers/edac/i82975x_edac.c

+EDAC-MPC85XX
+M: Johannes Thumshirn 
+L: linux-e...@vger.kernel.org
+W: bluesmoke.sourceforge.net
+S: Maintained
+F: drivers/edac/mpc85xx_edac.[ch]
+
 EDAC-PASEMI
 M: Egor Martovetsky 
 L: linux-e...@vger.kernel.org
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 3eb32f6..8f91821 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -1,6 +1,8 @@
 /*
  * Freescale MPC85xx Memory Controller kenel module
  *
+ * Parts Copyrighted (c) 2013 by Freescale Semiconductor, Inc.
+ *
  * Author: Dave Jiang 
  *
  * 2006-2007 (c) MontaVista Software, Inc. This file is licensed under
@@ -196,6 +198,42 @@ static void mpc85xx_pci_check(struct edac_pci_ctl_info *pci)
edac_pci_handle_npe(pci, pci->ctl_name);
 }

+static void mpc85xx_pcie_check(struct edac_pci_ctl_info *pci)
+{
+   struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
+   u32 err_detect;
+
+   err_detect = in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR);
+
+   pr_err("PCIe error(s) detected\n");
+   pr_err("PCIe ERR_DR register: 0x%08x\n", err_detect);
+   pr_err("PCIe ERR_CAP_STAT register: 0x%08x\n",
+   in_be32(pdata->pci_vbase + MPC85XX_PCI_GAS_TIMR));
+   pr_err("PCIe ERR_CAP_R0 register: 0x%08x\n",
+   in_be32(pdata->pci_vbase + MPC85XX_PCIE_ERR_CAP_R0));
+   pr_err("PCIe ERR_CAP_R1 register: 0x%08x\n",
+   in_be32(pdata->pci_vbase + MPC85XX_PCIE_ERR_CAP_R1));
+   pr_err("PCIe ERR_CAP_R2 register: 0x%08x\n",
+   in_be32(pdata->pci_vbase + MPC85XX_PCIE_ERR_CAP_R2));
+   pr_err("PCIe ERR_CAP_R3 register: 0x%08x\n",
+   in_be32(pdata->pci_vbase + MPC85XX_PCIE_ERR_CAP_R3));
+
+   /* clear error bits */
+   out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, err_detect);
+}
+
+static int mpc85xx_pcie_find_capability(struct device_node *np)
+{
+   struct pci_controller *hose;
+
+   if (!np)
+   return -EINVAL;
+
+   hose = pci_find_hose_for_OF_device(np);
+
+   return early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP);
+}
+
 static irqreturn_t mpc85xx_pci_isr(int irq, void *dev_id)
 {
struct edac_pci_ctl_info *pci = dev_id;
@@ -207,7 +245,10 @@ static irqreturn_t mpc85xx_pci_isr(int irq, void *dev_id)
if (!err_detect)
return IRQ_NONE;

-   mpc85xx_pci_check(pci);
+   if (pdata->is_pcie)
+   mpc85xx_pcie_check(pci);
+   else
+   mpc85xx_pci_check(pci);

return IRQ_HANDLED;
 }
@@ -239,14 +280,22 @@ int mpc85xx_pci_err_probe(struct platform_device *op)
pdata = pci->pvt_info;
pdata->name = "mpc85xx_pci_err";
pdata->irq = NO_IRQ;
+
+   if (mpc85xx_pcie_find_capability(op->dev.of_node) > 0)
+   pdata->is_pcie = true;
+
dev_set_drvdata(>dev, pci);
pci->dev = >dev;
pci->mod_name = EDAC_MOD_STR;
pci->ctl_name = pdata->name;
pci->dev_name = dev_name(>dev);

-   if (edac_op_state == EDAC_OPSTATE_POLL)
-   pci->edac_check = mpc85xx_pci_check;
+   if (edac_op_state == EDAC_OPSTATE_POLL) {
+   if (pdata->is_pcie)
+   pci->edac_check = mpc85xx_pcie_check;
+   else
+   pci->edac_check = mpc85xx_pci_check;
+   }

pdata->edac_idx = edac_pci_idx++;

@@ -275,16 +324,26 @@ int mpc85xx_pci_err_probe(struct platform_device *op)
goto err;
}

-   orig_pci_err_cap_dr =
-   in_be32(pdata->pci_vbase +

Re: [PATCH 10/12] mtd: nand: davinci: don't set timings if AEMIF is used

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 12:10 PM, Khoronzhuk, Ivan wrote:
> If Davinci AEMIF is used we don't need to set timings and bus width.
> It is done by the AEMIF driver (drivers/memory/davinci-aemif.c).
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
>  drivers/mtd/nand/davinci_nand.c |   22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
> index 4705214..879e915 100644
> --- a/drivers/mtd/nand/davinci_nand.c
> +++ b/drivers/mtd/nand/davinci_nand.c
> @@ -742,27 +742,35 @@ static int __init nand_davinci_probe(struct platform_device *pdev)
> goto err_clk_enable;
> }
> 
> +#if !IS_ENABLED(CONFIG_TI_DAVINCI_AEMIF)
>
Instead of the #if above, just use a variable, e.g.
bool aemif = IS_ENABLED(CONFIG_TI_DAVINCI_AEMIF), and then skip
the code below when it is set. An #if block in the middle of the code looks ugly.
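
A minimal sketch of that suggestion, written as a userspace stand-in so it
compiles on its own. The kernel provides IS_ENABLED() in <linux/kconfig.h>;
the simplified macro below only mimics it for options defined to 1 or left
undefined. The struct nand_sketch, nand_probe_sketch() and the locally
defined CONFIG_TI_DAVINCI_AEMIF value are assumptions for illustration, not
code from the patch.

```c
#include <stdbool.h>

/*
 * Local stand-in for the kernel's IS_ENABLED(). Simplified assumption:
 * an option is either defined to 1 or not defined at all (no =m case).
 */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_ENABLED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_ENABLED_2(val) IS_ENABLED_3(ARG_PLACEHOLDER_##val)
#define IS_ENABLED(option) IS_ENABLED_2(option)

/* Assumption for this sketch: the AEMIF driver is built in. */
#define CONFIG_TI_DAVINCI_AEMIF 1

/* Hypothetical device state, only here to make the effect observable. */
struct nand_sketch {
	int timings_programmed;
};

static int nand_probe_sketch(struct nand_sketch *nand)
{
	/* Evaluated at compile time, but used like an ordinary variable. */
	const bool aemif = IS_ENABLED(CONFIG_TI_DAVINCI_AEMIF);

	if (!aemif) {
		/* Timing and bus-width setup would go here. */
		nand->timings_programmed = 1;
	}

	return 0;
}
```

Both branches stay visible to the compiler, so the skipped code is still
type-checked even when the option is enabled, which an #if block would not
give you.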

Other than that patch looks fine to me.

Regards,
Santosh


[PATCH] ALSA: pcm: retrieve true appl_ptr where stream was unlocked

2013-11-12 Thread Oskar Schirmer

Calculated for the data transfer, the local variable appl_ptr is reused,
changed and written back later, even though the lock was not held during
the earlier transfer. Admittedly, the fate of the lock is not obvious
at all, but this looks very much like a race condition candidate, so make
sure to work on the true, freshly retrieved appl_ptr value.

Two occurrences.

Signed-off-by: Oskar Schirmer 
---
 sound/core/pcm_lib.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
index 6e03b46..47e836a 100644
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -2051,6 +2051,7 @@ static snd_pcm_sframes_t snd_pcm_lib_write1(struct snd_pcm_substream *substream,
default:
break;
}
+   appl_ptr = runtime->control->appl_ptr;
appl_ptr += frames;
if (appl_ptr >= runtime->boundary)
appl_ptr -= runtime->boundary;
@@ -2283,6 +2284,7 @@ static snd_pcm_sframes_t snd_pcm_lib_read1(struct snd_pcm_substream *substream,
default:
break;
}
+   appl_ptr = runtime->control->appl_ptr;
appl_ptr += frames;
if (appl_ptr >= runtime->boundary)
appl_ptr -= runtime->boundary;
-- 
1.7.9.5
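
For readers following along, here is a small userspace sketch of the pattern
the patch is about: re-read the shared pointer after the unlocked section,
then advance it with wrap-around at the boundary. The types and function
names (frames_t, struct control_sketch, commit_transfer) are stand-ins made
up for this illustration, and the sketch assumes appl_ptr < boundary and
frames <= boundary so the addition cannot overflow.

```c
/* Stand-in for snd_pcm_uframes_t; an assumption of this sketch. */
typedef unsigned long frames_t;

/* Hypothetical stand-in for the shared control block. */
struct control_sketch {
	frames_t appl_ptr;
};

/* Advance appl_ptr by frames, wrapping around at boundary. */
static frames_t advance_appl_ptr(frames_t appl_ptr, frames_t frames,
				 frames_t boundary)
{
	appl_ptr += frames;
	if (appl_ptr >= boundary)
		appl_ptr -= boundary;
	return appl_ptr;
}

/*
 * Called after a transfer that ran with the lock dropped: start from the
 * current shared value instead of a copy taken before unlocking.
 */
static void commit_transfer(struct control_sketch *ctl, frames_t frames,
			    frames_t boundary)
{
	frames_t appl_ptr = ctl->appl_ptr;	/* fresh read, as in the patch */

	ctl->appl_ptr = advance_appl_ptr(appl_ptr, frames, boundary);
}
```

The sketch is single-threaded and leaves out the locking itself; it only
shows which value the update starts from.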


Re: [PATCH 09/12] mtd: nand: davinci: reuse driver for Keystone arch

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 12:09 PM, Khoronzhuk, Ivan wrote:
> The Keystone arch has a compatible NAND device, so reuse the driver.
> On Keystone it depends on TI_DAVINCI_AEMIF because the AEMIF
> driver is responsible for setting the timings.
> 
> See http://www.ti.com/lit/ug/sprugz3a/sprugz3a.pdf
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
Acked-by: Santosh Shilimkar 


Re: [PATCH 07/12] memory: davinci-aemif: introduce AEMIF driver

2013-11-12 Thread Santosh Shilimkar

+ Greg KH (drivers/memory/* patches go through his queue)

On Monday 11 November 2013 12:06 PM, Khoronzhuk, Ivan wrote:
> Add new AEMIF driver for EMIF16 davinci controller. The EMIF16 module
> is intended to provide a glue-less interface to a variety of
> asynchronous memory devices like ASRAM, NOR and NAND memory. A total
> of 256M bytes of any of these memories can be accessed at any given
> time via four chip selects with 64M byte access per chip select.
> 
> Synchronous memories such as DDR1 SDRAM, SDR SDRAM and Mobile SDR
> are not supported.
> 
> See http://www.ti.com/lit/ug/sprugz3a/sprugz3a.pdf
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
>  drivers/memory/Kconfig |   11 ++
>  drivers/memory/Makefile|1 +
>  drivers/memory/davinci-aemif.c |  415 
> 
>  3 files changed, 427 insertions(+)
>  create mode 100644 drivers/memory/davinci-aemif.c
> 
> diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
> index 29a11db..010e75e 100644
> --- a/drivers/memory/Kconfig
> +++ b/drivers/memory/Kconfig
> @@ -7,6 +7,17 @@ menuconfig MEMORY
> 
>  if MEMORY
> 
> +config TI_DAVINCI_AEMIF
s/TI_DAVINCI_AEMIF/TI_AEMIF

> +   bool "Texas Instruments DaVinci AEMIF driver"
Drop DaVinci above since it's used on more SoCs.
> +   depends on (ARCH_DAVINCI || ARCH_KEYSTONE) && OF
> +   help
> + This driver is for the AEMIF module available in Texas Instruments
> + SoCs. AEMIF stands for Asynchronous External Memory Interface and
> + is intended to provide a glue-less interface to a variety of
> + asynchronous memory devices like ASRAM, NOR and NAND memory. A total
> + of 256M bytes of any of these memories can be accessed at a given
> + time via four chip selects with 64M byte access per chip select.
> +
>  config TI_EMIF
> tristate "Texas Instruments EMIF driver"
> depends on ARCH_OMAP2PLUS
> diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
> index 969d923..af14126 100644
> --- a/drivers/memory/Makefile
> +++ b/drivers/memory/Makefile
> @@ -5,6 +5,7 @@
>  ifeq ($(CONFIG_DDR),y)
>  obj-$(CONFIG_OF)   += of_memory.o
>  endif
> +obj-$(CONFIG_TI_DAVINCI_AEMIF) += davinci-aemif.o

Change this accordingly once the config is renamed.

>  obj-$(CONFIG_TI_EMIF)  += emif.o
>  obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
>  obj-$(CONFIG_TEGRA20_MC)   += tegra20-mc.o
> diff --git a/drivers/memory/davinci-aemif.c b/drivers/memory/davinci-aemif.c
> new file mode 100644
> index 000..e36b74b
> --- /dev/null
> +++ b/drivers/memory/davinci-aemif.c
> @@ -0,0 +1,415 @@
> +/*
> + * DaVinci/Keystone AEMIF driver
s/{DaVinci/Keystone}/TI
> + *
> + * Copyright (C) 2010 - 2013 Texas Instruments Incorporated. http://www.ti.com/
> + * Copyright (C) Heiko Schocher 
> + * Copyright (C) Ivan Khoronzhuk 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define TA_SHIFT   2
> +#define RHOLD_SHIFT   4
> +#define RSTROBE_SHIFT  7
> +#define RSETUP_SHIFT   13
> +#define WHOLD_SHIFT   17
> +#define WSTROBE_SHIFT  20
> +#define WSETUP_SHIFT   26
> +#define EW_SHIFT   30
> +#define SS_SHIFT   31
> +
> +#define TA(x)  ((x) << TA_SHIFT)
> +#define RHOLD(x)   ((x) << RHOLD_SHIFT)
> +#define RSTROBE(x) ((x) << RSTROBE_SHIFT)
> +#define RSETUP(x)  ((x) << RSETUP_SHIFT)
> +#define WHOLD(x)   ((x) << WHOLD_SHIFT)
> +#define WSTROBE(x) ((x) << WSTROBE_SHIFT)
> +#define WSETUP(x)  ((x) << WSETUP_SHIFT)
> +#define EW(x)  ((x) << EW_SHIFT)
> +#define SS(x)  ((x) << SS_SHIFT)
> +
> +#define ASIZE_MAX  0x1
> +#define TA_MAX 0x3
> +#define RHOLD_MAX  0x7
> +#define RSTROBE_MAX   0x3f
> +#define RSETUP_MAX 0xf
> +#define WHOLD_MAX  0x7
> +#define WSTROBE_MAX   0x3f
> +#define WSETUP_MAX 0xf
> +#define EW_MAX 0x1
> +#define SS_MAX 0x1
> +#define NUM_CS 4
> +
> +#define TA_VAL(x)  (((x) & TA(TA_MAX)) >> TA_SHIFT)
> +#define RHOLD_VAL(x)   (((x) & RHOLD(RHOLD_MAX)) >> RHOLD_SHIFT)
> +#define RSTROBE_VAL(x) (((x) & RSTROBE(RSTROBE_MAX)) >> RSTROBE_SHIFT)
> +#define RSETUP_VAL(x)  (((x) & RSETUP(RSETUP_MAX)) >> RSETUP_SHIFT)
> +#define WHOLD_VAL(x)   (((x) & WHOLD(WHOLD_MAX)) >> WHOLD_SHIFT)
> +#define WSTROBE_VAL(x) (((x) & WSTROBE(WSTROBE_MAX)) >> WSTROBE_SHIFT)
> +#define WSETUP_VAL(x)  (((x) & WSETUP(WSETUP_MAX)) >> WSETUP_SHIFT)
> +#define EW_VAL(x)  (((x) & EW(EW_MAX)) >> EW_SHIFT)
> +#define SS_VAL(x)  (((x) & SS(SS_MAX)) >> SS_SHIFT)
> +
> +#define NRCSR_OFFSET   0x00
> +#define AWCCR_OFFSET   0x04
> +#define A1CR_OFFSET   0x10
> +
> +#define ACR_ASIZE_MASK 0x3
> +#define ACR_EW_MASK

Re: memcg creates an unkillable task in 3.11-rc2

2013-11-12 Thread Michal Hocko

On Thu 26-09-13 16:41:19, Fabio Kung wrote:
> On Tue, Jul 30, 2013 at 9:28 AM, Eric W. Biederman
>  wrote:
> >
> > ebied...@xmission.com (Eric W. Biederman) writes:
> >
> > Ok.  I have been trying for an hour and I have not been able to
> > reproduce the weird hang with the memcg, and it used to be something I
> > could reproduce trivially.  So it appears the patch below is the fix.
> >
> > After I sleep I will see if I can turn it into a proper patch.
> 
> 
> Contributing with another data point: I am seeing similar issues with
> un-killable tasks inside LXC containers on a vanilla 3.8.11 kernel.
> The stack from zombie tasks looks like this:
> 
> # cat /proc/12499/stack
> [] __mem_cgroup_try_charge+0xa96/0xbf0
> [] __mem_cgroup_try_charge_swapin+0xab/0xd0
> [] mem_cgroup_try_charge_swapin+0x5d/0x70
> [] handle_pte_fault+0x315/0xac0
> [] handle_mm_fault+0x271/0x3d0
> [] __do_page_fault+0x20b/0x4c0
> [] do_page_fault+0xe/0x10
> [] page_fault+0x28/0x30
> [] mm_release+0x127/0x140
> [] do_exit+0x171/0xa70
> [] do_group_exit+0x55/0xd0
> [] get_signal_to_deliver+0x23f/0x5d0
> [] do_signal+0x42/0x600
> [] do_notify_resume+0x88/0xc0
> [] int_signal+0x12/0x17
> [] 0x
> 
> Same symptoms that Eric described: a race condition in memcg when
> there is a page fault and the process is exiting.
> 
> I went ahead and reproduced the bug described earlier here on the same
> 3.8.11 kernel, also using the Mesos framework
> (http://mesos.apache.org/) memory Ballooning tests. The call trace
> from zombie tasks in this case looks very similar:
> 
> # cat /proc/22827/stack
> [] __mem_cgroup_try_charge+0xaf0/0xbf0
> [] __mem_cgroup_try_charge_swapin+0xab/0xd0
> [] mem_cgroup_try_charge_swapin+0x5d/0x70
> [] handle_pte_fault+0x315/0xac0
> [] handle_mm_fault+0x271/0x3d0
> [] __do_page_fault+0x20b/0x4c0
> [] do_page_fault+0xe/0x10
> [] page_fault+0x28/0x30
> [] mm_release+0x127/0x140
> [] do_exit+0x171/0xa70
> [] do_group_exit+0x55/0xd0
> [] get_signal_to_deliver+0x23f/0x5d0
> [] do_signal+0x42/0x600
> [] do_notify_resume+0x88/0xc0
> [] int_signal+0x12/0x17
> [] 0x
> 
> Then, I applied Eric's patch below, and I can't reproduce the problem
> anymore. Before the patch, it was very easy to reproduce it with some
> extra memory pressure from other processes in the instance (increasing
> the probability of page faults when processes are exiting).

Could you try to reproduce with the patch posted earlier in the thread,
please? https://lkml.org/lkml/2013/7/31/94

Eric had some concerns about the patch (https://lkml.org/lkml/2013/7/31/603)
but I wasn't quite sure whether the issue he raised exists. As I tried
to explain in the follow-up answer, the race shouldn't exist, and the
thread basically died at that point.

The memcg handling was reworked considerably since then by Johannes
(merged in 3.12), and it has moved outside of the memcg charging path.
I still think that the rework hasn't fixed this particular bug and we
still need a fix. I would prefer that we simply set TIF_MEMDIE after
we wake up from the sleep.

> We also tried a vanilla 3.11.1 kernel, and we could reproduce the bug
> on it pretty easily.
-- 
Michal Hocko
SUSE Labs

Re: [PATCH] usb: phy: remove dead code

2013-11-12 Thread Michal Nazarewicz

Commit [4d175f34: usb: phy: nop: Defer clock prepare until PHY init]
removed the goto that led to the “return ret” at the end of the function,
thus removing the only way that statement could be reached and
rendering it dead code.  This commit cleans up by removing said
dead code.

Signed-off-by: Michal Nazarewicz 
---
 drivers/usb/phy/phy-am335x.c  | 2 --
 drivers/usb/phy/phy-generic.c | 2 --
 2 files changed, 4 deletions(-)

On Tue, Nov 12 2013, Felipe Balbi wrote:
> no SoB, cannot apply.

Sorry about that.

> I already had this patch in my tree but didn't send it.  I'm fine with
> using yours but I need SoB and commit log.

I don't mind either way as long as the code gets deleted. ;)

diff --git a/drivers/usb/phy/phy-am335x.c b/drivers/usb/phy/phy-am335x.c
index 6370e50..48d41ab 100644
--- a/drivers/usb/phy/phy-am335x.c
+++ b/drivers/usb/phy/phy-am335x.c
@@ -66,8 +66,6 @@ static int am335x_phy_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, am_phy);

return 0;
-
-   return ret;
 }

 static int am335x_phy_remove(struct platform_device *pdev)
diff --git a/drivers/usb/phy/phy-generic.c b/drivers/usb/phy/phy-generic.c
index fce3a9e..db4fc22 100644
--- a/drivers/usb/phy/phy-generic.c
+++ b/drivers/usb/phy/phy-generic.c
@@ -271,8 +271,6 @@ static int usb_phy_gen_xceiv_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, nop);

return 0;
-
-   return err;
 }

 static int usb_phy_gen_xceiv_remove(struct platform_device *pdev)
--
1.8.3.2

Re: [PATCH 05/12] mtd: nand: davinci: extend description of bindings

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 11:58 AM, Khoronzhuk, Ivan wrote:
> Extend bindings for davinci_nand driver to be more clear.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
Looks fine to me, but it needs a blessing from the DT guys.



Re: [PATCH] regulator: s5m8767: Disable OVCB in probe

2013-11-12 Thread Lee Jones

On Tue, 12 Nov 2013, Krzysztof Kozlowski wrote:

> According to SW Guide the Over-Voltage Clamp may malfunction at VBatt
> 5.25V and 110'C temperature. This may result in overshooting or
> undershooting LDO's voltage outputs.
> Disable the Over-Voltage Clamp in probe by updating proper bit in all
> LDO registers.
> 
> The patch uses sec_bulk_read/write() API with reordered buf and count
> parameters so it depends on:
>   "mfd: sec: reorder params in API for regmap consistency"
>   http://www.spinics.net/lists/kernel/msg1632519.html
> 
> Signed-off-by: Krzysztof Kozlowski 
> Signed-off-by: Kyungmin Park 
> ---
>  drivers/regulator/s5m8767.c |   26 ++
>  include/linux/mfd/samsung/s5m8767.h |1 +
>  2 files changed, 27 insertions(+)

For the MFD change:
  Acked-by: Lee Jones 

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [PATCH 00/11] Consolidate asm/fixmap.h files

2013-11-12 Thread Mark Salter

On Tue, 2013-11-12 at 16:39 +0100, Michal Simek wrote:
> On 11/12/2013 02:22 PM, Mark Salter wrote:
> > 
> >  arch/arm/include/asm/fixmap.h|  25 ++--
> >  arch/hexagon/include/asm/fixmap.h|  40 +
> >  arch/metag/include/asm/fixmap.h  |  32 +--
> >  arch/microblaze/include/asm/fixmap.h |  44 +-
> >  arch/mips/include/asm/fixmap.h   |  33 +--
> >  arch/powerpc/include/asm/fixmap.h|  44 +-
> >  arch/sh/include/asm/fixmap.h |  39 +
> >  arch/tile/include/asm/fixmap.h   |  33 +--
> >  arch/um/include/asm/fixmap.h |  40 +
> >  arch/x86/include/asm/fixmap.h|  59 +--
> >  include/asm-generic/fixmap.h | 107 +++
> >  11 files changed, 125 insertions(+), 371 deletions(-)
> >  create mode 100644 include/asm-generic/fixmap.h
> 
> Any repo/branch with all these patches will be helpful.

https://github.com/mosalter/linux (fixmap branch)



Re: [PATCH 04/12] mtd: nand: davinci: move bindings under mtd

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 11:55 AM, Khoronzhuk, Ivan wrote:
> Move bindings under mtd. Do this in order to make davinci-nand
> driver usable by keystone architecture.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
Acked-by: Santosh Shilimkar 


Re: [PATCH 03/12] mtd: nand: davinci: simplify error handling

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 11:54 AM, Khoronzhuk, Ivan wrote:
> There is no need to use a lot of label names for error handling.
> It complicates code maintenance and reading.
> 
This is not always true, but looking at the patch, the labels
are just useless since there is no special handling per label.

> Signed-off-by: Ivan Khoronzhuk 
> ---
Acked-by: Santosh Shilimkar 



Re: [PATCHv2 2/2] check quirk to pad epout buf size when not aligned to maxpacketsize

2013-11-12 Thread Alan Stern

On Mon, 11 Nov 2013, David Cohen wrote:

> Hi Alan, Michal,
> 
> On 11/11/2013 01:09 PM, Michal Nazarewicz wrote:
> > On Mon, Nov 11 2013, Alan Stern wrote:
> >> On Mon, 11 Nov 2013, Michal Nazarewicz wrote:
> >>
> >>> Check gadget.quirk_ep_out_aligned_size to decide if buffer size requires
> >>> to be aligned to maxpacketsize of an out endpoint.  ffs_epfile_io() needs
> >>> to pad epout buffer to match above condition if quirk is found.
> >>>
> >>> Signed-off-by: Michal Nazarewicz 
> >>
> >> I think this is still wrong.
> >>
> >>> @@ -824,7 +832,7 @@ static ssize_t ffs_epfile_io(struct file *file,
> >>>   req->context  = 
> >>>   req->complete = ffs_epfile_io_complete;
> >>>   req->buf  = data;
> >>> - req->length   = len;
> >>> + req->length   = data_len;
> >>
> >> IIUC, req->length should still be set to len, not to data_len.
> 
> I misunderstood the first time I read it:
> To keep DWC3 from stalling, we need to update req->length (this is
> the most important fix). kmalloc() is updated too, to prevent the USB
> controller from overflowing the buffer boundaries.

Here I disagree.

If the DWC3 hardware stalls, it is up to the DWC3 UDC driver to fix it.  
Gadget drivers should not have to worry.  Most especially, gadget 
drivers should not lie about a request length.

If the UDC driver decides to round up req->length before sending it to
the hardware, that's okay.  But req->length should be set to len, not
data_len.  And if the hardware receives more than len bytes of data,
the UDC driver should set req->status to -EOVERFLOW.
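
To make the split of responsibilities concrete, here is a rough userspace
sketch, not taken from any real UDC driver. The gadget driver leaves the
request length at len; the controller driver may round the transfer size up
to a multiple of maxpacket for the hardware and reports -EOVERFLOW if more
than len bytes arrived. struct req_sketch and both function names are
hypothetical, and clamping actual to length on overflow is an assumption of
this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a USB request. */
struct req_sketch {
	size_t length;	/* what the gadget driver asked for (len) */
	size_t actual;	/* bytes reported back to the gadget driver */
	int status;	/* 0 or a negative errno value */
};

/* Size the UDC driver could program into the hardware. */
static size_t round_up_to_maxpacket(size_t len, size_t maxpacket)
{
	return ((len + maxpacket - 1) / maxpacket) * maxpacket;
}

/* Completion handling on the UDC side; hw_received is what arrived. */
static void complete_out_request(struct req_sketch *req, size_t hw_received)
{
	if (hw_received > req->length) {
		req->actual = req->length;
		req->status = -EOVERFLOW;
	} else {
		req->actual = hw_received;
		req->status = 0;
	}
}
```

With this arrangement the padded size never leaks into req->length, so the
gadget driver does not have to know about the controller's alignment quirk.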

Alan Stern


Re: [PATCH 02/12] mtd: nand: davinci: check required ti,davinci-chipselect property

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 11:53 AM, Khoronzhuk, Ivan wrote:
> The property "ti,davinci-chipselect" is required. So we have to check
> if it is set.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
>  drivers/mtd/nand/davinci_nand.c |   10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
> index d87213f..8e1c88e 100644
> --- a/drivers/mtd/nand/davinci_nand.c
> +++ b/drivers/mtd/nand/davinci_nand.c
> @@ -541,10 +541,14 @@ static struct davinci_nand_pdata
> GFP_KERNEL);
> pdev->dev.platform_data = pdata;
> if (!pdata)
> -   return NULL;
> +   return ERR_PTR(-ENOMEM);
> +
This change doesn't follow the commit message.

> if (!of_property_read_u32(pdev->dev.of_node,
> "ti,davinci-chipselect", ))
> pdev->id = prop;
> +   else
> +   return ERR_PTR(-EINVAL);
> +
So the check already exists but the error case wasn't handled.
This should be reflected in the change log.

> if (!of_property_read_u32(pdev->dev.of_node,
> "ti,davinci-mask-ale", ))
> pdata->mask_ale = prop;
> @@ -598,6 +602,10 @@ static int __init nand_davinci_probe(struct platform_device *pdev)
> nand_ecc_modes_tecc_mode;
> 
> pdata = nand_davinci_get_pdata(pdev);
> +   if (IS_ERR(pdata)) {
> +   return PTR_ERR(pdata);
> +   }
> +
Again, this is not related to the commit log. You might want to split this
patch then.
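
For reference, the error-pointer pattern the quoted hunks switch to can be
sketched in userspace as follows. The kernel's real helpers live in
<linux/err.h>; the three functions below are simplified local stand-ins with
the same names, and struct pdata_sketch, get_pdata_sketch() and
nand_probe_err_sketch() are hypothetical names for this illustration only.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified local stand-ins for the kernel helpers of the same name. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical platform data with one required property. */
struct pdata_sketch {
	unsigned int chipselect;
};

/* have_chipselect models whether the required property was found. */
static struct pdata_sketch *get_pdata_sketch(int have_chipselect,
					     unsigned int value)
{
	struct pdata_sketch *pdata = calloc(1, sizeof(*pdata));

	if (!pdata)
		return ERR_PTR(-ENOMEM);

	if (!have_chipselect) {
		free(pdata);
		return ERR_PTR(-EINVAL);
	}

	pdata->chipselect = value;
	return pdata;
}

static int nand_probe_err_sketch(int have_chipselect)
{
	struct pdata_sketch *pdata = get_pdata_sketch(have_chipselect, 0);

	if (IS_ERR(pdata))
		return (int)PTR_ERR(pdata);

	free(pdata);
	return 0;
}
```

The sketch shows why the three changes hang together: once the getter can
return -EINVAL for a missing property, NULL is no longer enough to signal
failure, and the caller has to check with IS_ERR().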

Regards,
Santosh


Re: [PATCH 1/2] perf trace: Beautify fifth argument of mmap() as fd

2013-11-12 Thread David Ahern


On 11/11/13, 11:24 PM, Namhyung Kim wrote:

From: Namhyung Kim 

The fifth argument of the mmap syscall is fd, and it often contains -1 as a
value for anon mappings.  Without this patch, perf trace does not show the
file name, and it prints -1 as 4294967295.

Cc: David Ahern 
Signed-off-by: Namhyung Kim 
---
  tools/perf/builtin-trace.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c3008b1c369c..aeb6296a76bd 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -951,7 +951,8 @@ static struct syscall_fmt {
{ .name = "mmap", .hexret = true,
  .arg_scnprintf = { [0] = SCA_HEX,   /* addr */
 [2] = SCA_MMAP_PROT, /* prot */
-[3] = SCA_MMAP_FLAGS, /* flags */ }, },
+[3] = SCA_MMAP_FLAGS, /* flags */
+[4] = SCA_FD,/* fd */ }, },
{ .name = "mprotect",   .errmsg = true,
  .arg_scnprintf = { [0] = SCA_HEX, /* start */
 [2] = SCA_MMAP_PROT, /* prot */ }, },



Looks good to me.

Acked-by: David Ahern 
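
As a side note on the 4294967295 mentioned in the commit message: that is
what a 32-bit -1 looks like when the raw argument is printed as an unsigned
number. The following is a small illustration only, not perf code; the two
function names are made up, and the narrowing cast relies on the usual
two's-complement behaviour of gcc and clang.

```c
#include <stdint.h>
#include <stdio.h>

/* Print the raw syscall argument as an unsigned number. */
static int format_raw_arg(char *buf, size_t size, unsigned long arg)
{
	return snprintf(buf, size, "%lu", arg);
}

/* Print the argument as a file descriptor, i.e. as a signed int. */
static int format_fd_arg(char *buf, size_t size, unsigned long arg)
{
	int fd = (int)(uint32_t)arg;	/* keep the low 32 bits, signed */

	return snprintf(buf, size, "%d", fd);
}
```

Treating the argument as an fd first is also what makes it possible to look
up a file name for non-negative values.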

Re: Fwd: [PATCH 7/8] watchdog: davinci: add "clocks" property

2013-11-12 Thread Santosh Shilimkar

On Wednesday 06 November 2013 06:32 AM, ivan.khoronzhuk wrote:
> The Keystone arch is using clocks in DT and source clock for watchdog
> has to be specified, so add this to binding.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
Acked-by: Santosh Shilimkar 


[PATCH] ASoC: fsl: imx-pcm-fiq: omit fiq counter to avoid harm in unbalanced situations

2013-11-12 Thread Oskar Schirmer

Unbalanced calls to snd_imx_pcm_trigger() may result in endless
FIQ activity and thus in never-ending sound. While at first glance
the switch statement looks pretty symmetric, the SUSPEND/RESUME
pair is not: the suspend case comes along with snd_pcm_suspend_all(),
which for fsl/imx-pcm-fiq is called only from snd_soc_suspend(),
but the resume case originates straight from SNDRV_PCM_IOCTL_RESUME.
This way userland may provoke an unbalanced resume, which might cause
the fiq_enable counter to increase and never return to zero again,
so that imx_pcm_fiq is never disabled.

Simply removing fiq_enable would solve the problem as long as
playback and capture never run simultaneously, but when both run
at once, the early TRIGGER_STOP would cut off the other
activity prematurely. So playing and capturing are now tracked
separately, instead of by counting.

Signed-off-by: Oskar Schirmer 
---
 sound/soc/fsl/imx-pcm-fiq.c |   29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/sound/soc/fsl/imx-pcm-fiq.c b/sound/soc/fsl/imx-pcm-fiq.c
index 10e3305..f00b512 100644
--- a/sound/soc/fsl/imx-pcm-fiq.c
+++ b/sound/soc/fsl/imx-pcm-fiq.c
@@ -42,7 +42,8 @@ struct imx_pcm_runtime_data {
struct hrtimer hrt;
int poll_time_ns;
struct snd_pcm_substream *substream;
-   atomic_t running;
+   atomic_t playing;
+   atomic_t capturing;
 };
 
 static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt)
@@ -52,7 +53,7 @@ static enum hrtimer_restart snd_hrtimer_callback(struct hrtimer *hrt)
struct snd_pcm_substream *substream = iprtd->substream;
struct pt_regs regs;
 
-   if (!atomic_read(>running))
+   if (!atomic_read(>playing) && !atomic_read(>capturing))
return HRTIMER_NORESTART;
 
get_fiq_regs();
@@ -106,7 +107,6 @@ static int snd_imx_pcm_prepare(struct snd_pcm_substream *substream)
return 0;
 }
 
-static int fiq_enable;
 static int imx_pcm_fiq;
 
 static int snd_imx_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
@@ -118,23 +118,27 @@ static int snd_imx_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
case SNDRV_PCM_TRIGGER_START:
case SNDRV_PCM_TRIGGER_RESUME:
case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
-   atomic_set(>running, 1);
+   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
+   atomic_set(>playing, 1);
+   else
+   atomic_set(>capturing, 1);
hrtimer_start(>hrt, ns_to_ktime(iprtd->poll_time_ns),
  HRTIMER_MODE_REL);
-   if (++fiq_enable == 1)
-   enable_fiq(imx_pcm_fiq);
-
+   enable_fiq(imx_pcm_fiq);
break;
 
case SNDRV_PCM_TRIGGER_STOP:
case SNDRV_PCM_TRIGGER_SUSPEND:
case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
-   atomic_set(>running, 0);
-
-   if (--fiq_enable == 0)
+   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
+   atomic_set(>playing, 0);
+   else
+   atomic_set(>capturing, 0);
+   if (!atomic_read(>playing) &&
+   !atomic_read(>capturing))
disable_fiq(imx_pcm_fiq);
-
break;
+
default:
return -EINVAL;
}
@@ -182,7 +186,8 @@ static int snd_imx_open(struct snd_pcm_substream *substream)
 
iprtd->substream = substream;
 
-   atomic_set(>running, 0);
+   atomic_set(>playing, 0);
+   atomic_set(>capturing, 0);
hrtimer_init(>hrt, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
iprtd->hrt.function = snd_hrtimer_callback;
 
-- 
1.7.9.5
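
The difference between counting and per-direction flags can be seen in a
small userspace model. This is only a sketch of the idea in the commit
message: the names are hypothetical, plain bools replace the atomic_t fields
of the patch, and fiq_enabled merely records whether the FIQ would be on.

```c
#include <stdbool.h>

enum stream_dir { STREAM_PLAYBACK, STREAM_CAPTURE };

/* Hypothetical model of the state tracked by the patch. */
struct fiq_state_sketch {
	bool playing;
	bool capturing;
	bool fiq_enabled;
};

/* START, RESUME and PAUSE_RELEASE: setting a flag twice is harmless. */
static void trigger_start(struct fiq_state_sketch *s, enum stream_dir dir)
{
	if (dir == STREAM_PLAYBACK)
		s->playing = true;
	else
		s->capturing = true;

	s->fiq_enabled = true;
}

/* STOP, SUSPEND and PAUSE_PUSH: disable only when nothing is running. */
static void trigger_stop(struct fiq_state_sketch *s, enum stream_dir dir)
{
	if (dir == STREAM_PLAYBACK)
		s->playing = false;
	else
		s->capturing = false;

	if (!s->playing && !s->capturing)
		s->fiq_enabled = false;
}
```

An extra resume from userland now just sets a flag that is already set, so a
single stop still switches the FIQ off, while stopping playback leaves a
running capture untouched.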


Re: [seqcount] INFO: trying to register non-static key.

2013-11-12 Thread Vivek Goyal

On Tue, Nov 12, 2013 at 04:29:56PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 12, 2013 at 10:15:41AM -0500, Vivek Goyal wrote:
> > I see that we allocate per cpu stats but don't do any initializations.
> > 
> > static void tg_stats_alloc_fn(struct work_struct *work)
> > {
> > static struct tg_stats_cpu *stats_cpu;  /* this fn is non-reentrant */
> > struct delayed_work *dwork = to_delayed_work(work);
> > bool empty = false;
> > 
> > alloc_stats:
> > if (!stats_cpu) {
> > stats_cpu = alloc_percpu(struct tg_stats_cpu);
> > if (!stats_cpu) {
> > /* allocation failed, try again after some time */
> > schedule_delayed_work(dwork, msecs_to_jiffies(10));
> > return;
> > }
> > }
> > 
> > spin_lock_irq(&tg_stats_alloc_lock);
> 
> Absolutely!
> 
> Something like this perhaps? Did I miss more blkg_[rw]stats? If I read
> the git grep output right, this was the last one.

Looks good to me.

This should be the last one. There are only two users of stats right
now. blk-throttle and cfq-iosched.

Thanks
Vivek

> 
> ---
>  block/blk-throttle.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 8331aba9426f..fd743d98c41d 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -256,6 +256,12 @@ static struct throtl_data *sq_to_td(struct throtl_service_queue *sq)
>   }   \
>  } while (0)
>  
> +static void tg_stats_init(struct tg_stats_cpu *tg_stats)
> +{
> + blkg_rwstat_init(&tg_stats->service_bytes);
> + blkg_rwstat_init(&tg_stats->serviced);
> +}
> +
>  /*
>   * Worker for allocating per cpu stat for tgs. This is scheduled on the
>   * system_wq once there are some groups on the alloc_list waiting for
> @@ -269,12 +275,16 @@ static void tg_stats_alloc_fn(struct work_struct *work)
>  
>  alloc_stats:
>   if (!stats_cpu) {
> + int cpu;
> +
>   stats_cpu = alloc_percpu(struct tg_stats_cpu);
>   if (!stats_cpu) {
>   /* allocation failed, try again after some time */
>   schedule_delayed_work(dwork, msecs_to_jiffies(10));
>   return;
>   }
> + for_each_possible_cpu(cpu)
> + tg_stats_init(per_cpu(stats_cpu, cpu));
>   }
>  
>   spin_lock_irq(&tg_stats_alloc_lock);
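
The pattern in the quoted patch (allocate the per-CPU stats, then initialise every slot before anything uses it) can be modelled in user space roughly as below. This is a sketch under assumptions, not kernel code: `NR_CPUS_SKETCH`, `struct rwstat`, the `sync_initialized` flag and the helper names are all hypothetical, and a plain `calloc()` array stands in for alloc_percpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Stand-in for a stat whose sync primitive needs explicit initialisation. */
struct rwstat {
	bool sync_initialized;
	unsigned long long cnt[2];
};

struct stats_cpu {
	struct rwstat service_bytes;
	struct rwstat serviced;
};

void rwstat_init(struct rwstat *st)
{
	assert(st != NULL);
	st->sync_initialized = true;
	st->cnt[0] = 0;
	st->cnt[1] = 0;
}

/* Counterpart of the tg_stats_init() helper added by the patch. */
void stats_init(struct stats_cpu *st)
{
	rwstat_init(&st->service_bytes);
	rwstat_init(&st->serviced);
}

/* Allocate one slot per CPU and initialise each before handing it out. */
struct stats_cpu *stats_alloc(void)
{
	struct stats_cpu *stats = calloc(NR_CPUS_SKETCH, sizeof(*stats));
	int cpu;

	if (!stats)
		return NULL;	/* the caller retries later, as the worker does */

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		stats_init(&stats[cpu]);
	return stats;
}
```

One note on the kernel side: for a pointer returned by alloc_percpu(), the per-CPU slot is normally obtained with per_cpu_ptr(stats_cpu, cpu), which yields the pointer that tg_stats_init() expects; per_cpu() is the accessor for statically declared per-CPU variables.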

Re: [PATCH 01/12] mtd: nand: davinci: fix driver registration

2013-11-12 Thread Santosh Shilimkar

On Monday 11 November 2013 11:52 AM, Khoronzhuk, Ivan wrote:
> 
> When the kernel is booted using DT, there is no guarantee that the Davinci
> NAND device has already been created at the time the driver init
> function is executed. Therefore, platform_driver_probe() can't be used,
> because the Davinci NAND driver might then never be probed.
> The driver probing has to be done through the core mechanism.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
Acked-by: Santosh Shilimkar 
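
The ordering problem described in the quoted commit message can be illustrated with a toy user-space model: a one-shot probe only succeeds if the device already exists, while a registered driver stays around and binds when the device shows up later. Everything here is hypothetical (`struct toy_bus`, the `toy_*` helpers, and the "davinci_nand" string used in the usage note); it only models the behaviour and is not the platform bus API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy bus with room for one device and one driver; names are hypothetical. */
struct toy_bus {
	const char *device;	/* name of the registered device, or NULL */
	const char *driver;	/* name of the registered driver, or NULL */
	bool bound;		/* true once a probe has succeeded */
};

/* Bind when both sides are present and their names match. */
void toy_try_bind(struct toy_bus *bus)
{
	assert(bus != NULL);
	if (bus->device && bus->driver &&
	    strcmp(bus->device, bus->driver) == 0)
		bus->bound = true;
}

/* One-shot probe: works only if the device already exists; nothing is kept. */
bool toy_driver_probe_once(struct toy_bus *bus, const char *name)
{
	if (bus->device && strcmp(bus->device, name) == 0) {
		bus->bound = true;
		return true;
	}
	return false;
}

/* Regular registration: the driver is remembered and can bind later. */
void toy_driver_register(struct toy_bus *bus, const char *name)
{
	bus->driver = name;
	toy_try_bind(bus);
}

/* Device creation, e.g. when DT nodes are populated after driver init. */
void toy_device_add(struct toy_bus *bus, const char *name)
{
	bus->device = name;
	toy_try_bind(bus);
}
```

In this model, calling `toy_driver_probe_once(&bus, "davinci_nand")` before `toy_device_add(&bus, "davinci_nand")` leaves the bus unbound forever, whereas `toy_driver_register()` followed by the same `toy_device_add()` ends up bound.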


Re: scripts: checkpatch.pl & Lindent (minor complaint)

2013-11-12 Thread Joe Perches

On Tue, 2013-11-12 at 09:42 -0500, Mimi Zohar wrote: 
> scripts/Lindent and scripts/checkpatch disagree whether the fields in a
> statically initialized array should be blank separated.  
> 
> static struct ima_rule_entry default_rules[] = {
> {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags = IMA_FSMAGIC},
> 
> Lindent adds a blank before '.fsmagic', which checkpatch then complains
> about (e.g. commit 75834fc3).

Perhaps I don't understand what you mean.

Lindent _doesn't_ add a blank and checkpatch
seems to do the right thing here.

I'd just as soon delete Lindent.

$ git log --format=email -p -1 75834fc3 | ./scripts/checkpatch.pl -
WARNING: line over 80 characters
#35: FILE: security/integrity/ima/ima_policy.c:52:
+   {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags = IMA_FSMAGIC},

ERROR: space required after that ',' (ctx:VxV)
#35: FILE: security/integrity/ima/ima_policy.c:52:
+   {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags = IMA_FSMAGIC},
   ^
[...]

$ ./scripts/Lindent security/integrity/ima/ima_policy.c 
$ git diff security/integrity/ima/ima_policy.c | ./scripts/checkpatch.pl -
ERROR: space required after that ',' (ctx:VxV)
#10: FILE: security/integrity/ima/ima_policy.c:72:
+   {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags =
   ^

ERROR: space required after that ',' (ctx:VxV)
#10: FILE: security/integrity/ima/ima_policy.c:72:
+   {.action = DONT_MEASURE,.fsmagic = PROC_SUPER_MAGIC,.flags =
   ^
[...]
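
For comparison, here is a layout of the same initializer with a space after each comma and lines kept under 80 columns, which should avoid both complaints shown in the output above. The type, enum and macro definitions are cut-down placeholders added only so the fragment compiles on its own; their values are not taken from the kernel sources.

```c
#include <assert.h>

/* Placeholder definitions so the fragment is self-contained (values made up). */
enum ima_action { DONT_MEASURE = 1, MEASURE = 2 };
#define IMA_FSMAGIC		0x1
#define PROC_SUPER_MAGIC	0x2

struct ima_rule_entry {
	int action;
	unsigned long fsmagic;
	unsigned int flags;
};

/* A space follows every ',' and the entry is wrapped before column 80. */
struct ima_rule_entry default_rules[] = {
	{.action = DONT_MEASURE, .fsmagic = PROC_SUPER_MAGIC,
	 .flags = IMA_FSMAGIC},
};
```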



Re: Fwd: [PATCH 8/8] arm: dts: keystone: add watchdog entry

2013-11-12 Thread Santosh Shilimkar

On Wednesday 06 November 2013 06:33 AM, ivan.khoronzhuk wrote:
> Add watchdog entry to keystone device tree.
> 
> Signed-off-by: Ivan Khoronzhuk 
> ---
I can take this patch via my tree once the watchdog folks are OK with it
and have queued up the rest of the series.

Thanks Ivan for the clean-up and keystone updates.

Regards,
Santosh


Re: [PATCH] staging: zsmalloc: Ensure handle is never 0 on success

2013-11-12 Thread Minchan Kim

On Thu, Nov 07, 2013 at 04:04:51PM +0900, Minchan Kim wrote:
> On Wed, Nov 06, 2013 at 07:05:11PM -0800, Greg KH wrote:
> > On Wed, Nov 06, 2013 at 03:46:19PM -0800, Nitin Gupta wrote:
> > > > I'm getting really tired of them hanging around in here for many years
> > > > now...
> > > >
> > > 
> > > Minchan has tried many times to promote zram out of staging. This was
> > > his most recent attempt:
> > > 
> > > https://lkml.org/lkml/2013/8/21/54
> > > 
> > > There he provided arguments for zram inclusion, how it can help in
> > > situations where zswap can't and why generalizing /dev/ramX would
> > > not be a great idea. So, cannot say why it wasn't picked up
> > > for inclusion at that time.
> > > 
> > > > Should I just remove them if no one is working on getting them merged
> > > > "properly"?
> > > >
> > > 
> > > Please refer the mail thread (link above) and see Minchan's
> > > justifications for zram.
> > > If they don't sound convincing enough then please remove zram+zsmalloc
> > > from staging.
> > 
> > You don't need to be convincing me, you need to be convincing the
> > maintainers of the area of the kernel you are working with.
> > 
> > And since the last time you all tried to get this merged was back in
> > August, I'm feeling that you all have given up, so it needs to be
> > deleted.  I'll go do that for 3.14, and if someone wants to pick it up
> > and merge it properly, they can easily revert it.
> 
> I'm guilty; I have been busy with other stuff. Sorry for that.
> Fortunately, I discussed this issue with Hugh at this LinuxCon for a
> long time (thanks, Hugh!). He felt zram's block device abstraction is
> a better design than the frontswap backend approach, although where
> we put zsmalloc remains an open question. I will CC Hugh because many
> of these things are related to the swap subsystem and his opinion is
> really important.
> I also discussed it with Rik, and he feels positive about zram.
> 
> The last impression Andrew gave me by private mail is that he wants to
> merge zram's functionality into zswap or vice versa.
> If I misunderstood, please correct me.
> I understand his concern, but I guess he didn't have time to read
> my long description due to a ton of work at that time.
> So, I will try one more time.
> I would rather hear feedback than *silence* so that we can
> move forward rather than stall.
> 
> Recently, Bob tried to move zsmalloc under the mm directory to unify
> zram and zswap by adding a pseudo block device in zswap. (It's
> very weird to me. I think it's a horrible monster lying
> between mm and block from a layering POV.) But he ignored zram's
> block device (a.k.a. zram-blk) feature and considered only the swap
> use case of zram; in turn, it loses zram's good concept.
> I already covered the other topics Bob raised in this thread[1],
> along with why I think zram is better.
> 
> I will repeat it one more time and hope the graybeard penguins find
> time this time and give me a conclusion/direction so
> that we don't lose lots of users and functionality.
> 
> == &< ===
> 
> Mel raised another issue in v6: "maintenance headache".
> He claimed zswap and zram have a similar goal, which is to compress
> swap pages, so if we promote zram, a maintenance headache will arise
> at some point from the implementations of zswap and zram diverging,
> so he wants to unify zram and zswap. For that, he wants zswap
> to implement a pseudo block device, like Bob did, to emulate zram so that
> zswap can have the advantage of writeback as well as zram's benefits.
> But I wonder whether frontswap-based zswap's writeback is really a good
> approach from a writeback POV. I think that problem isn't
> specific to zswap. If we want to configure a multi-level swap hierarchy
> with devices of various speeds such as RAM, NVRAM, SSD, eMMC, NAS, etc.,
> it would be a general problem. So we should think of a more general
> approach. At a glance, I can see two approaches.
> 
> First, the VM could be aware of a heterogeneous swap configuration
> so it could aim to configure a cache hierarchy
> among swap devices. It may need an indirection layer on swap, which
> has already been discussed, so the VM can migrate a block from
> A to B easily. It could support various configurations with the VM's
> hints, maybe, in the future.
> http://lkml.indiana.edu/hypermail/linux/kernel/1203.3/03812.html
> 
> Second, as a more practical solution, we could use device mapper, like
> dm-cache (https://lwn.net/Articles/540996/), which makes it very
> flexible. It now supports various configurations and cache policies
> (block size, writeback/writethrough, LRU, MFU, although MQ is merged
> now), so it would be a good fit for our purpose. It can even make zram
> support writeback. I tested it with the following scenario
> in KVM with 4 CPUs and 1G DRAM, with a background 800M memory hogger that
> allocates random data up to 800M.
> 
> 1) zram swap disk 1G, untar kernel.tgz to tmpfs, build -j 4
>    Failed to untar due to a shortage of memory space caused by the tmpfs default size limit
> 
> 2) zram swap disk 1G,
