[PATCH 4.4 54/73] pinctrl: at91-pio4: fix pull-up/down logic
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ludovic Desroches

commit 5305a7b7e860bb40ab226bc7d58019416073948a upstream.

The default configuration of a pin is often with a value in the
pull-up/down field at chip reset. So, even if the internal logic of the
controller prevents writing a configuration with pull-up and pull-down at
the same time, we must ensure explicitly this condition before writing the
register. This was leading to a pull-down condition not taken into account
for instance.

Signed-off-by: Ludovic Desroches
Fixes: 776180848b57 ("pinctrl: introduce driver for Atmel PIO4 controller")
Acked-by: Alexandre Belloni
Acked-by: Nicolas Ferre
Signed-off-by: Linus Walleij
Signed-off-by: Greg Kroah-Hartman

---
 drivers/pinctrl/pinctrl-at91-pio4.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/pinctrl/pinctrl-at91-pio4.c
+++ b/drivers/pinctrl/pinctrl-at91-pio4.c
@@ -717,9 +717,11 @@ static int atmel_conf_pin_config_group_s
 			break;
 		case PIN_CONFIG_BIAS_PULL_UP:
 			conf |= ATMEL_PIO_PUEN_MASK;
+			conf &= (~ATMEL_PIO_PDEN_MASK);
 			break;
 		case PIN_CONFIG_BIAS_PULL_DOWN:
 			conf |= ATMEL_PIO_PDEN_MASK;
+			conf &= (~ATMEL_PIO_PUEN_MASK);
 			break;
 		case PIN_CONFIG_DRIVE_OPEN_DRAIN:
 			if (arg == 0)
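
For readers skimming the series, the fix above boils down to making the two pull bits mutually exclusive in the value that gets written back. The following is only a minimal userspace sketch of that idea, not the driver code: PUEN_MASK, PDEN_MASK, enum bias and apply_bias() are names made up for illustration, and the bit positions are assumed.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define PUEN_MASK (1u << 9)	/* pull-up enable */
#define PDEN_MASK (1u << 10)	/* pull-down enable */

enum bias { BIAS_PULL_UP, BIAS_PULL_DOWN };

/*
 * Return conf with the requested pull bit set and the opposite one
 * cleared, so a stale reset value cannot leave both bits set.
 */
static uint32_t apply_bias(uint32_t conf, enum bias b)
{
	if (b == BIAS_PULL_UP) {
		conf |= PUEN_MASK;
		conf &= ~PDEN_MASK;
	} else {
		conf |= PDEN_MASK;
		conf &= ~PUEN_MASK;
	}
	return conf;
}
```

Starting from a value that already has the pull-up bit set, as a pin might after reset, apply_bias(conf, BIAS_PULL_DOWN) leaves only the pull-down bit, and unrelated bits are preserved.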
[PATCH 4.4 55/73] regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jack Pham

commit dec8e8f6e6504aa3496c0f7cc10c756bb0e10f44 upstream.

Specifically for the case of reads that use the Extended Register
Read Long command, a multi-byte read operation is broken up into
8-byte chunks.  However the call to spmi_ext_register_readl() is
incorrectly passing 'val_size', which if greater than 8 will
always fail.  The argument should instead be 'len'.

Fixes: c9afbb05a9ff ("regmap: spmi: support base and extended register spaces")
Signed-off-by: Jack Pham
Signed-off-by: Mark Brown
Signed-off-by: Greg Kroah-Hartman

---
 drivers/base/regmap/regmap-spmi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/regmap/regmap-spmi.c
+++ b/drivers/base/regmap/regmap-spmi.c
@@ -142,7 +142,7 @@ static int regmap_spmi_ext_read(void *co
 	while (val_size) {
 		len = min_t(size_t, val_size, 8);
 
-		err = spmi_ext_register_readl(context, addr, val, val_size);
+		err = spmi_ext_register_readl(context, addr, val, len);
 		if (err)
 			goto err_out;
 
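
The chunking described above can be sketched in plain C as follows. This is an illustration under stated assumptions, not the regmap code: fake_regs, bus_read() and chunked_read() are hypothetical stand-ins, and bus_read() simply models a transport that rejects any request longer than 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CHUNK 8

/* Stand-in register file for the sketch. */
static uint8_t fake_regs[64];

/* Hypothetical bus primitive: refuses anything longer than MAX_CHUNK. */
static int bus_read(uint16_t addr, uint8_t *buf, size_t len)
{
	if (len == 0 || len > MAX_CHUNK || addr + len > sizeof(fake_regs))
		return -1;
	memcpy(buf, &fake_regs[addr], len);
	return 0;
}

/* Split a long read into chunks, passing the chunk length each time. */
static int chunked_read(uint16_t addr, uint8_t *val, size_t val_size)
{
	while (val_size) {
		size_t len = val_size < MAX_CHUNK ? val_size : MAX_CHUNK;
		int err = bus_read(addr, val, len);

		if (err)
			return err;
		addr += len;
		val += len;
		val_size -= len;
	}
	return 0;
}
```

Passing the remaining size instead of the chunk length to bus_read() would fail on the first iteration of any read longer than 8 bytes, which is the behaviour the commit message describes.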
[PATCH 4.4 52/73] spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings

commit ea1b60fb085839a9544cb3a0069992991beabb7f upstream.

Each transfer can specify 8, 16 or 32 bits per word independently of
the default for the device being addressed.  However, currently we
calculate the number of words in the frame assuming that the word size
is the device default.

If multiple transfers in the same message have differing
bits_per_word, we bitwise-or the different values in the WLEN register
field.

Fix both of these.  Also rename 'frame_length' to 'frame_len_words' to
make clear that it's not a byte count like spi_message::frame_length.

Signed-off-by: Ben Hutchings
Signed-off-by: Mark Brown
Signed-off-by: Greg Kroah-Hartman

---
 drivers/spi/spi-ti-qspi.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -94,6 +94,7 @@ struct ti_qspi {
 #define QSPI_FLEN(n)			((n - 1) << 0)
 #define QSPI_WLEN_MAX_BITS		128
 #define QSPI_WLEN_MAX_BYTES		16
+#define QSPI_WLEN_MASK			QSPI_WLEN(QSPI_WLEN_MAX_BITS)
 
 /* STATUS REGISTER */
 #define BUSY				0x01
@@ -373,7 +374,7 @@ static int ti_qspi_start_transfer_one(st
 	struct spi_device *spi = m->spi;
 	struct spi_transfer *t;
 	int status = 0, ret;
-	int frame_length;
+	unsigned int frame_len_words;
 
 	/* setup device control reg */
 	qspi->dc = 0;
@@ -385,21 +386,23 @@ static int ti_qspi_start_transfer_one(st
 	if (spi->mode & SPI_CS_HIGH)
 		qspi->dc |= QSPI_CSPOL(spi->chip_select);
 
-	frame_length = (m->frame_length << 3) / spi->bits_per_word;
-
-	frame_length = clamp(frame_length, 0, QSPI_FRAME);
+	frame_len_words = 0;
+	list_for_each_entry(t, &m->transfers, transfer_list)
+		frame_len_words += t->len / (t->bits_per_word >> 3);
+	frame_len_words = min_t(unsigned int, frame_len_words, QSPI_FRAME);
 
 	/* setup command reg */
 	qspi->cmd = 0;
 	qspi->cmd |= QSPI_EN_CS(spi->chip_select);
-	qspi->cmd |= QSPI_FLEN(frame_length);
+	qspi->cmd |= QSPI_FLEN(frame_len_words);
 
 	ti_qspi_write(qspi, qspi->dc, QSPI_SPI_DC_REG);
 
 	mutex_lock(&qspi->list_lock);
 
 	list_for_each_entry(t, &m->transfers, transfer_list) {
-		qspi->cmd |= QSPI_WLEN(t->bits_per_word);
+		qspi->cmd = ((qspi->cmd & ~QSPI_WLEN_MASK) |
+			     QSPI_WLEN(t->bits_per_word));
 
 		ret = qspi_transfer_msg(qspi, t);
 		if (ret) {
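
The WLEN half of this fix is a plain read-modify-write of a register field. A minimal sketch of why OR-ing successive values into the field goes wrong is shown here; the field position (WLEN_SHIFT) and the helper names set_wlen_buggy(), set_wlen() and get_wlen() are assumptions made for the example, not taken from the driver.

```c
#include <stdint.h>

/* Assumed layout: word length minus one, stored in a 7-bit field. */
#define WLEN_SHIFT	19
#define WLEN(bits)	((uint32_t)((bits) - 1) << WLEN_SHIFT)
#define WLEN_MASK	WLEN(128)

/* Old behaviour: bits from earlier transfers stay set in the field. */
static uint32_t set_wlen_buggy(uint32_t cmd, unsigned int bits)
{
	return cmd | WLEN(bits);
}

/* Fixed behaviour: clear the whole field, then write the new value. */
static uint32_t set_wlen(uint32_t cmd, unsigned int bits)
{
	return (cmd & ~WLEN_MASK) | WLEN(bits);
}

/* Decode the field back into a word length in bits. */
static unsigned int get_wlen(uint32_t cmd)
{
	return ((cmd & WLEN_MASK) >> WLEN_SHIFT) + 1;
}
```

With a 16-bit transfer followed by an 8-bit one, the OR variant still decodes to 16 bits for the second transfer, while the masked variant decodes to 8.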
[PATCH 4.4 53/73] spi: spi-ti-qspi: Handle truncated frames properly
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings

commit 1ff7760ff66b98ef244bf0e5e2bd5310651205ad upstream.

We clamp frame_len_words to a maximum of 4096, but do not actually
limit the number of words written or read through the DATA registers
or the length added to spi_message::actual_length.  This results in
silent data corruption for commands longer than this maximum.

Recalculate the length of each transfer, taking frame_len_words into
account.  Use this length in qspi_{read,write}_msg(), and to increment
spi_message::actual_length.

Signed-off-by: Ben Hutchings
Signed-off-by: Mark Brown
Signed-off-by: Greg Kroah-Hartman

---
 drivers/spi/spi-ti-qspi.c |   32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -225,16 +225,16 @@ static inline int ti_qspi_poll_wc(struct
 	return  -ETIMEDOUT;
 }
 
-static int qspi_write_msg(struct ti_qspi *qspi, struct spi_transfer *t)
+static int qspi_write_msg(struct ti_qspi *qspi, struct spi_transfer *t,
+			  int count)
 {
-	int wlen, count, xfer_len;
+	int wlen, xfer_len;
 	unsigned int cmd;
 	const u8 *txbuf;
 	u32 data;
 
 	txbuf = t->tx_buf;
 	cmd = qspi->cmd | QSPI_WR_SNGL;
-	count = t->len;
 	wlen = t->bits_per_word >> 3;	/* in bytes */
 	xfer_len = wlen;
 
@@ -294,9 +294,10 @@ static int qspi_write_msg(struct ti_qspi
 	return 0;
 }
 
-static int qspi_read_msg(struct ti_qspi *qspi, struct spi_transfer *t)
+static int qspi_read_msg(struct ti_qspi *qspi, struct spi_transfer *t,
+			 int count)
 {
-	int wlen, count;
+	int wlen;
 	unsigned int cmd;
 	u8 *rxbuf;
 
@@ -313,7 +314,6 @@ static int qspi_read_msg(struct ti_qspi
 		cmd |= QSPI_RD_SNGL;
 		break;
 	}
-	count = t->len;
 	wlen = t->bits_per_word >> 3;	/* in bytes */
 
 	while (count) {
@@ -344,12 +344,13 @@ static int qspi_read_msg(struct ti_qspi
 	return 0;
 }
 
-static int qspi_transfer_msg(struct ti_qspi *qspi, struct spi_transfer *t)
+static int qspi_transfer_msg(struct ti_qspi *qspi, struct spi_transfer *t,
+			     int count)
 {
 	int ret;
 
 	if (t->tx_buf) {
-		ret = qspi_write_msg(qspi, t);
+		ret = qspi_write_msg(qspi, t, count);
 		if (ret) {
 			dev_dbg(qspi->dev, "Error while writing\n");
 			return ret;
@@ -357,7 +358,7 @@ static int qspi_transfer_msg(struct ti_q
 	}
 
 	if (t->rx_buf) {
-		ret = qspi_read_msg(qspi, t);
+		ret = qspi_read_msg(qspi, t, count);
 		if (ret) {
 			dev_dbg(qspi->dev, "Error while reading\n");
 			return ret;
@@ -374,7 +375,8 @@ static int ti_qspi_start_transfer_one(st
 	struct spi_device *spi = m->spi;
 	struct spi_transfer *t;
 	int status = 0, ret;
-	unsigned int frame_len_words;
+	unsigned int frame_len_words, transfer_len_words;
+	int wlen;
 
 	/* setup device control reg */
 	qspi->dc = 0;
@@ -404,14 +406,20 @@ static int ti_qspi_start_transfer_one(st
 		qspi->cmd = ((qspi->cmd & ~QSPI_WLEN_MASK) |
 			     QSPI_WLEN(t->bits_per_word));
 
-		ret = qspi_transfer_msg(qspi, t);
+		wlen = t->bits_per_word >> 3;
+		transfer_len_words = min(t->len / wlen, frame_len_words);
+
+		ret = qspi_transfer_msg(qspi, t, transfer_len_words * wlen);
 		if (ret) {
 			dev_dbg(qspi->dev, "transfer message failed\n");
 			mutex_unlock(&qspi->list_lock);
 			return -EINVAL;
 		}
 
-		m->actual_length += t->len;
+		m->actual_length += transfer_len_words * wlen;
+		frame_len_words -= transfer_len_words;
+		if (frame_len_words == 0)
+			break;
 	}
 
 	mutex_unlock(&qspi->list_lock);
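
The accounting introduced by this patch can be modelled outside the kernel. The sketch below is an assumption-laden illustration: struct xfer and frame_actual_length() are hypothetical, and the frame limit is passed in as a parameter rather than taken from the driver. It computes the frame length in words, caps it, and then charges each transfer against the remaining word budget.

```c
#include <stddef.h>

/* Hypothetical, stripped-down view of one transfer. */
struct xfer {
	size_t len;			/* length in bytes */
	unsigned int bits_per_word;	/* 8, 16 or 32 */
};

/*
 * Return the number of bytes that would actually be moved when the
 * whole frame is limited to frame_max_words words.
 */
static size_t frame_actual_length(const struct xfer *x, size_t n,
				  size_t frame_max_words)
{
	size_t frame_words = 0, actual = 0;
	size_t i;

	for (i = 0; i < n; i++)
		frame_words += x[i].len / (x[i].bits_per_word >> 3);
	if (frame_words > frame_max_words)
		frame_words = frame_max_words;

	for (i = 0; i < n && frame_words; i++) {
		size_t wlen = x[i].bits_per_word >> 3;	/* bytes per word */
		size_t words = x[i].len / wlen;

		if (words > frame_words)
			words = frame_words;
		actual += words * wlen;
		frame_words -= words;
	}
	return actual;
}
```

For a 4-byte command sent as 8-bit words followed by 16384 bytes sent as 32-bit words, a 4096-word limit leaves room for 4092 of the 4096 data words, so the reported length is 4 + 4092 * 4 bytes rather than the full 16388.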
[PATCH 4.4 51/73] spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jarkko Nikula

commit 66ec246eb9982e7eb8e15e1fc55f543230310dd0 upstream.

Certain Intel Sunrisepoint PCH variants report zero chip selects in SPI
capabilities register even though they have one per port. Detection in
pxa2xx_spi_probe() sets master->num_chipselect to 0 leading to -EINVAL
from spi_register_master() where chip select count is validated.

Fix this by not using SPI capabilities register on Sunrisepoint. They
don't have more than one chip select so use the default value 1 instead
of detection.

Fixes: 8b136baa5892 ("spi: pxa2xx: Detect number of enabled Intel LPSS SPI chip select signals")
Signed-off-by: Jarkko Nikula
Signed-off-by: Mark Brown
Signed-off-by: Greg Kroah-Hartman

---
 drivers/spi/spi-pxa2xx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/spi/spi-pxa2xx.c
+++ b/drivers/spi/spi-pxa2xx.c
@@ -111,7 +111,7 @@ static const struct lpss_config lpss_pla
 		.reg_general = -1,
 		.reg_ssp = 0x20,
 		.reg_cs_ctrl = 0x24,
-		.reg_capabilities = 0xfc,
+		.reg_capabilities = -1,
 		.rx_threshold = 1,
 		.tx_threshold_lo = 32,
 		.tx_threshold_hi = 56,
[PATCH 4.4 57/73] vfs: add vfs_select_inode() helper
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi

commit 54d5ca871e72f2bb172ec9323497f01cd5091ec7 upstream.

Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman

---
 fs/open.c              |   12 ++++--------
 include/linux/dcache.h |   12 ++++++++++++
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/fs/open.c
+++ b/fs/open.c
@@ -840,16 +840,12 @@ EXPORT_SYMBOL(file_path);
 int vfs_open(const struct path *path, struct file *file,
 	     const struct cred *cred)
 {
-	struct dentry *dentry = path->dentry;
-	struct inode *inode = dentry->d_inode;
+	struct inode *inode = vfs_select_inode(path->dentry, file->f_flags);
 
-	file->f_path = *path;
-	if (dentry->d_flags & DCACHE_OP_SELECT_INODE) {
-		inode = dentry->d_op->d_select_inode(dentry, file->f_flags);
-		if (IS_ERR(inode))
-			return PTR_ERR(inode);
-	}
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
 
+	file->f_path = *path;
 	return do_dentry_open(file, inode, NULL, cred);
 }
 
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -592,4 +592,16 @@ static inline struct dentry *d_real(stru
 	return dentry;
 }
 
+static inline struct inode *vfs_select_inode(struct dentry *dentry,
+					      unsigned open_flags)
+{
+	struct inode *inode = d_inode(dentry);
+
+	if (inode && unlikely(dentry->d_flags & DCACHE_OP_SELECT_INODE))
+		inode = dentry->d_op->d_select_inode(dentry, open_flags);
+
+	return inode;
+}
+
+
 #endif /* __LINUX_DCACHE_H */
[PATCH 4.4 59/73] ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Brezillon

commit aab0a4c83ceb344d2327194bf354820e50607af6 upstream.

The memory range assigned to the PMC (Power Management Controller) was
not including the PMC_PCR register which is used to control peripheral
clocks.

This was working fine thanks to the page granularity of ioremap(), but
started to fail when we switched to syscon/regmap, because regmap is
making sure that all accesses are falling into the reserved range.

Signed-off-by: Boris Brezillon
Reported-by: Richard Genoud
Tested-by: Richard Genoud
Fixes: 863a81c3be1d ("clk: at91: make use of syscon to share PMC registers in several drivers")
Signed-off-by: Nicolas Ferre
Signed-off-by: Greg Kroah-Hartman

---
 arch/arm/boot/dts/at91sam9x5.dtsi |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/boot/dts/at91sam9x5.dtsi
+++ b/arch/arm/boot/dts/at91sam9x5.dtsi
@@ -106,7 +106,7 @@
 
 			pmc: pmc@fc00 {
 				compatible = "atmel,at91sam9x5-pmc", "syscon";
-				reg = <0xfc00 0x100>;
+				reg = <0xfc00 0x200>;
 				interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
 				interrupt-controller;
 				#address-cells = <1>;
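
The failure mode here is a simple bounds check: once accesses are validated against the declared region size, a register just past the old 0x100 limit is rejected. A rough sketch of that check follows. It assumes 32-bit registers and assumes PMC_PCR sits at offset 0x10c inside the PMC block; PMC_PCR_OFFSET and reg_in_range() are illustrative names, not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed offset of PMC_PCR inside the PMC block (illustrative). */
#define PMC_PCR_OFFSET	0x10c

/*
 * A 32-bit register at 'offset' is usable only if all four of its
 * bytes fall inside a region of 'region_size' bytes.
 */
static bool reg_in_range(uint32_t offset, uint32_t region_size)
{
	return region_size >= 4 && offset <= region_size - 4;
}
```

Under those assumptions the old size of 0x100 rejects the PMC_PCR access, and the new size of 0x200 accepts it.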
[PATCH 4.4 58/73] vfs: rename: check backing inode being equal
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi

commit 9409e22acdfc9153f88d9b1ed2bd2a5b34d2d3ca upstream.

If a file is renamed to a hardlink of itself POSIX specifies that rename(2)
should do nothing and return success.

This condition is checked in vfs_rename().  However it won't detect hard
links on overlayfs where these are given separate inodes on the overlayfs
layer.

Overlayfs itself detects this condition and returns success without doing
anything, but then vfs_rename() will proceed as if this was a successful
rename (detach_mounts(), d_move()).

The correct thing to do is to detect this condition before even calling
into overlayfs.  This patch does this by calling vfs_select_inode() to get
the underlying inodes.

Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman

---
 fs/namei.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4195,7 +4195,11 @@ int vfs_rename(struct inode *old_dir, st
 	bool new_is_dir = false;
 	unsigned max_links = new_dir->i_sb->s_max_links;
 
-	if (source == target)
+	/*
+	 * Check source == target.
+	 * On overlayfs need to look at underlying inodes.
+	 */
+	if (vfs_select_inode(old_dentry, 0) == vfs_select_inode(new_dentry, 0))
 		return 0;
 
 	error = may_delete(old_dir, old_dentry, is_dir);
[PATCH 4.4 61/73] regulator: s2mps11: Fix invalid selector mask and voltages for buck9
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Krzysztof Kozlowski

commit 3b672623079bb3e5685b8549e514f2dfaa564406 upstream.

The buck9 regulator of S2MPS11 PMIC had incorrect vsel_mask (0xff
instead of 0x1f) thus reading entire register as buck9's voltage. This
effectively caused regulator core to interpret values as higher voltages
than they were and then to set real voltage much lower than intended.

The buck9 provides power to other regulators, including LDO13
and LDO19 which supply the MMC2 (SD card).  On Odroid XU3/XU4 the lower
voltage caused SD card detection errors:
	mmc1: card never left busy state
	mmc1: error -110 whilst initialising SD card

During driver probe the regulator core was checking whether initial
voltage matches the constraints.  With incorrect vsel_mask of 0xff and
default value of 0x50, the core interpreted this as 5 V which is outside
of constraints (3-3.775 V).  Then the regulator core was adjusting the
voltage to match the constraints.  With incorrect vsel_mask this new
voltage mapped to a very low voltage in the driver.

Signed-off-by: Krzysztof Kozlowski
Reviewed-by: Javier Martinez Canillas
Tested-by: Javier Martinez Canillas
Signed-off-by: Mark Brown
Signed-off-by: Greg Kroah-Hartman

---
 drivers/regulator/s2mps11.c         |   28 ++++++++++++++++++++++------
 include/linux/mfd/samsung/s2mps11.h |    2 ++
 2 files changed, 24 insertions(+), 6 deletions(-)

--- a/drivers/regulator/s2mps11.c
+++ b/drivers/regulator/s2mps11.c
@@ -305,7 +305,7 @@ static struct regulator_ops s2mps11_buck
 	.enable_mask	= S2MPS11_ENABLE_MASK			\
 }
 
-#define regulator_desc_s2mps11_buck6_10(num, min, step) {	\
+#define regulator_desc_s2mps11_buck67810(num, min, step) {	\
 	.name		= "BUCK"#num,				\
 	.id		= S2MPS11_BUCK##num,			\
 	.ops		= &s2mps11_buck_ops,			\
@@ -321,6 +321,22 @@ static struct regulator_ops s2mps11_buck
 	.enable_mask	= S2MPS11_ENABLE_MASK			\
 }
 
+#define regulator_desc_s2mps11_buck9 {				\
+	.name		= "BUCK9",				\
+	.id		= S2MPS11_BUCK9,			\
+	.ops		= &s2mps11_buck_ops,			\
+	.type		= REGULATOR_VOLTAGE,			\
+	.owner		= THIS_MODULE,				\
+	.min_uV		= MIN_3000_MV,				\
+	.uV_step	= STEP_25_MV,				\
+	.n_voltages	= S2MPS11_BUCK9_N_VOLTAGES,		\
+	.ramp_delay	= S2MPS11_RAMP_DELAY,			\
+	.vsel_reg	= S2MPS11_REG_B9CTRL2,			\
+	.vsel_mask	= S2MPS11_BUCK9_VSEL_MASK,		\
+	.enable_reg	= S2MPS11_REG_B9CTRL1,			\
+	.enable_mask	= S2MPS11_ENABLE_MASK			\
+}
+
 static const struct regulator_desc s2mps11_regulators[] = {
 	regulator_desc_s2mps11_ldo(1, STEP_25_MV),
 	regulator_desc_s2mps11_ldo(2, STEP_50_MV),
@@ -365,11 +381,11 @@ static const struct regulator_desc s2mps
 	regulator_desc_s2mps11_buck1_4(3),
 	regulator_desc_s2mps11_buck1_4(4),
 	regulator_desc_s2mps11_buck5,
-	regulator_desc_s2mps11_buck6_10(6, MIN_600_MV, STEP_6_25_MV),
-	regulator_desc_s2mps11_buck6_10(7, MIN_600_MV, STEP_6_25_MV),
-	regulator_desc_s2mps11_buck6_10(8, MIN_600_MV, STEP_6_25_MV),
-	regulator_desc_s2mps11_buck6_10(9, MIN_3000_MV, STEP_25_MV),
-	regulator_desc_s2mps11_buck6_10(10, MIN_750_MV, STEP_12_5_MV),
+	regulator_desc_s2mps11_buck67810(6, MIN_600_MV, STEP_6_25_MV),
+	regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_6_25_MV),
+	regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_6_25_MV),
+	regulator_desc_s2mps11_buck9,
+	regulator_desc_s2mps11_buck67810(10, MIN_750_MV, STEP_12_5_MV),
 };
 
 static struct regulator_ops s2mps14_reg_ops;
--- a/include/linux/mfd/samsung/s2mps11.h
+++ b/include/linux/mfd/samsung/s2mps11.h
@@ -173,10 +173,12 @@ enum s2mps11_regulators {
 
 #define S2MPS11_LDO_VSEL_MASK	0x3F
 #define S2MPS11_BUCK_VSEL_MASK	0xFF
+#define S2MPS11_BUCK9_VSEL_MASK	0x1F
 #define S2MPS11_ENABLE_MASK	(0x03 << S2MPS11_ENABLE_SHIFT)
 #define S2MPS11_ENABLE_SHIFT	0x06
 #define S2MPS11_LDO_N_VOLTAGES	(S2MPS11_LDO_VSEL_MASK + 1)
 #define S2MPS11_BUCK_N_VOLTAGES (S2MPS11_BUCK_VSEL_MASK + 1)
+#define S2MPS11_BUCK9_N_VOLTAGES (S2MPS11_BUCK9_VSEL_MASK + 1)
 #define S2MPS11_RAMP_DELAY	25000		/* uV/us */
 
 #define S2MPS11_CTRL1_PWRHOLD_MASK	BIT(4)
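
The voltages quoted in the commit message can be reproduced with a linear selector mapping: voltage = minimum + (register & mask) * step. The helper below is only an arithmetic sketch with made-up names (BUCK9_MIN_UV, BUCK9_STEP_UV, sel_to_uv()), using the 3000 mV minimum and 25 mV step from the buck9 descriptor.

```c
#include <stdint.h>

#define BUCK9_MIN_UV	3000000	/* 3000 mV */
#define BUCK9_STEP_UV	25000	/* 25 mV per selector step */

/* Map a raw register value to microvolts through a selector mask. */
static int sel_to_uv(uint8_t reg, uint8_t vsel_mask)
{
	return BUCK9_MIN_UV + (reg & vsel_mask) * BUCK9_STEP_UV;
}
```

With the default register value 0x50, the old 0xff mask yields selector 80 and therefore 5 V, while the 0x1f mask yields selector 16 and 3.4 V. The largest selector under the new mask, 0x1f, gives 3.775 V, which matches the upper constraint mentioned above.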
[PATCH 4.4 63/73] atomic_open(): fix the handling of create_error
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro

commit 10c64cea04d3c75c306b3f990586ffb343b63287 upstream.

* if we have a hashed negative dentry and either CREAT|EXCL on
r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL
with failing may_o_create(), we should fail with EROFS or the
error may_o_create() has returned, but not ENOENT.  Which is what
the current code ends up returning.

* if we have CREAT|TRUNC hitting a regular file on a read-only
filesystem, we can't fail with EROFS here.  At the very least,
not until we'd done follow_managed() - we might have a writable
file (or a device, for that matter) bound on top of that one.
Moreover, the code downstream will see that O_TRUNC and attempt
to grab the write access (*after* following possible mount), so
if we really should fail with EROFS, it will happen.  No need
to do that inside atomic_open().

The real logic is much simpler than what the current code is
trying to do - if we decided to go for simple lookup, ended
up with a negative dentry *and* had create_error set, fail with
create_error.  No matter whether we'd got that negative dentry
from lookup_real() or had found it in dcache.

Acked-by: Miklos Szeredi
Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman

---
 fs/namei.c |   20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2906,22 +2906,10 @@ no_open:
 		dentry = lookup_real(dir, dentry, nd->flags);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
-
-		if (create_error) {
-			int open_flag = op->open_flag;
-
-			error = create_error;
-			if ((open_flag & O_EXCL)) {
-				if (!dentry->d_inode)
-					goto out;
-			} else if (!dentry->d_inode) {
-				goto out;
-			} else if ((open_flag & O_TRUNC) &&
-				   d_is_reg(dentry)) {
-				goto out;
-			}
-			/* will fail later, go on to get the right error */
-		}
+	}
+	if (create_error && !dentry->d_inode) {
+		error = create_error;
+		goto out;
 	}
 looked_up:
 	path->dentry = dentry;
Re: [PATCH v5 02/12] mm: migrate: support non-lru movable page migration
On Mon, May 16, 2016 at 04:17:51PM +0900, Sergey Senozhatsky wrote: > On (05/09/16 11:20), Minchan Kim wrote: > [..] > > +++ b/include/linux/migrate.h > > @@ -32,11 +32,16 @@ extern char *migrate_reason_names[MR_TYPES]; > > > > #ifdef CONFIG_MIGRATION > > > > +extern int PageMovable(struct page *page); > > +extern void __SetPageMovable(struct page *page, struct address_space > > *mapping); > > +extern void __ClearPageMovable(struct page *page); > > extern void putback_movable_pages(struct list_head *l); > > extern int migrate_page(struct address_space *, > > struct page *, struct page *, enum migrate_mode); > > extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t > > free, > > unsigned long private, enum migrate_mode mode, int reason); > > +extern bool isolate_movable_page(struct page *page, isolate_mode_t mode); > > +extern void putback_movable_page(struct page *page); > > > > extern int migrate_prep(void); > > extern int migrate_prep_local(void); > > given that some of Movable users can be built as modules, shouldn't > at least some of those symbols be exported via EXPORT_SYMBOL? Those functions aim for VM compaction so driver shouldn't use it. Only driver should be aware of are __SetPageMovable and __CleraPageMovable. I will export them. Thanks for the review, Sergey! > > -ss
[PATCH 4.4 61/73] regulator: s2mps11: Fix invalid selector mask and voltages for buck9
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Krzysztof Kozlowski commit 3b672623079bb3e5685b8549e514f2dfaa564406 upstream. The buck9 regulator of S2MPS11 PMIC had incorrect vsel_mask (0xff instead of 0x1f) thus reading entire register as buck9's voltage. This effectively caused regulator core to interpret values as higher voltages than they were and then to set real voltage much lower than intended. The buck9 provides power to other regulators, including LDO13 and LDO19 which supply the MMC2 (SD card). On Odroid XU3/XU4 the lower voltage caused SD card detection errors: mmc1: card never left busy state mmc1: error -110 whilst initialising SD card During driver probe the regulator core was checking whether initial voltage matches the constraints. With incorrect vsel_mask of 0xff and default value of 0x50, the core interpreted this as 5 V which is outside of constraints (3-3.775 V). Then the regulator core was adjusting the voltage to match the constraints. With incorrect vsel_mask this new voltage mapped to a very low voltage in the driver.
Signed-off-by: Krzysztof Kozlowski Reviewed-by: Javier Martinez Canillas Tested-by: Javier Martinez Canillas Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/regulator/s2mps11.c | 28 ++-- include/linux/mfd/samsung/s2mps11.h |2 ++ 2 files changed, 24 insertions(+), 6 deletions(-) --- a/drivers/regulator/s2mps11.c +++ b/drivers/regulator/s2mps11.c @@ -305,7 +305,7 @@ static struct regulator_ops s2mps11_buck .enable_mask= S2MPS11_ENABLE_MASK \ } -#define regulator_desc_s2mps11_buck6_10(num, min, step) { \ +#define regulator_desc_s2mps11_buck67810(num, min, step) { \ .name = "BUCK"#num, \ .id = S2MPS11_BUCK##num,\ .ops= &s2mps11_buck_ops,\ @@ -321,6 +321,22 @@ static struct regulator_ops s2mps11_buck .enable_mask= S2MPS11_ENABLE_MASK \ } +#define regulator_desc_s2mps11_buck9 { \ + .name = "BUCK9", \ + .id = S2MPS11_BUCK9,\ + .ops= &s2mps11_buck_ops,\ + .type = REGULATOR_VOLTAGE,\ + .owner = THIS_MODULE, \ + .min_uV = MIN_3000_MV, \ + .uV_step= STEP_25_MV, \ + .n_voltages = S2MPS11_BUCK9_N_VOLTAGES, \ + .ramp_delay = S2MPS11_RAMP_DELAY, \ + .vsel_reg = S2MPS11_REG_B9CTRL2, \ + .vsel_mask = S2MPS11_BUCK9_VSEL_MASK, \ + .enable_reg = S2MPS11_REG_B9CTRL1, \ + .enable_mask= S2MPS11_ENABLE_MASK \ +} + static const struct regulator_desc s2mps11_regulators[] = { regulator_desc_s2mps11_ldo(1, STEP_25_MV), regulator_desc_s2mps11_ldo(2, STEP_50_MV), @@ -365,11 +381,11 @@ static const struct regulator_desc s2mps regulator_desc_s2mps11_buck1_4(3), regulator_desc_s2mps11_buck1_4(4), regulator_desc_s2mps11_buck5, - regulator_desc_s2mps11_buck6_10(6, MIN_600_MV, STEP_6_25_MV), - regulator_desc_s2mps11_buck6_10(7, MIN_600_MV, STEP_6_25_MV), - regulator_desc_s2mps11_buck6_10(8, MIN_600_MV, STEP_6_25_MV), - regulator_desc_s2mps11_buck6_10(9, MIN_3000_MV, STEP_25_MV), - regulator_desc_s2mps11_buck6_10(10, MIN_750_MV, STEP_12_5_MV), + regulator_desc_s2mps11_buck67810(6, MIN_600_MV, STEP_6_25_MV), + regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_6_25_MV), +
regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_6_25_MV), + regulator_desc_s2mps11_buck9, + regulator_desc_s2mps11_buck67810(10, MIN_750_MV, STEP_12_5_MV), }; static struct regulator_ops s2mps14_reg_ops; --- a/include/linux/mfd/samsung/s2mps11.h +++ b/include/linux/mfd/samsung/s2mps11.h @@ -173,10 +173,12 @@ enum s2mps11_regulators { #define S2MPS11_LDO_VSEL_MASK 0x3F #define S2MPS11_BUCK_VSEL_MASK 0xFF +#define S2MPS11_BUCK9_VSEL_MASK0x1F #define S2MPS11_ENABLE_MASK(0x03 << S2MPS11_ENABLE_SHIFT) #define S2MPS11_ENABLE_SHIFT 0x06 #define S2MPS11_LDO_N_VOLTAGES (S2MPS11_LDO_VSEL_MASK + 1) #define S2MPS11_BUCK_N_VOLTAGES (S2MPS11_BUCK_VSEL_MASK + 1) +#define S2MPS11_BUCK9_N_VOLTAGES (S2MPS11_BUCK9_VSEL_MASK + 1) #define S2MPS11_RAMP_DELAY 25000 /* uV/us */ #define S2MPS11_CTRL1_PWRHOLD_MASK BIT(4)
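The arithmetic in the buck9 commit message can be checked with a small user-space sketch. It assumes the usual linear mapping (minimum voltage plus selector times step) with the 3.0 V minimum and 25 mV step named in the patch; the helper raw_to_uV() and the BUCK9_* constants are made up for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the commit message: buck9 starts at 3.0 V with 25 mV steps. */
#define BUCK9_MIN_UV   3000000
#define BUCK9_STEP_UV    25000

/* Hypothetical helper: decode a raw register value into microvolts. */
static int raw_to_uV(uint8_t raw, uint8_t vsel_mask)
{
	unsigned int sel;

	assert(vsel_mask != 0);
	sel = raw & vsel_mask;          /* keep only the selector bits */

	return BUCK9_MIN_UV + (int)sel * BUCK9_STEP_UV;
}
```

With the default register value 0x50, the old mask 0xff gives selector 80 and therefore 5.0 V, while the corrected mask 0x1f gives selector 16 and 3.4 V, which lies inside the 3 to 3.775 V constraints.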
[PATCH 4.4 63/73] atomic_open(): fix the handling of create_error
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Al Viro commit 10c64cea04d3c75c306b3f990586ffb343b63287 upstream. * if we have a hashed negative dentry and either CREAT|EXCL on r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL with failing may_o_create(), we should fail with EROFS or the error may_o_create() has returned, but not ENOENT. Which is what the current code ends up returning. * if we have CREAT|TRUNC hitting a regular file on a read-only filesystem, we can't fail with EROFS here. At the very least, not until we'd done follow_managed() - we might have a writable file (or a device, for that matter) bound on top of that one. Moreover, the code downstream will see that O_TRUNC and attempt to grab the write access (*after* following possible mount), so if we really should fail with EROFS, it will happen. No need to do that inside atomic_open(). The real logic is much simpler than what the current code is trying to do - if we decided to go for simple lookup, ended up with a negative dentry *and* had create_error set, fail with create_error. No matter whether we'd got that negative dentry from lookup_real() or had found it in dcache. Acked-by: Miklos Szeredi Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman --- fs/namei.c | 20 1 file changed, 4 insertions(+), 16 deletions(-) --- a/fs/namei.c +++ b/fs/namei.c @@ -2906,22 +2906,10 @@ no_open: dentry = lookup_real(dir, dentry, nd->flags); if (IS_ERR(dentry)) return PTR_ERR(dentry); - - if (create_error) { - int open_flag = op->open_flag; - - error = create_error; - if ((open_flag & O_EXCL)) { - if (!dentry->d_inode) - goto out; - } else if (!dentry->d_inode) { - goto out; - } else if ((open_flag & O_TRUNC) && - d_is_reg(dentry)) { - goto out; - } - /* will fail later, go on to get the right error */ - } + } + if (create_error && !dentry->d_inode) { + error = create_error; + goto out; } looked_up: path->dentry = dentry;
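The simplified rule from the atomic_open() patch can be modelled outside the kernel. This is only a sketch of the decision, assuming the kernel convention that create_error is 0 or a negative errno; check_after_lookup() is a hypothetical name and nothing here touches real dentries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check after the simple lookup: returns the
 * error to fail with, or 0 to carry on to the code downstream.
 */
static int check_after_lookup(int create_error, bool dentry_is_negative)
{
	assert(create_error <= 0);      /* 0 or a negative errno */

	if (create_error && dentry_is_negative)
		return create_error;    /* e.g. -EROFS instead of ENOENT */
	return 0;
}
```

A positive dentry always falls through, so the CREAT|TRUNC case on a read-only filesystem is left to the later write-access check, as the commit message describes.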
Re: [PATCH v5 02/12] mm: migrate: support non-lru movable page migration
On Mon, May 16, 2016 at 04:17:51PM +0900, Sergey Senozhatsky wrote: > On (05/09/16 11:20), Minchan Kim wrote: > [..] > > +++ b/include/linux/migrate.h > > @@ -32,11 +32,16 @@ extern char *migrate_reason_names[MR_TYPES]; > > > > #ifdef CONFIG_MIGRATION > > > > +extern int PageMovable(struct page *page); > > +extern void __SetPageMovable(struct page *page, struct address_space > > *mapping); > > +extern void __ClearPageMovable(struct page *page); > > extern void putback_movable_pages(struct list_head *l); > > extern int migrate_page(struct address_space *, > > struct page *, struct page *, enum migrate_mode); > > extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t > > free, > > unsigned long private, enum migrate_mode mode, int reason); > > +extern bool isolate_movable_page(struct page *page, isolate_mode_t mode); > > +extern void putback_movable_page(struct page *page); > > > > extern int migrate_prep(void); > > extern int migrate_prep_local(void); > > given that some of Movable users can be built as modules, shouldn't > at least some of those symbols be exported via EXPORT_SYMBOL? Those functions are meant for VM compaction, so drivers shouldn't use them. The only ones a driver should be aware of are __SetPageMovable and __ClearPageMovable. I will export them. Thanks for the review, Sergey! > > -ss
[PATCH 4.4 66/73] get_rock_ridge_filename(): handle malformed NM entries
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Al Viro commit 99d825822eade8d827a1817357cbf3f889a552d6 upstream. Payloads of NM entries are not supposed to contain NUL. When we run into such, only the part prior to the first NUL goes into the concatenation (i.e. the directory entry name being encoded by a bunch of NM entries). We do stop when the amount collected so far + the claimed amount in the current NM entry exceed 254. So far, so good, but what we return as the total length is the sum of *claimed* sizes, not the actual amount collected. And that can grow pretty large - not unlimited, since you'd need to put CE entries in between to be able to get more than the maximum that could be contained in one isofs directory entry / continuation chunk and we stop once we've encountered 32 CEs, but you can get about 8Kb easily. And that's what will be passed to readdir callback as the name length. 8Kb __copy_to_user() from a buffer allocated by __get_free_page() Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman --- fs/isofs/rock.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) --- a/fs/isofs/rock.c +++ b/fs/isofs/rock.c @@ -203,6 +203,8 @@ int get_rock_ridge_filename(struct iso_d int retnamlen = 0; int truncate = 0; int ret = 0; + char *p; + int len; if (!ISOFS_SB(inode->i_sb)->s_rock) return 0; @@ -267,12 +269,17 @@ repeat: rr->u.NM.flags); break; } - if ((strlen(retname) + rr->len - 5) >= 254) { + len = rr->len - 5; + if (retnamlen + len >= 254) { truncate = 1; break; } - strncat(retname, rr->u.NM.name, rr->len - 5); - retnamlen += rr->len - 5; + p = memchr(rr->u.NM.name, '\0', len); + if (unlikely(p)) + len = p - rr->u.NM.name; + memcpy(retname + retnamlen, rr->u.NM.name, len); + retnamlen += len; + retname[retnamlen] = '\0'; break; case SIG('R', 'E'): kfree(rs.buffer);
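The length accounting that the Rock Ridge patch introduces can be tried in user space. The sketch below mirrors the memchr()/memcpy() sequence from the diff; append_nm() and NAME_LIMIT are hypothetical names, and the caller is assumed to supply a buffer of at least NAME_LIMIT bytes.

```c
#include <assert.h>
#include <string.h>

#define NAME_LIMIT 254   /* same limit the patch checks against */

/*
 * Hypothetical helper: append one NM payload of claimed length len to name,
 * copying only the bytes before the first NUL. Returns the new name length,
 * or -1 when the name would have to be truncated.
 */
static int append_nm(char *name, int namelen, const char *payload, int len)
{
	const char *p;

	assert(namelen >= 0 && len >= 0);
	if (namelen + len >= NAME_LIMIT)
		return -1;

	p = memchr(payload, '\0', (size_t)len);
	if (p)
		len = (int)(p - payload);   /* count what is really copied */

	memcpy(name + namelen, payload, (size_t)len);
	namelen += len;
	name[namelen] = '\0';
	return namelen;
}
```

The returned length grows by the number of bytes actually copied, so a payload that claims five bytes but contains an embedded NUL after two adds only two.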
[PATCH 4.4 62/73] regulator: axp20x: Fix axp22x ldo_io voltage ranges
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Hans de Goede commit a2262e5a12e05389ab4c7fc5cf60016b041dd8dc upstream. The minimum voltage of 1800mV is a copy and paste error from the axp20x regulator info. The correct minimum voltage for the ldo_io regulators on the axp22x is 700mV. Fixes: 1b82b4e4f954 ("regulator: axp20x: Add support for AXP22X regulators") Signed-off-by: Hans de Goede Acked-by: Chen-Yu Tsai Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/regulator/axp20x-regulator.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/regulator/axp20x-regulator.c +++ b/drivers/regulator/axp20x-regulator.c @@ -221,10 +221,10 @@ static const struct regulator_desc axp22 AXP22X_ELDO2_V_OUT, 0x1f, AXP22X_PWR_OUT_CTRL2, BIT(1)), AXP_DESC(AXP22X, ELDO3, "eldo3", "eldoin", 700, 3300, 100, AXP22X_ELDO3_V_OUT, 0x1f, AXP22X_PWR_OUT_CTRL2, BIT(2)), - AXP_DESC_IO(AXP22X, LDO_IO0, "ldo_io0", "ips", 1800, 3300, 100, + AXP_DESC_IO(AXP22X, LDO_IO0, "ldo_io0", "ips", 700, 3300, 100, AXP22X_LDO_IO0_V_OUT, 0x1f, AXP20X_GPIO0_CTRL, 0x07, AXP22X_IO_ENABLED, AXP22X_IO_DISABLED), - AXP_DESC_IO(AXP22X, LDO_IO1, "ldo_io1", "ips", 1800, 3300, 100, + AXP_DESC_IO(AXP22X, LDO_IO1, "ldo_io1", "ips", 700, 3300, 100, AXP22X_LDO_IO1_V_OUT, 0x1f, AXP20X_GPIO1_CTRL, 0x07, AXP22X_IO_ENABLED, AXP22X_IO_DISABLED), AXP_DESC_FIXED(AXP22X, RTC_LDO, "rtc_ldo", "ips", 3000),
[PATCH 4.4 67/73] Input: max8997-haptic - fix NULL pointer dereference
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Marek Szyprowski commit 6ae645d5fa385f3787bf1723639cd907fe5865e7 upstream. NULL pointer dereference happens when booting with DTB because the platform data for haptic device is not set in supplied data from parent MFD device. The MFD device creates only platform data (from Device Tree) for itself, not for haptic child. Unable to handle kernel NULL pointer dereference at virtual address 009c pgd = c0004000 [009c] *pgd= Internal error: Oops: 5 [#1] PREEMPT SMP ARM (max8997_haptic_probe) from [] (platform_drv_probe+0x4c/0xb0) (platform_drv_probe) from [] (driver_probe_device+0x214/0x2c0) (driver_probe_device) from [] (__driver_attach+0xac/0xb0) (__driver_attach) from [] (bus_for_each_dev+0x68/0x9c) (bus_for_each_dev) from [] (bus_add_driver+0x1a0/0x218) (bus_add_driver) from [] (driver_register+0x78/0xf8) (driver_register) from [] (do_one_initcall+0x90/0x1d8) (do_one_initcall) from [] (kernel_init_freeable+0x15c/0x1fc) (kernel_init_freeable) from [] (kernel_init+0x8/0x114) (kernel_init) from [] (ret_from_fork+0x14/0x3c) Signed-off-by: Marek Szyprowski Fixes: 104594b01ce7 ("Input: add driver support for MAX8997-haptic") [k.kozlowski: Write commit message, add CC-stable] Signed-off-by: Krzysztof Kozlowski Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman --- drivers/input/misc/max8997_haptic.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) --- a/drivers/input/misc/max8997_haptic.c +++ b/drivers/input/misc/max8997_haptic.c @@ -255,12 +255,14 @@ static int max8997_haptic_probe(struct p struct max8997_dev *iodev = dev_get_drvdata(pdev->dev.parent); const struct max8997_platform_data *pdata = dev_get_platdata(iodev->dev); - const struct max8997_haptic_platform_data *haptic_pdata = - pdata->haptic_pdata; + const struct max8997_haptic_platform_data *haptic_pdata = NULL; struct max8997_haptic *chip; struct input_dev *input_dev; int error; + if (pdata) + haptic_pdata = pdata->haptic_pdata; + if (!haptic_pdata) { dev_err(&pdev->dev, "no haptic platform data\n"); return -EINVAL;
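The guard added to the probe function follows a common pattern: start from NULL, dereference the parent data only if it exists, then reject a missing child. A minimal user-space sketch, where the two structs and get_haptic_pdata() are hypothetical stand-ins rather than the driver's real definitions:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the parent and child platform data. */
struct haptic_pdata { int placeholder; };
struct parent_pdata { const struct haptic_pdata *haptic_pdata; };

/* Returns 0 and stores the child data, or -EINVAL when none is available. */
static int get_haptic_pdata(const struct parent_pdata *pdata,
			    const struct haptic_pdata **out)
{
	const struct haptic_pdata *haptic = NULL;

	assert(out != NULL);
	if (pdata)                      /* parent data may be absent entirely */
		haptic = pdata->haptic_pdata;
	if (!haptic)
		return -EINVAL;

	*out = haptic;
	return 0;
}
```

Both failure modes, no parent data at all and parent data without a haptic entry, end in the same -EINVAL path instead of a NULL dereference.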
[PATCH 4.4 69/73] drm/radeon: fix PLL sharing on DCE6.1 (v2)
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Lucas Stach commit e3c00d87845ab375f90fa6e10a5e72a3a5778cd3 upstream. On DCE6.1 PPLL2 is exclusively available to UNIPHYA, so it should not be taken into consideration when looking for an already enabled PLL to be shared with other outputs. This fixes the broken VGA port (TRAVIS DP->VGA bridge) on my Richland based laptop, where the internal display is connected to UNIPHYA through a TRAVIS DP->LVDS bridge. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=78987 v2: agd: add check in radeon_get_shared_nondp_ppll as well, drop extra parameter. Signed-off-by: Lucas Stach Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/radeon/atombios_crtc.c | 10 ++ 1 file changed, 10 insertions(+) --- a/drivers/gpu/drm/radeon/atombios_crtc.c +++ b/drivers/gpu/drm/radeon/atombios_crtc.c @@ -1739,6 +1739,7 @@ static u32 radeon_get_pll_use_mask(struc static int radeon_get_shared_dp_ppll(struct drm_crtc *crtc) { struct drm_device *dev = crtc->dev; + struct radeon_device *rdev = dev->dev_private; struct drm_crtc *test_crtc; struct radeon_crtc *test_radeon_crtc; @@ -1748,6 +1749,10 @@ static int radeon_get_shared_dp_ppll(str test_radeon_crtc = to_radeon_crtc(test_crtc); if (test_radeon_crtc->encoder && ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) { + /* PPLL2 is exclusive to UNIPHYA on DCE61 */ + if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) && + test_radeon_crtc->pll_id == ATOM_PPLL2) + continue; /* for DP use the same PLL for all */ if (test_radeon_crtc->pll_id != ATOM_PPLL_INVALID) return test_radeon_crtc->pll_id; @@ -1769,6 +1774,7 @@ static int radeon_get_shared_nondp_ppll( { struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc); struct drm_device *dev = crtc->dev; + struct radeon_device *rdev = dev->dev_private; struct drm_crtc *test_crtc; struct radeon_crtc *test_radeon_crtc; u32 adjusted_clock, test_adjusted_clock; @@ -1784,6 +1790,10 
@@ static int radeon_get_shared_nondp_ppll( test_radeon_crtc = to_radeon_crtc(test_crtc); if (test_radeon_crtc->encoder && !ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) { + /* PPLL2 is exclusive to UNIPHYA on DCE61 */ + if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) && + test_radeon_crtc->pll_id == ATOM_PPLL2) + continue; /* check if we are already driving this connector with another crtc */ if (test_radeon_crtc->connector == radeon_crtc->connector) { /* if we are, return that pll */
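The effect of the added "continue" in the PLL search can be shown with a reduced model. This sketch covers only the DP case and assumes a flat array of CRTCs; struct crtc_model, the enum values and find_shared_dp_pll() are invented for illustration and do not match the radeon driver's types or the ATOM_PPLL* constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical PLL ids, not the driver's ATOM_PPLL* values. */
enum pll_id { PLL_INVALID = -1, PLL1 = 0, PLL2 = 1 };

struct crtc_model { bool is_dp; enum pll_id pll; };

/* Look for an already enabled PLL on a DP CRTC that may be shared. */
static enum pll_id find_shared_dp_pll(const struct crtc_model *crtcs, size_t n,
				      bool pll2_is_exclusive)
{
	assert(crtcs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!crtcs[i].is_dp)
			continue;
		if (pll2_is_exclusive && crtcs[i].pll == PLL2)
			continue;       /* reserved for one output, never share */
		if (crtcs[i].pll != PLL_INVALID)
			return crtcs[i].pll;
	}
	return PLL_INVALID;
}
```

With the exclusivity flag set, a CRTC already running on PLL2 is skipped and the search moves on to the next candidate; without it, PLL2 would be handed out a second time.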
[PATCH 4.4 70/73] drm/i915: Bail out of pipe config compute loop on LPT
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Daniel Vetter commit 2700818ac9f935d8590715eecd7e8cadbca552b6 upstream. LPT is pch, so might run into the fdi bandwidth constraint (especially since it has only 2 lanes). But right now we just force pipe_bpp back to 24, resulting in a nice loop (which we bail out with a loud WARN_ON). Fix this. Cc: Chris Wilson Cc: Maarten Lankhorst References: https://bugs.freedesktop.org/show_bug.cgi?id=93477 Signed-off-by: Daniel Vetter Tested-by: Chris Wilson Signed-off-by: Maarten Lankhorst Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/1462264381-7573-1-git-send-email-daniel.vet...@ffwll.ch (cherry picked from commit f58a1acc7e4a1f37d26124ce4c875c647fbcc61f) Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/i915/intel_crt.c |8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) --- a/drivers/gpu/drm/i915/intel_crt.c +++ b/drivers/gpu/drm/i915/intel_crt.c @@ -248,8 +248,14 @@ static bool intel_crt_compute_config(str pipe_config->has_pch_encoder = true; /* LPT FDI RX only supports 8bpc. */ - if (HAS_PCH_LPT(dev)) + if (HAS_PCH_LPT(dev)) { + if (pipe_config->bw_constrained && pipe_config->pipe_bpp < 24) { + DRM_DEBUG_KMS("LPT only supports 24bpp\n"); + return false; + } + pipe_config->pipe_bpp = 24; + } /* FDI must always be 2.7 GHz */ if (HAS_DDI(dev)) {
[PATCH 4.4 68/73] Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Mauro Carvalho Chehab commit 93f0750dcdaed083d6209b01e952e98ca730db66 upstream. This patch causes a Kernel panic when called on a DVB driver. This was also reported by David R : May 7 14:47:35 server kernel: [ 501.247123] BUG: unable to handle kernel NULL pointer dereference at 0004 May 7 14:47:35 server kernel: [ 501.247239] IP: [] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.247354] PGD cae6f067 PUD ca99c067 PMD 0 May 7 14:47:35 server kernel: [ 501.247426] Oops: [#1] SMP May 7 14:47:35 server kernel: [ 501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport May 7 14:47:35 server kernel: [ 501.249564] CPU: 1 PID: 6889 Comm: 
vb2-cx88[0] Not tainted 4.5.3 #3 May 7 14:47:35 server kernel: [ 501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 021107/08/2009 May 7 14:47:35 server kernel: [ 501.249767] task: 8800aebf3600 ti: 8801e07a task.ti: 8801e07a May 7 14:47:35 server kernel: [ 501.249861] RIP: 0010:[] [] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.250002] RSP: 0018:8801e07a3de8 EFLAGS: 00010086 May 7 14:47:35 server kernel: [ 501.250071] RAX: 0283 RBX: 880210dc5000 RCX: 0283 May 7 14:47:35 server kernel: [ 501.250161] RDX: a0222cf0 RSI: RDI: 880210dc5014 May 7 14:47:35 server kernel: [ 501.250251] RBP: 8801e07a3df8 R08: 8801e07a R09: May 7 14:47:35 server kernel: [ 501.250348] R10: R11: 0001 R12: 8800cda2a9d8 May 7 14:47:35 server kernel: [ 501.250438] R13: 880210dc51b8 R14: R15: 8800cda2a828 May 7 14:47:35 server kernel: [ 501.250528] FS: 7f5b77fff700() GS:88021fc4() knlGS:adaffb40 May 7 14:47:35 server kernel: [ 501.250631] CS: 0010 DS: ES: CR0: 8005003b May 7 14:47:35 server kernel: [ 501.250704] CR2: 0004 CR3: ca19d000 CR4: 06e0 May 7 14:47:35 server kernel: [ 501.250794] Stack: May 7 14:47:35 server kernel: [ 501.250822] 8801e07a3df8 a0222cfd 8801e07a3e70 a0236beb May 7 14:47:35 server kernel: [ 501.250937] 0283 8801e07a3e94 May 7 14:47:35 server kernel: [ 501.251051] 8800aebf3600 8108d8e0 8801e07a3e38 8801e07a3e38 May 7 14:47:35 server kernel: [ 501.251165] Call Trace: May 7 14:47:35 server kernel: [ 501.251200] [] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.251306] [] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251398] [] ? prepare_to_wait_event+0x100/0x100 May 7 14:47:35 server kernel: [ 501.251482] [] vb2_thread+0x1cb/0x220 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251569] [] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251662] [] ? 
vb2_core_qbuf+0x230/0x230 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.255982] [] kthread+0xc4/0xe0 May 7 14:47:35 server kernel: [ 501.260292] [] ? kthread_park+0x50/0x50 May 7 14:47:35 server kernel: [ 501.264615] [] ret_from_fork+0x3f/0x70 May 7 14:47:35 server kernel: [ 501.268962] [] ? kthread_park+0x50/0x50 May 7 14:47:35 server kernel: [ 501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00
[PATCH 4.4 73/73] nf_conntrack: avoid kernel pointer value leak in slab name
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds

commit 31b0b385f69d8d5491a4bca288e25e63f1d945d0 upstream.

The slab name ends up being visible in the directory structure under
/sys, and even if you don't have access rights to the file you can see
the filenames.

Just use a 64-bit counter instead of the pointer to the 'net' structure
to generate a unique name.

This code will go away in 4.7 when the conntrack code moves to a single
kmemcache, but this is the backportable simple solution to avoiding
leaking kernel pointers to user space.

Fixes: 5b3501faa874 ("netfilter: nf_conntrack: per netns nf_conntrack_cachep")
Signed-off-by: Linus Torvalds
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

---
 net/netfilter/nf_conntrack_core.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1757,6 +1757,7 @@ void nf_conntrack_init_end(void)

 int nf_conntrack_init_net(struct net *net)
 {
+	static atomic64_t unique_id;
 	int ret = -ENOMEM;
 	int cpu;

@@ -1779,7 +1780,8 @@ int nf_conntrack_init_net(struct net *ne
 	if (!net->ct.stat)
 		goto err_pcpu_lists;

-	net->ct.slabname = kasprintf(GFP_KERNEL, "nf_conntrack_%p", net);
+	net->ct.slabname = kasprintf(GFP_KERNEL, "nf_conntrack_%llu",
+				(u64)atomic64_inc_return(&unique_id));
 	if (!net->ct.slabname)
 		goto err_slabname;
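For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of the naming scheme. It is not the kernel code: make_slab_name() and the C11 atomic counter are hypothetical stand-ins, assumed here in place of kasprintf() and atomic64_inc_return(). It only illustrates that a monotonically increasing counter gives unique names without putting an address into the string.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's "static atomic64_t unique_id". */
static atomic_uint_fast64_t unique_id;

/*
 * Format "nf_conntrack_<n>" into buf, where <n> comes from a shared
 * counter rather than from a pointer value. Returns what snprintf()
 * returns, i.e. the length of the full name.
 */
int make_slab_name(char *buf, size_t len)
{
	uint64_t id;

	assert(buf != NULL && len > 0);
	/* fetch_add returns the old value; +1 mirrors "inc and return" */
	id = (uint64_t)atomic_fetch_add(&unique_id, 1) + 1;
	return snprintf(buf, len, "nf_conntrack_%" PRIu64, id);
}
```

Two successive calls in a fresh process would be expected to produce different names, which is all the slab cache needs from the suffix.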
[PATCH 3.14 14/17] VSOCK: do not disconnect socket when peer has shutdown SEND only
3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ian Campbell

[ Upstream commit dedc58e067d8c379a15a8a183c5db318201295bb ]

The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.

Looking at other socket family's stream_recvmsg callbacks doing a
shutdown here does not seem to be the norm and removing it does not seem
to have had any adverse effects that I can see.

I'm using Stefan's RFC virtio transport patches, I'm unsure of the
impact on the vmci transport.

Signed-off-by: Ian Campbell
Cc: "David S. Miller"
Cc: Stefan Hajnoczi
Cc: Claudio Imbrenda
Cc: Andy King
Cc: Dmitry Torokhov
Cc: Jorgen Hansen
Cc: Adit Ranadive
Cc: net...@vger.kernel.org
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

---
 net/vmw_vsock/af_vsock.c |   21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1796,27 +1796,8 @@ vsock_stream_recvmsg(struct kiocb *kiocb
 	else if (sk->sk_shutdown & RCV_SHUTDOWN)
 		err = 0;

-	if (copied > 0) {
-		/* We only do these additional bookkeeping/notification steps
-		 * if we actually copied something out of the queue pair
-		 * instead of just peeking ahead.
-		 */
-
-		if (!(flags & MSG_PEEK)) {
-			/* If the other side has shutdown for sending and there
-			 * is nothing more to read, then modify the socket
-			 * state.
-			 */
-			if (vsk->peer_shutdown & SEND_SHUTDOWN) {
-				if (vsock_stream_has_data(vsk) <= 0) {
-					sk->sk_state = SS_UNCONNECTED;
-					sock_set_flag(sk, SOCK_DONE);
-					sk->sk_state_change(sk);
-				}
-			}
-		}
+	if (copied > 0)
 		err = copied;
-	}

 out_wait:
 	finish_wait(sk_sleep(sk), &wait);
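The request/SHUT_WR/reply pattern the changelog describes can be sketched from userspace as follows. This is only an illustration under assumptions: it uses an AF_UNIX socketpair() as a stand-in for AF_VSOCK (no vsock transport is needed to run it), and half_close_roundtrip() is a hypothetical helper name. It shows the behaviour a client doing SHUT_WR relies on: the side that sees EOF can still send a reply.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * The "client" end sends a request and half-closes; the "server" end
 * reads until EOF and then replies. Returns the number of reply bytes
 * the client received, or -1 on error.
 */
ssize_t half_close_roundtrip(char *reply, size_t len)
{
	int sv[2];
	char req[16];
	ssize_t n = -1;

	assert(reply != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* client: send the request, then shut down the sending side only */
	if (write(sv[0], "ping", 4) != 4 || shutdown(sv[0], SHUT_WR) < 0)
		goto out;

	/* server: drain the request; read() returns 0 once EOF is seen */
	while ((n = read(sv[1], req, sizeof(req))) > 0)
		;
	if (n < 0)
		goto out;

	/* server: the other direction is still open, so reply */
	if (write(sv[1], "pong", 4) != 4) {
		n = -1;
		goto out;
	}

	/* client: receiving still works after its own SHUT_WR */
	n = read(sv[0], reply, len);
out:
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

With the removed block in place, the vsock receive path would mark the socket SS_UNCONNECTED at the point where the server above sees EOF, which is the step that broke the reply.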
[PATCH 3.14 07/17] ARM: OMAP3: Fix booting with thumb2 kernel
3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tony Lindgren

commit d8a50941c91a68da202aaa96a3dacd471ea9c693 upstream.

We get a NULL pointer dereference on omap3 for thumb2 compiled kernels:

Internal error: Oops: 8005 [#1] SMP THUMB2
...
[] (_raw_spin_unlock_irqrestore) from [] (omap3_enter_idle_bm+0xc5/0x178)
[] (omap3_enter_idle_bm) from [] (cpuidle_enter_state+0x77/0x27c)
[] (cpuidle_enter_state) from [] (cpu_startup_entry+0x155/0x23c)
[] (cpu_startup_entry) from [] (start_kernel+0x32f/0x338)
[] (start_kernel) from [<8000807f>] (0x8000807f)

The power management related assembly on omaps needs to interact with
ARM mode bootrom code, so we need to keep most of the related assembly
in ARM mode.

Turns out this error is because of missing ENDPROC for assembly code
as suggested by Stephen Boyd. Let's fix the problem by adding ENDPROC
in two places to sleep34xx.S.

Let's also remove the now duplicate custom code for mode switching.
This has been unnecessary since commit 6ebbf2ce437b ("ARM: convert
all "mov.* pc, reg" to "bx reg" for ARMv6+").

And let's also remove the comments about local variables, they are
now just confusing after the ENDPROC.

The reason why ENDPROC makes a difference is it sets .type and then
the compiler knows what to do with the thumb bit as explained at:

https://wiki.ubuntu.com/ARM/Thumb2PortingHowto

Reported-by: Kevin Hilman
Tested-by: Kevin Hilman
Signed-off-by: Tony Lindgren
Signed-off-by: Greg Kroah-Hartman

---
 arch/arm/mach-omap2/sleep34xx.S |   22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

--- a/arch/arm/mach-omap2/sleep34xx.S
+++ b/arch/arm/mach-omap2/sleep34xx.S
@@ -203,23 +203,8 @@ save_context_wfi:
 	 */
 	ldr	r1, kernel_flush
 	blx	r1
-	/*
-	 * The kernel doesn't interwork: v7_flush_dcache_all in particluar will
-	 * always return in Thumb state when CONFIG_THUMB2_KERNEL is enabled.
-	 * This sequence switches back to ARM.  Note that .align may insert a
-	 * nop: bx pc needs to be word-aligned in order to work.
-	 */
- THUMB(	.thumb		)
- THUMB(	.align		)
- THUMB(	bx	pc	)
- THUMB(	nop		)
-	.arm
-
 	b	omap3_do_wfi
-
-/*
- * Local variables
- */
+ENDPROC(omap34xx_cpu_suspend)
 omap3_do_wfi_sram_addr:
 	.word omap3_do_wfi_sram
 kernel_flush:
@@ -364,10 +349,7 @@ exit_nonoff_modes:
  * ===
  */
 	ldmfd	sp!, {r4 - r11, pc}	@ restore regs and return
-
-/*
- * Local variables
- */
+ENDPROC(omap3_do_wfi)
 sdrc_power:
 	.word	SDRC_POWER_V
 cm_idlest1_core:
[PATCH 4.4 72/73] drm/radeon: fix DP link training issue with second 4K monitor
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arindam Nath

commit 1a738347df2ee4977459a8776fe2c62196bdcb1b upstream.

There is an issue observed when we hotplug a second DP 4K monitor to
the system. Sometimes, the link training fails for the second monitor
after HPD interrupt generation.

The issue happens when some queued or deferred transactions are already
present on the AUX channel when we initiate a new transaction to (say)
get DPCD or during link training.

We set AUX_IGNORE_HPD_DISCON bit in the AUX_CONTROL register so that we
can ignore any such deferred transactions when a new AUX transaction is
initiated.

Signed-off-by: Arindam Nath
Reviewed-by: Alex Deucher
Signed-off-by: Alex Deucher
Signed-off-by: Greg Kroah-Hartman

---
 drivers/gpu/drm/radeon/radeon_dp_auxch.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/radeon/radeon_dp_auxch.c
+++ b/drivers/gpu/drm/radeon/radeon_dp_auxch.c
@@ -105,7 +105,7 @@ radeon_dp_aux_transfer_native(struct drm

 	tmp &= AUX_HPD_SEL(0x7);
 	tmp |= AUX_HPD_SEL(chan->rec.hpd);
-	tmp |= AUX_EN | AUX_LS_READ_EN;
+	tmp |= AUX_EN | AUX_LS_READ_EN | AUX_HPD_DISCON(0x1);

 	WREG32(AUX_CONTROL + aux_offset[instance], tmp);
[PATCH 4.4 20/73] net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Bingham

[ Upstream commit 2c94b53738549d81dc7464a32117d1f5112c64d3 ]

Prior to commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op
when !DEBUG") the implementation of net_dbg_ratelimited() was buggy
for both the DEBUG and CONFIG_DYNAMIC_DEBUG cases.

The bug was that net_ratelimit() was being called and, despite
returning true, nothing was being printed to the console. This
resulted in messages like the following -

"net_ratelimit: %d callbacks suppressed"

with no other output nearby.

After commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when
!DEBUG") the bug is fixed for the DEBUG case. However, there's no
output at all for CONFIG_DYNAMIC_DEBUG case.

This patch restores debug output (if enabled) for the
CONFIG_DYNAMIC_DEBUG case.

Add a definition of net_dbg_ratelimited() for the CONFIG_DYNAMIC_DEBUG
case. The implementation takes care to check that dynamic debugging is
enabled before calling net_ratelimit().

Fixes: d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when !DEBUG")
Signed-off-by: Tim Bingham
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

---
 include/linux/net.h |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -245,7 +245,15 @@ do {								\
 	net_ratelimited_function(pr_warn, fmt, ##__VA_ARGS__)
 #define net_info_ratelimited(fmt, ...)				\
 	net_ratelimited_function(pr_info, fmt, ##__VA_ARGS__)
-#if defined(DEBUG)
+#if defined(CONFIG_DYNAMIC_DEBUG)
+#define net_dbg_ratelimited(fmt, ...)					\
+do {									\
+	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor, fmt);			\
+	if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT) &&	\
+	    net_ratelimit())						\
+		__dynamic_pr_debug(&descriptor, fmt, ##__VA_ARGS__);	\
+} while (0)
+#elif defined(DEBUG)
 #define net_dbg_ratelimited(fmt, ...)				\
 	net_ratelimited_function(pr_debug, fmt, ##__VA_ARGS__)
 #else
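Why the order of the two checks matters can be shown with a small userspace sketch. Everything here is a simplification assumed for illustration: struct ratelimit, ratelimit_ok() and dbg_ratelimited() are hypothetical names, and the limiter has no time window (it simply allows a fixed number of messages), unlike the kernel's net_ratelimit(). The point is that testing the "enabled" flag first means a disabled call site never consumes budget and never bumps the suppressed count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, window-less limiter: allows `burst` messages in total. */
struct ratelimit {
	int burst;	/* messages allowed */
	int printed;	/* messages let through so far */
	int missed;	/* messages suppressed so far */
};

/* Returns true if a message may be emitted; counts it either way. */
bool ratelimit_ok(struct ratelimit *rl)
{
	assert(rl != NULL);
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}

/*
 * Mirrors the shape of the patched macro: the enabled check comes
 * first, so && short-circuits and the limiter is untouched when
 * debugging is off.
 */
bool dbg_ratelimited(bool debug_enabled, struct ratelimit *rl)
{
	return debug_enabled && ratelimit_ok(rl);
}
```

Swapping the operands of && in dbg_ratelimited() would reproduce the old symptom in miniature: the counters advance even though nothing is ever printed.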
[PATCH 4.4 64/73] qla1280: Dont allocate 512kb of host tags
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johannes Thumshirn

commit 2bcbc81421c511ef117cadcf0bee9c4340e68db0 upstream.

The qla1280 driver sets the scsi_host_template's can_queue field to 0xf
which results in an allocation failure when allocating the block layer
tags for the driver's queues. This was introduced with the change for
host wide tags in commit 64d513ac31b - "scsi: use host wide tags by
default".

Reduce can_queue to MAX_OUTSTANDING_COMMANDS (512) to solve the
allocation error.

Signed-off-by: Johannes Thumshirn
Fixes: 64d513ac31b - "scsi: use host wide tags by default"
Cc: Laura Abbott
Cc: Michael Reed
Reviewed-by: Laurence Oberman
Reviewed-by: Lee Duncan
Signed-off-by: Martin K. Petersen
Signed-off-by: James Bottomley
Signed-off-by: Greg Kroah-Hartman

---
 drivers/scsi/qla1280.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -4214,7 +4214,7 @@ static struct scsi_host_template qla1280
 	.eh_bus_reset_handler	= qla1280_eh_bus_reset,
 	.eh_host_reset_handler	= qla1280_eh_adapter_reset,
 	.bios_param		= qla1280_biosparam,
-	.can_queue		= 0xf,
+	.can_queue		= MAX_OUTSTANDING_COMMANDS,
 	.this_id		= -1,
 	.sg_tablesize		= SG_ALL,
 	.use_clustering		= ENABLE_CLUSTERING,
[PATCH 4.4 50/73] ALSA: hda - Fix broken reconfig
4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai

commit addacd801e1638f41d659cb53b9b73fc14322cb1 upstream.

The HD-audio reconfig function got broken in the recent kernels,
typically resulting in a failure like:

  snd_hda_intel :00:1b.0: control 3:0:0:Playback Channel Map:0 is already present

This is because of the code restructuring to move the PCM and control
instantiation into the codec driver probe, by the commit
[bcd96557bd0a: ALSA: hda - Build PCMs and controls at codec driver
probe]. Although the commit above removed the calls of
snd_hda_codec_build_pcms() and *_build_controls() at the controller
driver probe, the similar calls in the reconfig were left behind.
This caused conflicting and duplicated PCMs and controls.

The fix is trivial: just remove these superfluous calls from
reconfig_codec().

Fixes: bcd96557bd0a ('ALSA: hda - Build PCMs and controls at codec driver probe')
Reported-by: Jochen Henneberg
Signed-off-by: Takashi Iwai
Signed-off-by: Greg Kroah-Hartman

---
 sound/pci/hda/hda_sysfs.c |    8 --------
 1 file changed, 8 deletions(-)

--- a/sound/pci/hda/hda_sysfs.c
+++ b/sound/pci/hda/hda_sysfs.c
@@ -141,14 +141,6 @@ static int reconfig_codec(struct hda_cod
 	err = snd_hda_codec_configure(codec);
 	if (err < 0)
 		goto error;
-	/* rebuild PCMs */
-	err = snd_hda_codec_build_pcms(codec);
-	if (err < 0)
-		goto error;
-	/* rebuild mixers */
-	err = snd_hda_codec_build_controls(codec);
-	if (err < 0)
-		goto error;
 	err = snd_card_register(codec->card);
  error:
 	snd_hda_power_down(codec);
[PATCH 3.14 10/17] packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface
3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mathias Krause

[ Upstream commit 309cf37fe2a781279b7675d4bb7173198e532867 ]

Because we fail to wipe the remainder of i->addr[] in packet_mc_add(),
pdiag_put_mclist() leaks uninitialized heap bytes via the
PACKET_DIAG_MCLIST netlink attribute.

Fix this by explicitly memset(0)ing the remaining bytes in i->addr[].

Fixes: eea68e2f1a00 ("packet: Report socket mclist info via diag module")
Signed-off-by: Mathias Krause
Cc: Eric W. Biederman
Cc: Pavel Emelyanov
Acked-by: Pavel Emelyanov
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

---
 net/packet/af_packet.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3153,6 +3153,7 @@ static int packet_mc_add(struct sock *sk
 	i->ifindex = mreq->mr_ifindex;
 	i->alen = mreq->mr_alen;
 	memcpy(i->addr, mreq->mr_address, i->alen);
+	memset(i->addr + i->alen, 0, sizeof(i->addr) - i->alen);
 	i->count = 1;
 	i->next = po->mclist;
 	po->mclist = i;
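The copy-then-wipe pattern from this one-liner can be tried in isolation with the sketch below. The names are assumptions made for the example: struct mc_entry, MC_ADDR_LEN and mc_entry_set() are hypothetical and much smaller than the kernel's struct packet_mclist. It shows only the technique itself: after copying alen bytes into a fixed-size buffer, zero the tail so that whoever later dumps the whole buffer sees no stale bytes.

```c
#include <assert.h>
#include <string.h>

#define MC_ADDR_LEN 32	/* hypothetical fixed buffer size */

struct mc_entry {
	unsigned short alen;			/* valid bytes in addr[] */
	unsigned char addr[MC_ADDR_LEN];	/* dumped in full later */
};

/* Store an address and clear whatever the buffer held beyond it. */
void mc_entry_set(struct mc_entry *e, const unsigned char *src,
		  unsigned short alen)
{
	assert(e != NULL && src != NULL);
	assert(alen <= sizeof(e->addr));

	e->alen = alen;
	memcpy(e->addr, src, alen);
	/* without this, addr[alen..] keeps old heap or stack contents */
	memset(e->addr + alen, 0, sizeof(e->addr) - alen);
}
```

An alternative with the same effect would be to zero the whole structure at allocation time; the patch keeps the change local to the one field that is exported.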
[PATCH 3.14 05/17] drm/radeon: fix PLL sharing on DCE6.1 (v2)
3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lucas Stach

commit e3c00d87845ab375f90fa6e10a5e72a3a5778cd3 upstream.

On DCE6.1 PPLL2 is exclusively available to UNIPHYA, so it should not
be taken into consideration when looking for an already enabled PLL
to be shared with other outputs.

This fixes the broken VGA port (TRAVIS DP->VGA bridge) on my Richland
based laptop, where the internal display is connected to UNIPHYA through
a TRAVIS DP->LVDS bridge.

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=78987

v2: agd: add check in radeon_get_shared_nondp_ppll as well, drop
    extra parameter.

Signed-off-by: Lucas Stach
Signed-off-by: Alex Deucher
Signed-off-by: Greg Kroah-Hartman

---
 drivers/gpu/drm/radeon/atombios_crtc.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -1600,6 +1600,7 @@ static u32 radeon_get_pll_use_mask(struc
 static int radeon_get_shared_dp_ppll(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
+	struct radeon_device *rdev = dev->dev_private;
 	struct drm_crtc *test_crtc;
 	struct radeon_crtc *test_radeon_crtc;

@@ -1609,6 +1610,10 @@ static int radeon_get_shared_dp_ppll(str
 		test_radeon_crtc = to_radeon_crtc(test_crtc);
 		if (test_radeon_crtc->encoder &&
 		    ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) {
+			/* PPLL2 is exclusive to UNIPHYA on DCE61 */
+			if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) &&
+			    test_radeon_crtc->pll_id == ATOM_PPLL2)
+				continue;
 			/* for DP use the same PLL for all */
 			if (test_radeon_crtc->pll_id != ATOM_PPLL_INVALID)
 				return test_radeon_crtc->pll_id;
@@ -1630,6 +1635,7 @@ static int radeon_get_shared_nondp_ppll(
 {
 	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
+	struct radeon_device *rdev = dev->dev_private;
 	struct drm_crtc *test_crtc;
 	struct radeon_crtc *test_radeon_crtc;
 	u32 adjusted_clock, test_adjusted_clock;
@@ -1645,6 +1651,10 @@ static int radeon_get_shared_nondp_ppll(
 		test_radeon_crtc = to_radeon_crtc(test_crtc);
 		if (test_radeon_crtc->encoder &&
 		    !ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) {
+			/* PPLL2 is exclusive to UNIPHYA on DCE61 */
+			if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) &&
+			    test_radeon_crtc->pll_id == ATOM_PPLL2)
+				continue;
 			/* check if we are already driving this connector with another crtc */
 			if (test_radeon_crtc->connector == radeon_crtc->connector) {
 				/* if we are, return that pll */
[PATCH 3.14 00/17] 3.14.70-stable review
This is the start of the stable review cycle for the 3.14.70 release.
There are 17 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu May 19 01:13:32 UTC 2016.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.14.70-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman
    Linux 3.14.70-rc1

Jon Medhurst
    arm64: Make arch_randomize_brk avoid stack area

Kangjie Lu
    net: fix a kernel infoleak in x25 module

Nikolay Aleksandrov
    net: bridge: fix old ioctl unlocked net device walk

Ian Campbell
    VSOCK: do not disconnect socket when peer has shutdown SEND only

Kangjie Lu
    net: fix infoleak in rtnetlink

Kangjie Lu
    net: fix infoleak in llc

Ben Hutchings
    atl2: Disable unimplemented scatter/gather feature

Mathias Krause
    packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface

Chris Friesen
    route: do not cache fib route info on local routes with oif

David S. Miller
    decnet: Do not build routes to devices without decnet private data.

Tony Lindgren
    ARM: OMAP3: Fix booting with thumb2 kernel

Daniel Vetter
    drm/i915: Bail out of pipe config compute loop on LPT

Lucas Stach
    drm/radeon: fix PLL sharing on DCE6.1 (v2)

Andi Kleen
    asmlinkage, pnp: Make variables used from assembler code visible

Marek Szyprowski
    Input: max8997-haptic - fix NULL pointer dereference

Al Viro
    get_rock_ridge_filename(): handle malformed NM entries

Herbert Xu
    crypto: hash - Fix page length clamping in hash walk

-------------

Diffstat:

 Makefile                                 |  4 ++--
 arch/arm/mach-omap2/sleep34xx.S          | 22 ++
 arch/arm64/kernel/process.c              | 24 ++--
 crypto/ahash.c                           |  3 ++-
 drivers/gpu/drm/i915/intel_crt.c         |  8 +++-
 drivers/gpu/drm/radeon/atombios_crtc.c   | 10 ++
 drivers/input/misc/max8997_haptic.c      |  6 --
 drivers/net/ethernet/atheros/atlx/atl2.c |  2 +-
 drivers/pnp/pnpbios/bioscalls.c          |  9 +
 fs/isofs/rock.c                          | 13 ++---
 net/bridge/br_ioctl.c                    |  5 +++--
 net/core/rtnetlink.c                     | 18 ++
 net/decnet/dn_route.c                    |  9 -
 net/ipv4/route.c                         | 12
 net/llc/af_llc.c                         |  1 +
 net/packet/af_packet.c                   |  1 +
 net/vmw_vsock/af_vsock.c                 | 21 +
 net/x25/x25_facilities.c                 |  1 +
 18 files changed, 98 insertions(+), 71 deletions(-)
[PATCH 3.14 10/17] packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface
3.14-stable review patch. If anyone has any objections, please let me know. -- From: Mathias Krause [ Upstream commit 309cf37fe2a781279b7675d4bb7173198e532867 ] Because we miss to wipe the remainder of i->addr[] in packet_mc_add(), pdiag_put_mclist() leaks uninitialized heap bytes via the PACKET_DIAG_MCLIST netlink attribute. Fix this by explicitly memset(0)ing the remaining bytes in i->addr[]. Fixes: eea68e2f1a00 ("packet: Report socket mclist info via diag module") Signed-off-by: Mathias Krause Cc: Eric W. Biederman Cc: Pavel Emelyanov Acked-by: Pavel Emelyanov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/packet/af_packet.c |1 + 1 file changed, 1 insertion(+) --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -3153,6 +3153,7 @@ static int packet_mc_add(struct sock *sk i->ifindex = mreq->mr_ifindex; i->alen = mreq->mr_alen; memcpy(i->addr, mreq->mr_address, i->alen); + memset(i->addr + i->alen, 0, sizeof(i->addr) - i->alen); i->count = 1; i->next = po->mclist; po->mclist = i;
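The fix above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: `struct mc_entry`, `MC_ADDR_MAX` and `mc_entry_set_addr()` are hypothetical names chosen for illustration. It shows the same pattern as the patch: copy the caller-supplied bytes, then zero the rest of the fixed-size buffer so that code which later dumps the whole buffer cannot expose stale memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer size, for illustration only. */
#define MC_ADDR_MAX 32

/* Hypothetical stand-in for a multicast list entry. */
struct mc_entry {
	unsigned short alen;             /* number of valid address bytes */
	unsigned char addr[MC_ADDR_MAX]; /* fixed-size address buffer */
};

/*
 * Copy 'alen' address bytes into the entry and clear the remainder of
 * the buffer, so a later dump of the full addr[] array contains no
 * leftover bytes from a previous use of the memory.
 */
static void mc_entry_set_addr(struct mc_entry *e,
			      const unsigned char *src,
			      unsigned short alen)
{
	assert(alen <= sizeof(e->addr));

	e->alen = alen;
	memcpy(e->addr, src, alen);
	memset(e->addr + alen, 0, sizeof(e->addr) - alen);
}
```

Without the `memset()` line, the bytes past `alen` keep whatever the allocator left there, which is the leak the commit message describes.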
[PATCH 3.14 00/17] 3.14.70-stable review
This is the start of the stable review cycle for the 3.14.70 release. There are 17 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Thu May 19 01:13:32 UTC 2016. Anything received after that time might be too late. The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.14.70-rc1.gz and the diffstat can be found below. thanks, greg k-h - Pseudo-Shortlog of commits: Greg Kroah-Hartman Linux 3.14.70-rc1 Jon Medhurst arm64: Make arch_randomize_brk avoid stack area Kangjie Lu net: fix a kernel infoleak in x25 module Nikolay Aleksandrov net: bridge: fix old ioctl unlocked net device walk Ian Campbell VSOCK: do not disconnect socket when peer has shutdown SEND only Kangjie Lu net: fix infoleak in rtnetlink Kangjie Lu net: fix infoleak in llc Ben Hutchings atl2: Disable unimplemented scatter/gather feature Mathias Krause packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface Chris Friesen route: do not cache fib route info on local routes with oif David S. Miller decnet: Do not build routes to devices without decnet private data. 
Tony Lindgren ARM: OMAP3: Fix booting with thumb2 kernel Daniel Vetter drm/i915: Bail out of pipe config compute loop on LPT Lucas Stach drm/radeon: fix PLL sharing on DCE6.1 (v2) Andi Kleen asmlinkage, pnp: Make variables used from assembler code visible Marek Szyprowski Input: max8997-haptic - fix NULL pointer dereference Al Viro get_rock_ridge_filename(): handle malformed NM entries Herbert Xu crypto: hash - Fix page length clamping in hash walk - Diffstat: Makefile | 4 ++-- arch/arm/mach-omap2/sleep34xx.S | 22 ++ arch/arm64/kernel/process.c | 24 ++-- crypto/ahash.c | 3 ++- drivers/gpu/drm/i915/intel_crt.c | 8 +++- drivers/gpu/drm/radeon/atombios_crtc.c | 10 ++ drivers/input/misc/max8997_haptic.c | 6 -- drivers/net/ethernet/atheros/atlx/atl2.c | 2 +- drivers/pnp/pnpbios/bioscalls.c | 9 + fs/isofs/rock.c | 13 ++--- net/bridge/br_ioctl.c| 5 +++-- net/core/rtnetlink.c | 18 ++ net/decnet/dn_route.c| 9 - net/ipv4/route.c | 12 net/llc/af_llc.c | 1 + net/packet/af_packet.c | 1 + net/vmw_vsock/af_vsock.c | 21 + net/x25/x25_facilities.c | 1 + 18 files changed, 98 insertions(+), 71 deletions(-)
[PATCH 3.14 06/17] drm/i915: Bail out of pipe config compute loop on LPT
3.14-stable review patch. If anyone has any objections, please let me know. -- From: Daniel Vetter commit 2700818ac9f935d8590715eecd7e8cadbca552b6 upstream. LPT is pch, so might run into the fdi bandwidth constraint (especially since it has only 2 lanes). But right now we just force pipe_bpp back to 24, resulting in a nice loop (which we bail out with a loud WARN_ON). Fix this. Cc: Chris Wilson Cc: Maarten Lankhorst References: https://bugs.freedesktop.org/show_bug.cgi?id=93477 Signed-off-by: Daniel Vetter Tested-by: Chris Wilson Signed-off-by: Maarten Lankhorst Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/1462264381-7573-1-git-send-email-daniel.vet...@ffwll.ch (cherry picked from commit f58a1acc7e4a1f37d26124ce4c875c647fbcc61f) Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/i915/intel_crt.c |8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) --- a/drivers/gpu/drm/i915/intel_crt.c +++ b/drivers/gpu/drm/i915/intel_crt.c @@ -259,8 +259,14 @@ static bool intel_crt_compute_config(str pipe_config->has_pch_encoder = true; /* LPT FDI RX only supports 8bpc. */ - if (HAS_PCH_LPT(dev)) + if (HAS_PCH_LPT(dev)) { + if (pipe_config->bw_constrained && pipe_config->pipe_bpp < 24) { + DRM_DEBUG_KMS("LPT only supports 24bpp\n"); + return false; + } + pipe_config->pipe_bpp = 24; + } return true; }
[PATCH 3.14 01/17] crypto: hash - Fix page length clamping in hash walk
3.14-stable review patch. If anyone has any objections, please let me know. -- From: Herbert Xu commit 13f4bb78cf6a312bbdec367ba3da044b09bf0e29 upstream. The crypto hash walk code is broken when supplied with an offset greater than or equal to PAGE_SIZE. This patch fixes it by adjusting walk->pg and walk->offset when this happens. Reported-by: Steffen Klassert Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman --- crypto/ahash.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/crypto/ahash.c +++ b/crypto/ahash.c @@ -64,8 +64,9 @@ static int hash_walk_new_entry(struct cr struct scatterlist *sg; sg = walk->sg; - walk->pg = sg_page(sg); walk->offset = sg->offset; + walk->pg = sg_page(walk->sg) + (walk->offset >> PAGE_SHIFT); + walk->offset = offset_in_page(walk->offset); walk->entrylen = sg->length; if (walk->entrylen > walk->total)
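The adjustment of walk->pg and walk->offset described above is a split of a byte offset into a page index and an in-page remainder. A minimal userspace sketch of that arithmetic follows; `DEMO_PAGE_SHIFT` (assumed to be 12, i.e. 4 KiB pages), `struct demo_pos` and `demo_normalize()` are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page geometry for illustration: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((size_t)1 << DEMO_PAGE_SHIFT)

/* Hypothetical result type: which page, and where inside it. */
struct demo_pos {
	size_t page_index; /* pages to advance from the first page */
	size_t offset;     /* remaining offset, always < DEMO_PAGE_SIZE */
};

/*
 * Split an offset that may be >= the page size into a page index
 * (offset >> shift) and an in-page offset (offset masked to the low
 * bits), mirroring the two lines the patch adds.
 */
static struct demo_pos demo_normalize(size_t offset)
{
	struct demo_pos p;

	p.page_index = offset >> DEMO_PAGE_SHIFT;
	p.offset = offset & (DEMO_PAGE_SIZE - 1);
	assert(p.offset < DEMO_PAGE_SIZE);
	return p;
}
```

For example, with these assumptions an offset of 4196 maps to page index 1 and in-page offset 100, while an offset of 100 is left on page 0 unchanged.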
[PATCH 3.14 08/17] decnet: Do not build routes to devices without decnet private data.
3.14-stable review patch. If anyone has any objections, please let me know. -- From: "David S. Miller" [ Upstream commit a36a0d4008488fa545c74445d69eaf56377d5d4e ] In particular, make sure we check for decnet private presence for loopback devices. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/decnet/dn_route.c |9 - 1 file changed, 8 insertions(+), 1 deletion(-) --- a/net/decnet/dn_route.c +++ b/net/decnet/dn_route.c @@ -1030,10 +1030,13 @@ source_ok: if (!fld.daddr) { fld.daddr = fld.saddr; - err = -EADDRNOTAVAIL; if (dev_out) dev_put(dev_out); + err = -EINVAL; dev_out = init_net.loopback_dev; + if (!dev_out->dn_ptr) + goto out; + err = -EADDRNOTAVAIL; dev_hold(dev_out); if (!fld.daddr) { fld.daddr = @@ -1106,6 +1109,8 @@ source_ok: if (dev_out == NULL) goto out; dn_db = rcu_dereference_raw(dev_out->dn_ptr); + if (!dn_db) + goto e_inval; /* Possible improvement - check all devices for local addr */ if (dn_dev_islocal(dev_out, fld.daddr)) { dev_put(dev_out); @@ -1147,6 +1152,8 @@ select_source: dev_put(dev_out); dev_out = init_net.loopback_dev; dev_hold(dev_out); + if (!dev_out->dn_ptr) + goto e_inval; fld.flowidn_oif = dev_out->ifindex; if (res.fi) dn_fib_info_put(res.fi);
Re: [PATCH v5 08/12] zsmalloc: introduce zspage structure
On Mon, May 16, 2016 at 12:09:41PM +0900, Sergey Senozhatsky wrote: > On (05/09/16 11:20), Minchan Kim wrote: > > We have squeezed meta data of zspage into first page's descriptor. > > So, to get meta data from subpage, we should get first page first > > of all. But it makes trouble to implment page migration feature > > of zsmalloc because any place where to get first page from subpage > > can be raced with first page migration. IOW, first page it got > > could be stale. For preventing it, I have tried several approahces > > but it made code complicated so finally, I concluded to separate > > metadata from first page. Of course, it consumes more memory. IOW, > > 16bytes per zspage on 32bit at the moment. It means we lost 1% > > at *worst case*(40B/4096B) which is not bad I think at the cost of > > maintenance. > > > > Cc: Sergey Senozhatsky > > Signed-off-by: Minchan Kim > [..] > > @@ -153,8 +138,6 @@ > > enum fullness_group { > > ZS_ALMOST_FULL, > > ZS_ALMOST_EMPTY, > > - _ZS_NR_FULLNESS_GROUPS, > > - > > ZS_EMPTY, > > ZS_FULL > > }; > > @@ -203,7 +186,7 @@ static const int fullness_threshold_frac = 4; > > > > struct size_class { > > spinlock_t lock; > > - struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; > > + struct list_head fullness_list[2]; > > seems that it also has some cleaup bits in it. > > [..] > > -static int create_handle_cache(struct zs_pool *pool) > > +static int create_cache(struct zs_pool *pool) > > { > > pool->handle_cachep = kmem_cache_create("zs_handle", ZS_HANDLE_SIZE, > > 0, 0, NULL); > > - return pool->handle_cachep ? 0 : 1; > > + if (!pool->handle_cachep) > > + return 1; > > + > > + pool->zspage_cachep = kmem_cache_create("zspage", sizeof(struct zspage), > > + 0, 0, NULL); > > + if (!pool->zspage_cachep) { > > + kmem_cache_destroy(pool->handle_cachep); > ^ > > do you need to NULL a pool->handle_cachep here? Thanks, Will fix. 
> > zs_create_pool() > if (create_cache() == 1) { > pool->zspage_cachep NULL > pool->handle_cachep !NULL already freed -> > kmem_cache_destroy() > return 1; > goto err > } > err: > zs_destroy_pool() > destroy_cache() { > kmem_cache_destroy(pool->handle_cachep); !NULL and > freed > kmem_cache_destroy(pool->zspage_cachep); NULL ok > } > > > can we also switch create_cache() to errnos? I just like a bit > better > return -ENOMEM; > else > return 0; > > than > > return 1; > else > return 0; > Hmm, of course, I can do it easily. But zs_create_pool returns NULL without error propagation from sub functions so I don't see any gain from returning errno from create_cache. I don't mean I hate it but just need a justificaion to persuade grumpy me. > > > @@ -997,44 +951,38 @@ static void init_zspage(struct size_class *class, > > struct page *first_page) > > off %= PAGE_SIZE; > > } > > > > - set_freeobj(first_page, (unsigned long)location_to_obj(first_page, 0)); > > + set_freeobj(zspage, > > + (unsigned long)location_to_obj(zspage->first_page, 0)); > > static unsigned long location_to_obj() > > it's already returning "(unsigned long)", so here and in several other places > this cast can be dropped. Yeb. > > [..] > > +static struct zspage *isolate_zspage(struct size_class *class, bool source) > > { > > + struct zspage *zspage; > > + enum fullness_group fg[2] = {ZS_ALMOST_EMPTY, ZS_ALMOST_FULL}; > > + if (!source) { > > + fg[0] = ZS_ALMOST_FULL; > > + fg[1] = ZS_ALMOST_EMPTY; > > + } > > + > > + for (i = 0; i < 2; i++) { > > sorry, why not "for (i = ZS_ALMOST_EMPTY; i <= ZS_ALMOST_FULL ..." ? For source zspage, the policy is to find a fragment object from ZS_ALMOST_EMPTY. For target zspage, the policy is to find a fragment object from ZS_ALMOST_FULL. Do I misunderstand your question? 
> > > + zspage = list_first_entry_or_null(&class->fullness_list[fg[i]], > > + struct zspage, list); > > + if (zspage) { > > + remove_zspage(class, zspage, fg[i]); > > + return zspage; > > } > > } > > > > - return page; > > + return zspage; > > } > > -ss
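The double-destroy concern raised in this thread can be sketched in userspace. This is only an illustration under stated assumptions: `struct demo_pool`, `demo_create_cache()` and `demo_destroy_cache()` are hypothetical, `malloc()`/`free()` stand in for `kmem_cache_create()`/`kmem_cache_destroy()`, and the `fail_second` parameter exists purely to force the error path. The point is that the pointer is set to NULL after it is released in the error path, so the common teardown routine can run afterwards without releasing it twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the pool with its two caches. */
struct demo_pool {
	void *handle_cache;
	void *zspage_cache;
};

/* Common teardown; free(NULL) is a no-op, so NULL members are fine. */
static void demo_destroy_cache(struct demo_pool *pool)
{
	assert(pool != NULL);
	free(pool->handle_cache);
	pool->handle_cache = NULL;
	free(pool->zspage_cache);
	pool->zspage_cache = NULL;
}

/*
 * Create both caches. Returns 0 on success, 1 on failure (the
 * convention used in the quoted patch). 'fail_second' simulates the
 * second allocation failing.
 */
static int demo_create_cache(struct demo_pool *pool, int fail_second)
{
	assert(pool != NULL);

	pool->handle_cache = malloc(16);
	if (!pool->handle_cache)
		return 1;

	pool->zspage_cache = fail_second ? NULL : malloc(16);
	if (!pool->zspage_cache) {
		free(pool->handle_cache);
		/* Clear the pointer so a later teardown does not free it again. */
		pool->handle_cache = NULL;
		return 1;
	}
	return 0;
}
```

With the pointer cleared, calling `demo_destroy_cache()` after a failed `demo_create_cache()` is harmless, which is the sequence the reviewer traced through zs_create_pool() and zs_destroy_pool().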
Re: [RFC][PATCH 0/7] locking/rwsem: Convert rwsem count to atomic_long_t
On Mon, May 16, 2016 at 5:37 PM, Jason Low wrote: > > This rest of the series converts the rwsem count variable to an atomic_long_t > since it is used it as an atomic variable. This allows us to also remove > the rwsem_atomic_{add,update} abstraction and reduce 100+ lines of code. I would suggest you merge all the "remove rwsem_atomic_{add,update}" patches into a single patch. I don't see the advantage to splitting those up by architecture, and it does add noise to the series. Other than that it all looks fine to me. Linus
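As a rough illustration of what treating the count as an atomic type means, here is a userspace sketch using C11 `<stdatomic.h>` rather than the kernel's `atomic_long_t` API. `struct demo_rwsem` and `demo_count_add_return()` are hypothetical names, and the helper only mimics the "add and return the new value" style of operation that the series uses directly instead of a per-architecture wrapper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical semaphore-like structure with an atomic counter. */
struct demo_rwsem {
	atomic_long count;
};

/*
 * Atomically add 'delta' to the count and return the new value.
 * atomic_fetch_add() returns the old value, so delta is added back
 * to obtain the updated count.
 */
static long demo_count_add_return(struct demo_rwsem *sem, long delta)
{
	assert(sem != NULL);
	return atomic_fetch_add(&sem->count, delta) + delta;
}
```

Declaring the field with an atomic type makes every access go through the atomic operations, which is why a separate abstraction layer for atomic updates of a plain long is no longer needed.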
Re: [PATCH 02/11] mtd: nand_bbt: introduce BBT related data structure
Hi Boris,

First, sorry for the late reply.

On Thu, May 5, 2016 at 4:33 AM, Boris Brezillon wrote:
> Hi Peter,
>
> On Wed, 4 May 2016 09:36:05 +0800
> Peter Pan wrote:
>
>> Hi Boris,
>>
>> On Tue, Apr 19, 2016 at 3:34 PM, Boris Brezillon
>> wrote:
>> > Hi Peter,
>> >
>> > On Tue, 19 Apr 2016 08:40:40 +0800
>> > Peter Pan wrote:
>> >>
>> >> >
>> >> >> So it's true, it
>> >> >> should still be numchips in nand_bbt.c? I just came out this question
>> >> >> when making v4. :)
>> >> >
>> >> > BTW, I have something for you [1]. I started to move things around to
>> >> > allow spinand and onenand layers to lie under drivers/mtd/nand/, and I
>> >> > wonder if we shouldn't do this move before reworking the nand_bbt code
>> >> > to make it generic.
>> >> > Note that this rework is not finished yet, but it gives a rough idea of
>> >> > what I'd like to see.
>> >>
>> >> I saw you also rework BBT in your git tree, which is a bit duplicate
>> >> with my BBT patch,
>> >> so should I continue my BBT patch by join part of your BBT rework code
>> >> or continue your git tree ?
>> >
>> > Well, if you ask me what I'd prefer, it's clearly the 2nd solution.
>> > Note that my branch should just serve as a reference of what I expect,
>> > it just a pile of rework that should probably be reordered and cleaned
>> > up.
>> >
>> > Here's the sequencing I'd like to see:
>> >
>> > 1/ Move include/linux/mtd/nand.h into include/linux/mtd/rawnand.h and
>> >    move all files under drivers/mtd/nand/ into
>> >    drivers/mtd/nand/rawnand (including nand_bbt.c). This can be done in
>> >    several patches
>> >
>> > 2/ Add the generic nand layer (include/linux/mtd/nand.h and
>> >    drivers/mtd/nand/core.c). In my version I put everything in
>> >    include/linux/mtd/nand.h, but maybe we'll need a few functions to be
>> >    defined in drivers/mtd/nand/core.c.
>> >
>> > 3/ Create a rawnand_device structure inheriting from nand_device, and
>> >    then make nand_chip inherit from rawnand_device. Patch the
>> >    nand_base.c code to initialize all the nand_device fields properly,
>> >    so that we'll be ready to switch to the generic BBT code.
>> >
>> > 4/ Modify the nand_bbt.c code to make use of the generic NAND interface
>> >    instead of the MTD and rawnand one (this implies identifying all the
>> >    generic helpers you might need, and implementing them in
>> >    include/linux/mtd/nand.h or drivers/mtd/nand/core.c).
>> >
>> > 5/ Move drivers/mtd/nand/rawnand/nand_bbt.c into
>> >    drivers/mtd/nand/bbt.c
>> >
>> > 6/[optional] Implement your spinand layer in drivers/mtd/nand/spinand
>> >
>> > I know I'm asking a lot, especially given that you already spent a lot
>> > of time iterating on this BBT rework series. But your goal is to move
>> > mt29f driver out of staging, and you'll need to do the generic NAND
>> > layer to achieve that, so I'd really prefer having the BBT code use
>> > this generic layer instead of directly using the MTD API + an extra set
>> > of NAND specific structs (like the nand_chip_layout_info one).
>>
>> Yes, I want to upstreaming my SPI NAND frameworks and it's indeed better
>> to have a nand core. In fact, I already finished a SPI NAND framework with
>> the BBT patch I sent which is directly under MTD (don't have a NAND core
>> layer).
>>
>> Actually, I'm interested in this NAND framework refining work. And I
>> know you already gave a speech on ELC about this. But due to the
>> resource limitation, I may not to do all of the things. So how about I
>> continue my BBT patch with your NAND refining ideas. I'll try to make
>> the BBT patch compatible with the refining work. What do you think?
>
> The thing is, I'm not happy with these intermediate reworks, which in my
> opinion are adding more confusion and will make things even harder to
> rework afterward.
> You said you already developed your SPI NAND framework and it's not
> based on the generic NAND layer, which means you (or someone else) will
> have to migrate it to this approach at some point, and this extra work
> is kind of useless, especially since we seem to agree that the generic
> NAND layer is the way to go for SPI NAND (and other NAND based devices)
> support.
>
> Since I'm the one who pushed for this transition to an intermediate
> "NAND core" layer, I'm willing to help you with this task. I actually
> reworked my series [1] to move the BBT code in drivers/mtd/nand/bbt.c
> and move raw NAND code into drivers/mtd/nand/rawnand/ (still have to
> rework the commit logs, and test the implementation, but the different
> steps are there and we end-up with something clean in
> drivers/mtd/nand/).
>
> Could you help me debug this code and base your SPI NAND framework on
> top of it?

Yes, I can. I have already cloned your git tree and started going through
your commits. I'll let you know when I have questions.

>
> Again, I'm sorry that you had to be the one supporting this transition,
> but I don't want to introduce
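
Step 3/ of the quoted sequencing has nand_chip inherit from rawnand_device, which in turn inherits from nand_device. In C this kind of inheritance is usually expressed by embedding the base structure as a member. Below is a minimal userspace sketch of that layout. The field names (nblocks, bus_width, numchips) and the helper names are hypothetical and are not taken from this thread or from the referenced branch.

```c
#include <stddef.h>

/* Hypothetical fields: the real structures are not shown in this thread. */
struct nand_device {
	unsigned int nblocks;		/* assumed: number of eraseblocks */
};

struct rawnand_device {
	struct nand_device base;	/* "inherits" from nand_device */
	int bus_width;			/* assumed raw-NAND-only field */
};

struct nand_chip {
	struct rawnand_device base;	/* "inherits" from rawnand_device */
	int numchips;			/* assumed chip-level field */
};

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Upcast: hand the generic part of a chip to generic code. */
static struct nand_device *chip_to_nand(struct nand_chip *chip)
{
	return &chip->base.base;
}

/* Downcast: recover the chip from its embedded generic part. */
static struct nand_chip *nand_to_chip(struct nand_device *nand)
{
	struct rawnand_device *raw =
		container_of(nand, struct rawnand_device, base);

	return container_of(raw, struct nand_chip, base);
}

/* Generic code (for example a BBT scan) only needs the base type. */
static unsigned int generic_block_count(const struct nand_device *nand)
{
	return nand->nblocks;
}
```

With this layout, code that only takes a struct nand_device pointer can be shared between raw NAND and other NAND flavours, which is the motivation given for step 4/.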
Re: [PATCH v5 07/12] zsmalloc: factor page chain functionality out
On Mon, May 16, 2016 at 11:14:20AM +0900, Sergey Senozhatsky wrote:
> On (05/09/16 11:20), Minchan Kim wrote:
> > For page migration, we need to create page chain of zspage dynamically
> > so this patch factors it out from alloc_zspage.
> >
> > Cc: Sergey Senozhatsky
> > Signed-off-by: Minchan Kim
>
> Reviewed-by: Sergey Senozhatsky

Thanks!

>
> [..]
> > +		page = alloc_page(flags);
> > +		if (!page) {
> > +			while (--i >= 0)
> > +				__free_page(pages[i]);
>
> put_page() ?
>
> a minor nit, put_page() here probably will be in alignment
> with __free_zspage(), which does put_page().

Normally, we use put_page() when someone else may have grabbed a reference
to the page, so that we cannot free it directly. Otherwise, pairing
alloc_page() with __free_page() is more straightforward to me from a code
readability point of view.

>
> 	-ss
>
> > +			return NULL;
> > +		}
> > +		pages[i] = page;
> >  	}
> >
> > +	create_page_chain(pages, class->pages_per_zspage);
> > +	first_page = pages[0];
> > +	init_zspage(class, first_page);
> > +
> >  	return first_page;
> >  }
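
The quoted hunk allocates pages one by one and, on failure, walks back with `while (--i >= 0)` to release what was already allocated. A minimal userspace sketch of the same all-or-nothing pattern follows. It uses malloc()/free() in place of alloc_page()/__free_page(), and demo_alloc(), demo_free(), demo_budget and demo_live are hypothetical helpers added only so the failure path can be exercised.

```c
#include <stdlib.h>

static int demo_budget;	/* allocations left before demo_alloc() fails */
static int demo_live;	/* buffers currently allocated */

/* Hypothetical allocator that fails once the budget is used up. */
static void *demo_alloc(void)
{
	void *p;

	if (demo_budget <= 0)
		return NULL;
	demo_budget--;
	p = malloc(64);
	if (p)
		demo_live++;
	return p;
}

static void demo_free(void *p)
{
	demo_live--;
	free(p);
}

/*
 * Fill bufs[0..n-1]. On failure, release everything allocated so far
 * and return -1, so the caller never sees a half-built array.
 */
static int alloc_all_or_nothing(void **bufs, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		bufs[i] = demo_alloc();
		if (!bufs[i]) {
			/* Same unwind shape as the quoted hunk. */
			while (--i >= 0) {
				demo_free(bufs[i]);
				bufs[i] = NULL;
			}
			return -1;
		}
	}
	return 0;
}

static void free_all(void **bufs, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		demo_free(bufs[i]);
		bufs[i] = NULL;
	}
}
```

Nothing else can hold a reference to the buffers at the point of the unwind, which is the same argument made above for using __free_page() rather than put_page().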
Re: [rcu_sched stall] regression/miss-config ?
On Mon, May 16, 2016 at 12:49:41PM -0700, Santosh Shilimkar wrote:
> On 5/16/2016 10:34 AM, Paul E. McKenney wrote:
> >On Mon, May 16, 2016 at 09:33:57AM -0700, Santosh Shilimkar wrote:
> >>On 5/16/2016 5:03 AM, Paul E. McKenney wrote:
> >>>On Sun, May 15, 2016 at 09:35:40PM -0700, santosh.shilim...@oracle.com
> >>>wrote:
> On 5/15/16 2:18 PM, Santosh Shilimkar wrote:
> >Hi Paul,
> >
> >I was asking Sasha about [1] since other folks in Oracle
> >also stumbled upon similar RCU stalls with v4.1 kernel in
> >different workloads. I was reported similar issue with
> >RDS as well and looking at [1], [2], [3] and [4], thought
> >of reaching out to see if you can help us to understand
> >this issue better.
> >
> >Have also included RCU specific config used in these
> >test(s). Its very hard to reproduce the issue but one of
> >the data point is, it reproduces on systems with larger
> >CPUs(64+). Same workload with less than 64 CPUs, don't
> >show the issue. Someone also told me, making use of
> >SLAB instead SLUB allocator makes difference but I
> >haven't verified that part for RDS.
> >
> >Let me know your thoughts. Thanks in advance !!
> >
> One of my colleague told me the pastebin server I used
> is Oracle internal only so adding the relevant logs along
> with email.
>
> >>
> >>[...]
> >>
> >[1] https://lkml.org/lkml/2014/12/14/304
> >
> [2] Log 1 snippet:
> -
> INFO: rcu_sched self-detected stall on CPU
> INFO: rcu_sched self-detected stall on CPU { 54} (t=6 jiffies
> g=66023 c=66022 q=0)
> Task dump for CPU 54:
> ksoftirqd/54 R running task 0 389 2 0x0008
> 0007 88ff7f403d38 810a8621 0036
> 81ab6540 88ff7f403d58 810a86cf 0086
> 81ab6940 88ff7f403d88 810e3ad3 81ab6540
> Call Trace:
> [] sched_show_task+0xb1/0x120
> [] dump_cpu_task+0x3f/0x50
> [] rcu_dump_cpu_stacks+0x83/0xc0
> [] print_cpu_stall+0xfc/0x170
> [] __rcu_pending+0x2bb/0x2c0
> [] rcu_check_callbacks+0x9d/0x170
> [] update_process_times+0x42/0x70
> [] tick_sched_handle+0x39/0x80
> [] tick_sched_timer+0x44/0x80
> [] __run_hrtimer+0x74/0x1d0
> [] ? tick_nohz_handler+0xa0/0xa0
> [] hrtimer_interrupt+0x102/0x240
> [] local_apic_timer_interrupt+0x39/0x60
> [] smp_apic_timer_interrupt+0x45/0x59
> [] apic_timer_interrupt+0x6e/0x80
> [] ? free_one_page+0x164/0x380
> [] ? __free_pages_ok+0xc3/0xe0
> [] __free_pages+0x25/0x40
> [] rds_message_purge+0x60/0x150 [rds]
> [] rds_message_put+0x44/0x80 [rds]
> [] rds_ib_send_cqe_handler+0x134/0x2d0 [rds_rdma]
> [] ? _raw_spin_unlock_irqrestore+0x1b/0x50
> [] ? mlx4_ib_poll_cq+0xb3/0x2a0 [mlx4_ib]
> [] poll_cq+0xa1/0xe0 [rds_rdma]
> [] rds_ib_tasklet_fn_send+0x79/0xf0 [rds_rdma]
> >>>
> >>>The most likely possibility is that there is a 60-second-long loop in
> >>>one of the above functions. This is within bottom-half execution, so
> >>>unfortunately the usual trick of placing cond_resched_rcu_qs() within this
> >>>loop, but outside of any RCU read-side critical section does not work.
> >>>
> >>First of all thanks for explanation.
> >>
> >>There is no loop which can last for 60 seconds in above code since
> >>its just completion queue handler used to free up buffers much like
> >>NIC drivers bottom half(NAPI). Its done in tasklet context for latency
> >>reasons which RDS care most. Just to get your attention, the RCU
> >>stall is also seen with XEN code too. Log for it end of the email.
> >>
> >>Another important observation is, for RDS if we avoid higher
> >>order page(s) allocation, issue is not reproducible so far.
> >>In other words, for PAGE_SIZE(4K, get_order(bytes) ==0) allocations,
> >>the system continues to run without any issue, so the loop scenario
> >>is ruled out more or less.
> >>
> >>To be specific, with PAGE_SIZE allocations, alloc_pages()
> >>is just allocating a page and __free_page() is used
> >>instead of __free_pages() from below snippet.
> >>
> >>--
> >>if (bytes >= PAGE_SIZE)
> >>	page = alloc_pages(gfp, get_order(bytes));
> >>
> >>.
> >>
> >>(rm->data.op_sg[i].length <= PAGE_SIZE) ?
> >>	__free_page(sg_page(&rm->data.op_sg[i])) :
> >>	__free_pages(sg_page(&rm->data.op_sg[i]),
> >>		     get_order(rm->data.op_sg[i].length));
> >>
> >
> >This sounds like something to take up with the mm folks.
> >
> Sure. Will do once the link between two issues is established.

Fair enough!

> >>>Therefore, if there really is a loop here, one fix would be to
> >>>periodically unwind back out to run_ksoftirqd(), but setting up so that
> >>>the work would be continued later. Another fix might be to move
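
The quoted observation hinges on the allocation order: a PAGE_SIZE request is order 0, anything larger rounds up to a power-of-two number of pages. A minimal userspace sketch of that arithmetic follows. get_order_sketch() is a hypothetical stand-in for the kernel's get_order(), and the 4 KiB page size is an assumption taken from the "PAGE_SIZE(4K" remark in the quoted text.

```c
#include <stddef.h>

#define PAGE_SHIFT	12	/* assumed 4 KiB pages */
#define PAGE_SIZE	((size_t)1 << PAGE_SHIFT)

/*
 * Smallest order such that (PAGE_SIZE << order) >= bytes.
 * bytes must be greater than zero.
 */
static int get_order_sketch(size_t bytes)
{
	size_t pages = (bytes - 1) >> PAGE_SHIFT;	/* pages needed, minus one */
	int order = 0;

	while (pages) {
		order++;
		pages >>= 1;
	}
	return order;
}
```

For example, 4096 bytes gives order 0 (one page), 8192 bytes gives order 1 (two pages), and 8193 bytes gives order 2 (four pages), so only the first case would go through the single-page __free_page() branch of the quoted snippet.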
[RFC][PATCH 7/7] locking,asm-generic: Remove generic rwsem add and rwsem update definitions
The rwsem count has been converted to an atomic variable and we now
directly use atomic_long_add() and atomic_long_add_return() on the count,
so we can remove the asm-generic implementation of rwsem_atomic_add()
and rwsem_atomic_update().

Signed-off-by: Jason Low
---
 include/asm-generic/rwsem.h | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/include/asm-generic/rwsem.h b/include/asm-generic/rwsem.h
index 3fc94a0..dd9db88 100644
--- a/include/asm-generic/rwsem.h
+++ b/include/asm-generic/rwsem.h
@@ -107,14 +107,6 @@ static inline void __up_write(struct rw_semaphore *sem)
 }

 /*
- * implement atomic add functionality
- */
-static inline void rwsem_atomic_add(long delta, struct rw_semaphore *sem)
-{
-	atomic_long_add(delta, (atomic_long_t *)&sem->count);
-}
-
-/*
  * downgrade write lock to read lock
  */
 static inline void __downgrade_write(struct rw_semaphore *sem)
@@ -134,13 +126,5 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 		rwsem_downgrade_wake(sem);
 }

-/*
- * implement exchange and add functionality
- */
-static inline long rwsem_atomic_update(long delta, struct rw_semaphore *sem)
-{
-	return atomic_long_add_return(delta, (atomic_long_t *)&sem->count);
-}
-
 #endif	/* __KERNEL__ */
 #endif	/* _ASM_GENERIC_RWSEM_H */
-- 
2.1.4
[RFC][PATCH 6/7] locking,s390: Remove s390 rwsem add and rwsem update
The rwsem count has been converted to an atomic variable and the rwsem
code now directly uses atomic_long_add() and atomic_long_add_return(),
so we can remove the s390 implementation of rwsem_atomic_add() and
rwsem_atomic_update().

Signed-off-by: Jason Low
---
 arch/s390/include/asm/rwsem.h | 37 -------------------------------------
 1 file changed, 37 deletions(-)

diff --git a/arch/s390/include/asm/rwsem.h b/arch/s390/include/asm/rwsem.h
index c75e447..597e7e9 100644
--- a/arch/s390/include/asm/rwsem.h
+++ b/arch/s390/include/asm/rwsem.h
@@ -207,41 +207,4 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 		rwsem_downgrade_wake(sem);
 }

-/*
- * implement atomic add functionality
- */
-static inline void rwsem_atomic_add(long delta, struct rw_semaphore *sem)
-{
-	signed long old, new;
-
-	asm volatile(
-		"	lg	%0,%2\n"
-		"0:	lgr	%1,%0\n"
-		"	agr	%1,%4\n"
-		"	csg	%0,%1,%2\n"
-		"	jl	0b"
-		: "=&d" (old), "=&d" (new), "=Q" (sem->count)
-		: "Q" (sem->count), "d" (delta)
-		: "cc", "memory");
-}
-
-/*
- * implement exchange and add functionality
- */
-static inline long rwsem_atomic_update(long delta, struct rw_semaphore *sem)
-{
-	signed long old, new;
-
-	asm volatile(
-		"	lg	%0,%2\n"
-		"0:	lgr	%1,%0\n"
-		"	agr	%1,%4\n"
-		"	csg	%0,%1,%2\n"
-		"	jl	0b"
-		: "=&d" (old), "=&d" (new), "=Q" (sem->count)
-		: "Q" (sem->count), "d" (delta)
-		: "cc", "memory");
-	return new;
-}
-
 #endif /* _S390_RWSEM_H */
-- 
2.1.4
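
The removed s390 helpers implement "add and return the new value" as a load, add, compare-and-swap, retry loop (lg / agr / csg / jl). A minimal userspace sketch of the same loop with C11 atomics is shown below. cas_add_return() is a hypothetical name, and this is only an illustration of the technique, not the kernel's atomic_long_add_return() implementation.

```c
#include <stdatomic.h>

/*
 * Add delta to *count with a compare-and-swap retry loop and
 * return the value that was stored.
 */
static long cas_add_return(atomic_long *count, long delta)
{
	long old = atomic_load(count);
	long newval;

	do {
		newval = old + delta;
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(count, &old, newval));

	return newval;
}
```

The loop only retries when another thread changed the counter between the load and the swap, which is the role of the `jl 0b` branch in the removed assembly.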
[RFC][PATCH 3/7] locking,x86: Remove x86 rwsem add and rwsem update
The rwsem count has been converted to an atomic variable and the rwsem
code now directly uses atomic_long_add() and atomic_long_add_return(),
so we can remove the x86 implementation of rwsem_atomic_add() and
rwsem_atomic_update().

Signed-off-by: Jason Low
---
 arch/x86/include/asm/rwsem.h | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index d2f8d10..91cf42c 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -215,23 +215,5 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 		     : "memory", "cc");
 }

-/*
- * implement atomic add functionality
- */
-static inline void rwsem_atomic_add(long delta, struct rw_semaphore *sem)
-{
-	asm volatile(LOCK_PREFIX _ASM_ADD "%1,%0"
-		     : "+m" (sem->count)
-		     : "er" (delta));
-}
-
-/*
- * implement exchange and add functionality
- */
-static inline long rwsem_atomic_update(long delta, struct rw_semaphore *sem)
-{
-	return delta + xadd(&sem->count, delta);
-}
-
 #endif /* __KERNEL__ */
 #endif /* _ASM_X86_RWSEM_H */
-- 
2.1.4
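
The removed x86 rwsem_atomic_update() returns `delta + xadd(...)`: xadd yields the value before the addition, so adding delta again gives the new value. A minimal userspace sketch of the same identity with C11 atomics follows. fetch_add_return() is a hypothetical name, and atomic_fetch_add() is used here as a stand-in for xadd since it also returns the previous value.

```c
#include <stdatomic.h>

/* Return the value of *count after adding delta. */
static long fetch_add_return(atomic_long *count, long delta)
{
	/* atomic_fetch_add() returns the value before the addition. */
	return delta + atomic_fetch_add(count, delta);
}
```

This is also why the 2/7 patch can write `atomic_long_add_return(adjustment, ...) - adjustment` to recover the old count.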
[RFC][PATCH 4/7] locking,alpha: Remove Alpha rwsem add and rwsem update
The rwsem count has been converted to an atomic variable and the rwsem
code now directly uses atomic_long_add() and atomic_long_add_return(),
so we can remove the alpha implementation of rwsem_atomic_add() and
rwsem_atomic_update().

Signed-off-by: Jason Low
---
 arch/alpha/include/asm/rwsem.h | 42 ------------------------------------------
 1 file changed, 42 deletions(-)

diff --git a/arch/alpha/include/asm/rwsem.h b/arch/alpha/include/asm/rwsem.h
index 0131a70..a217bf8 100644
--- a/arch/alpha/include/asm/rwsem.h
+++ b/arch/alpha/include/asm/rwsem.h
@@ -191,47 +191,5 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 		rwsem_downgrade_wake(sem);
 }

-static inline void rwsem_atomic_add(long val, struct rw_semaphore *sem)
-{
-#ifndef	CONFIG_SMP
-	sem->count += val;
-#else
-	long temp;
-	__asm__ __volatile__(
-	"1:	ldq_l	%0,%1\n"
-	"	addq	%0,%2,%0\n"
-	"	stq_c	%0,%1\n"
-	"	beq	%0,2f\n"
-	".subsection 2\n"
-	"2:	br	1b\n"
-	".previous"
-	:"=&r" (temp), "=m" (sem->count)
-	:"Ir" (val), "m" (sem->count));
-#endif
-}
-
-static inline long rwsem_atomic_update(long val, struct rw_semaphore *sem)
-{
-#ifndef	CONFIG_SMP
-	sem->count += val;
-	return sem->count;
-#else
-	long ret, temp;
-	__asm__ __volatile__(
-	"1:	ldq_l	%0,%1\n"
-	"	addq	%0,%3,%2\n"
-	"	addq	%0,%3,%0\n"
-	"	stq_c	%2,%1\n"
-	"	beq	%2,2f\n"
-	".subsection 2\n"
-	"2:	br	1b\n"
-	".previous"
-	:"=&r" (ret), "=m" (sem->count), "=&r" (temp)
-	:"Ir" (val), "m" (sem->count));
-
-	return ret;
-#endif
-}
-
 #endif /* __KERNEL__ */
 #endif /* _ALPHA_RWSEM_H */
-- 
2.1.4
[RFC][PATCH 5/7] locking,ia64: Remove ia64 rwsem add and rwsem update
The rwsem count has been converted to an atomic variable and the rwsem
code now directly uses atomic_long_add() and atomic_long_add_return(),
so we can remove the ia64 implementation of rwsem_atomic_add() and
rwsem_atomic_update().

Signed-off-by: Jason Low
---
 arch/ia64/include/asm/rwsem.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/ia64/include/asm/rwsem.h b/arch/ia64/include/asm/rwsem.h
index 8b23e07..dfd5895 100644
--- a/arch/ia64/include/asm/rwsem.h
+++ b/arch/ia64/include/asm/rwsem.h
@@ -151,11 +151,4 @@ __downgrade_write (struct rw_semaphore *sem)
 		rwsem_downgrade_wake(sem);
 }

-/*
- * Implement atomic add functionality.  These used to be "inline" functions, but GCC v3.1
- * doesn't quite optimize this stuff right and ends up with bad calls to fetchandadd.
- */
-#define rwsem_atomic_add(delta, sem)	atomic64_add(delta, (atomic64_t *)(&(sem)->count))
-#define rwsem_atomic_update(delta, sem)	atomic64_add_return(delta, (atomic64_t *)(&(sem)->count))
-
 #endif /* _ASM_IA64_RWSEM_H */
-- 
2.1.4
[RFC][PATCH 2/7] locking/rwsem: Convert sem->count to atomic_long_t
Convert the rwsem count variable to an atomic_long_t since we use it
as an atomic variable. This also allows us to remove the
rwsem_atomic_{add,update} "abstraction" which would now be an unnecessary
level of indirection. In follow up patches, we also remove the
rwsem_atomic_{add,update} definitions across the various architectures.

Suggested-by: Peter Zijlstra
Signed-off-by: Jason Low
---
 include/linux/rwsem.h       |  6 +++---
 kernel/locking/rwsem-xadd.c | 31 ++++++++++++++++---------------
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index d1c12d1..e3d5a00 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -26,7 +26,7 @@ struct rw_semaphore;
 #else
 /* All arch specific implementations share the same struct */
 struct rw_semaphore {
-	long count;
+	atomic_long_t count;
 	struct list_head wait_list;
 	raw_spinlock_t wait_lock;
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
@@ -54,7 +54,7 @@ extern struct rw_semaphore *rwsem_downgrade_wake(struct rw_semaphore *sem);
 /* In all implementations count != 0 means locked */
 static inline int rwsem_is_locked(struct rw_semaphore *sem)
 {
-	return sem->count != 0;
+	return atomic_long_read(&sem->count) != 0;
 }

 #endif
@@ -74,7 +74,7 @@ static inline int rwsem_is_locked(struct rw_semaphore *sem)
 #endif

 #define __RWSEM_INITIALIZER(name)				\
-	{ .count = RWSEM_UNLOCKED_VALUE,			\
+	{ .count = ATOMIC_LONG_INIT(RWSEM_UNLOCKED_VALUE),	\
 	  .wait_list = LIST_HEAD_INIT((name).wait_list),	\
 	  .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(name.wait_lock)	\
 	  __RWSEM_OPT_INIT(name)				\
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 296d421..d5ecec3 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -80,7 +80,7 @@ void __init_rwsem(struct rw_semaphore *sem, const char *name,
 	debug_check_no_locks_freed((void *)sem, sizeof(*sem));
 	lockdep_init_map(&sem->dep_map, name, key, 0);
 #endif
-	sem->count = RWSEM_UNLOCKED_VALUE;
+	atomic_long_set(&sem->count, RWSEM_UNLOCKED_VALUE);
 	raw_spin_lock_init(&sem->wait_lock);
 	INIT_LIST_HEAD(&sem->wait_list);
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
@@ -146,10 +146,11 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
 	if (wake_type != RWSEM_WAKE_READ_OWNED) {
 		adjustment = RWSEM_ACTIVE_READ_BIAS;
  try_reader_grant:
-		oldcount = rwsem_atomic_update(adjustment, sem) - adjustment;
+		oldcount = atomic_long_add_return(adjustment, &sem->count) - adjustment;
+
 		if (unlikely(oldcount < RWSEM_WAITING_BIAS)) {
 			/* A writer stole the lock. Undo our reader grant. */
-			if (rwsem_atomic_update(-adjustment, sem) &
+			if (atomic_long_sub_return(adjustment, &sem->count) &
 						RWSEM_ACTIVE_MASK)
 				goto out;
 			/* Last active locker left. Retry waking readers. */
@@ -179,7 +180,7 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
 		adjustment -= RWSEM_WAITING_BIAS;

 	if (adjustment)
-		rwsem_atomic_add(adjustment, sem);
+		atomic_long_add(adjustment, &sem->count);

 	next = sem->wait_list.next;
 	loop = woken;
@@ -228,7 +229,7 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 	list_add_tail(&waiter.list, &sem->wait_list);

 	/* we're now waiting on the lock, but no longer actively locking */
-	count = rwsem_atomic_update(adjustment, sem);
+	count = atomic_long_add_return(adjustment, &sem->count);

 	/* If there are no active locks, wake the front queued process(es).
 	 *
@@ -276,7 +277,8 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
 		    RWSEM_ACTIVE_WRITE_BIAS :
 		    RWSEM_ACTIVE_WRITE_BIAS + RWSEM_WAITING_BIAS;

-	if (cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS, count) == RWSEM_WAITING_BIAS) {
+	if (atomic_long_cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS, count)
+							== RWSEM_WAITING_BIAS) {
 		rwsem_set_owner(sem);
 		return true;
 	}
@@ -290,13 +292,13 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
  */
 static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 {
-	long old, count = READ_ONCE(sem->count);
+	long old, count = atomic_long_read(&sem->count);

 	while (true) {
 		if (!(count == 0 || count == RWSEM_WAITING_BIAS))
 			return false;

-		old =
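
The try_reader_grant hunk above follows an "add optimistically, check the old value, undo on conflict" pattern. A minimal userspace sketch of that pattern with C11 atomics is shown below. The constants are modeled on the 32-bit layout of the kernel's RWSEM_* values and are an assumption here, and try_reader_grant() as a standalone function is hypothetical (in the patch it is a label inside __rwsem_do_wake()).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed values, modeled on the 32-bit RWSEM_* layout. */
#define ACTIVE_MASK		0x0000ffffL
#define ACTIVE_BIAS		0x00000001L
#define WAITING_BIAS		(-0x00010000L)
#define ACTIVE_READ_BIAS	ACTIVE_BIAS
#define ACTIVE_WRITE_BIAS	(WAITING_BIAS + ACTIVE_BIAS)

/*
 * Optimistically grant one reader. Returns true if the grant stands,
 * false if a writer holds the lock and the grant was undone.
 */
static bool try_reader_grant(atomic_long *count)
{
	const long adjustment = ACTIVE_READ_BIAS;
	long oldcount;

	for (;;) {
		/* fetch_add returns the old value directly. */
		oldcount = atomic_fetch_add(count, adjustment);
		if (oldcount >= WAITING_BIAS)
			return true;

		/* A writer stole the lock: undo the reader grant. */
		if ((atomic_fetch_sub(count, adjustment) - adjustment) &
		    ACTIVE_MASK)
			return false;

		/* Last active locker left in the meantime: retry. */
	}
}
```

The retry branch only makes progress when another thread changes the count concurrently, so a single-threaded caller should only exercise the two return paths.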
[RFC][PATCH 1/7] locking/rwsem: Optimize write lock by reducing operations in slowpath
When acquiring the rwsem write lock in the slowpath, we first try
to set count to RWSEM_WAITING_BIAS. When that is successful,
we then atomically add the RWSEM_WAITING_BIAS in cases where
there are other tasks on the wait list. This causes write lock
operations to often issue multiple atomic operations.

We can instead make the list_is_singular() check first, and then
set the count accordingly, so that we issue at most 1 atomic
operation when acquiring the write lock and reduce unnecessary
cacheline contention.

Signed-off-by: Jason Low
Acked-by: Waiman Long
Acked-by: Davidlohr Bueso
---
 kernel/locking/rwsem-xadd.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 09e30c6..296d421 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -255,17 +255,28 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 }
 EXPORT_SYMBOL(rwsem_down_read_failed);

+/*
+ * This function must be called with the sem->wait_lock held to prevent
+ * race conditions between checking the rwsem wait list and setting the
+ * sem->count accordingly.
+ */
 static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
 {
 	/*
-	 * Try acquiring the write lock. Check count first in order
-	 * to reduce unnecessary expensive cmpxchg() operations.
+	 * Avoid trying to acquire write lock if count isn't RWSEM_WAITING_BIAS.
 	 */
-	if (count == RWSEM_WAITING_BIAS &&
-	    cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS,
-		    RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
-		if (!list_is_singular(&sem->wait_list))
-			rwsem_atomic_update(RWSEM_WAITING_BIAS, sem);
+	if (count != RWSEM_WAITING_BIAS)
+		return false;
+
+	/*
+	 * Acquire the lock by trying to set it to ACTIVE_WRITE_BIAS. If there
+	 * are other tasks on the wait list, we need to add on WAITING_BIAS.
+	 */
+	count = list_is_singular(&sem->wait_list) ?
+		    RWSEM_ACTIVE_WRITE_BIAS :
+		    RWSEM_ACTIVE_WRITE_BIAS + RWSEM_WAITING_BIAS;
+
+	if (cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS, count) == RWSEM_WAITING_BIAS) {
 		rwsem_set_owner(sem);
 		return true;
 	}
-- 
2.1.4
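
The change above decides the final count value before the compare-and-swap, so the write lock is taken with at most one atomic operation. A minimal userspace sketch of that idea with C11 atomics follows. The constants are modeled on the 32-bit layout of the kernel's RWSEM_* values and are an assumption, and the only_waiter flag is a hypothetical stand-in for the list_is_singular() check, which in the patch is made under sem->wait_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed values, modeled on the 32-bit RWSEM_* layout. */
#define ACTIVE_BIAS		0x00000001L
#define WAITING_BIAS		(-0x00010000L)
#define ACTIVE_WRITE_BIAS	(WAITING_BIAS + ACTIVE_BIAS)

/*
 * Try to take the write lock with a single compare-and-swap.
 * only_waiter says whether the caller is the sole queued waiter.
 */
static bool try_write_lock(atomic_long *count, bool only_waiter)
{
	long expected = WAITING_BIAS;
	long desired = only_waiter ?
		ACTIVE_WRITE_BIAS :
		ACTIVE_WRITE_BIAS + WAITING_BIAS;

	/* Cheap read first: skip the CAS when it cannot succeed. */
	if (atomic_load_explicit(count, memory_order_relaxed) != WAITING_BIAS)
		return false;

	return atomic_compare_exchange_strong_explicit(count, &expected,
			desired, memory_order_acquire, memory_order_relaxed);
}
```

Compared with the old sequence (cmpxchg to ACTIVE_WRITE_BIAS, then a second atomic add of WAITING_BIAS when other waiters exist), the counter's cacheline is written once per successful acquisition.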
[RFC][PATCH 2/7] locking/rwsem: Convert sem->count to atomic_long_t
Convert the rwsem count variable to an atomic_long_t since we use it
as an atomic variable. This also allows us to remove the
rwsem_atomic_{add,update} "abstraction" which would now be an
unnecessary level of indirection.

In follow-up patches, we also remove the rwsem_atomic_{add,update}
definitions across the various architectures.

Suggested-by: Peter Zijlstra
Signed-off-by: Jason Low
---
 include/linux/rwsem.h       |  6 +++---
 kernel/locking/rwsem-xadd.c | 31 ++++++++++++++++---------------
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index d1c12d1..e3d5a00 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -26,7 +26,7 @@ struct rw_semaphore;
 #else
 /* All arch specific implementations share the same struct */
 struct rw_semaphore {
-	long count;
+	atomic_long_t count;
 	struct list_head wait_list;
 	raw_spinlock_t wait_lock;
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
@@ -54,7 +54,7 @@ extern struct rw_semaphore *rwsem_downgrade_wake(struct rw_semaphore *sem);
 /* In all implementations count != 0 means locked */
 static inline int rwsem_is_locked(struct rw_semaphore *sem)
 {
-	return sem->count != 0;
+	return atomic_long_read(&sem->count) != 0;
 }
 
 #endif
@@ -74,7 +74,7 @@ static inline int rwsem_is_locked(struct rw_semaphore *sem)
 #endif
 
 #define __RWSEM_INITIALIZER(name)				\
-	{ .count = RWSEM_UNLOCKED_VALUE,			\
+	{ .count = ATOMIC_LONG_INIT(RWSEM_UNLOCKED_VALUE),	\
 	  .wait_list = LIST_HEAD_INIT((name).wait_list),	\
 	  .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(name.wait_lock)	\
 	  __RWSEM_OPT_INIT(name)				\
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 296d421..d5ecec3 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -80,7 +80,7 @@ void __init_rwsem(struct rw_semaphore *sem, const char *name,
 	debug_check_no_locks_freed((void *)sem, sizeof(*sem));
 	lockdep_init_map(&sem->dep_map, name, key, 0);
 #endif
-	sem->count = RWSEM_UNLOCKED_VALUE;
+	atomic_long_set(&sem->count, RWSEM_UNLOCKED_VALUE);
 	raw_spin_lock_init(&sem->wait_lock);
 	INIT_LIST_HEAD(&sem->wait_list);
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
@@ -146,10 +146,11 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
 	if (wake_type != RWSEM_WAKE_READ_OWNED) {
 		adjustment = RWSEM_ACTIVE_READ_BIAS;
  try_reader_grant:
-		oldcount = rwsem_atomic_update(adjustment, sem) - adjustment;
+		oldcount = atomic_long_add_return(adjustment, &sem->count) - adjustment;
+
 		if (unlikely(oldcount < RWSEM_WAITING_BIAS)) {
 			/* A writer stole the lock. Undo our reader grant. */
-			if (rwsem_atomic_update(-adjustment, sem) &
+			if (atomic_long_sub_return(adjustment, &sem->count) &
 						RWSEM_ACTIVE_MASK)
 				goto out;
 			/* Last active locker left. Retry waking readers. */
@@ -179,7 +180,7 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
 		adjustment -= RWSEM_WAITING_BIAS;
 
 	if (adjustment)
-		rwsem_atomic_add(adjustment, sem);
+		atomic_long_add(adjustment, &sem->count);
 
 	next = sem->wait_list.next;
 	loop = woken;
@@ -228,7 +229,7 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 	list_add_tail(&waiter.list, &sem->wait_list);
 
 	/* we're now waiting on the lock, but no longer actively locking */
-	count = rwsem_atomic_update(adjustment, sem);
+	count = atomic_long_add_return(adjustment, &sem->count);
 
 	/* If there are no active locks, wake the front queued process(es).
 	 *
@@ -276,7 +277,8 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
 			RWSEM_ACTIVE_WRITE_BIAS :
 			RWSEM_ACTIVE_WRITE_BIAS + RWSEM_WAITING_BIAS;
 
-	if (cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS, count) == RWSEM_WAITING_BIAS) {
+	if (atomic_long_cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS, count)
+							== RWSEM_WAITING_BIAS) {
 		rwsem_set_owner(sem);
 		return true;
 	}
@@ -290,13 +292,13 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
  */
 static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 {
-	long old, count = READ_ONCE(sem->count);
+	long old, count = atomic_long_read(&sem->count);
 
 	while (true) {
 		if (!(count == 0 || count == RWSEM_WAITING_BIAS))
 			return false;
 
-		old = cmpxchg_acquire(&sem->count, count,
+		old =
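
For readers without a kernel tree at hand, the reader-grant step in __rwsem_do_wake() can be approximated in userspace with C11 atomics. This is only a rough sketch under stated assumptions: the toy_* names are hypothetical, the bias values are modeled on the 64-bit rwsem layout, and the kernel's retry path (taken when the undo leaves no active lockers) is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: bias values modeled on the kernel's 64-bit rwsem layout. */
#define TOY_ACTIVE_MASK       0xffffffffLL
#define TOY_ACTIVE_READ_BIAS  1LL
#define TOY_WAITING_BIAS      (-TOY_ACTIVE_MASK - 1)

/*
 * Hypothetical userspace stand-in for atomic_long_add_return():
 * atomically adds delta and returns the NEW value.
 */
static long long toy_add_return(atomic_llong *v, long long delta)
{
	return atomic_fetch_add(v, delta) + delta;
}

/*
 * Speculatively grant one reader, then look at the value the count
 * had before the add. If it was below WAITING_BIAS a writer owns the
 * lock, so the grant is undone. (The kernel additionally retries when
 * the undo shows that the last active locker has left.)
 */
static bool toy_try_reader_grant(atomic_llong *count)
{
	const long long adjustment = TOY_ACTIVE_READ_BIAS;
	long long oldcount = toy_add_return(count, adjustment) - adjustment;

	if (oldcount < TOY_WAITING_BIAS) {
		toy_add_return(count, -adjustment);
		return false;
	}
	return true;
}
```

The point of the conversion is visible here: once the count is an atomic type, every access goes through an explicit atomic operation and no per-architecture wrapper is needed.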
[RFC][PATCH 1/7] locking/rwsem: Optimize write lock by reducing operations in slowpath
When acquiring the rwsem write lock in the slowpath, we first try
to set count to RWSEM_WAITING_BIAS. When that is successful,
we then atomically add the RWSEM_WAITING_BIAS in cases where
there are other tasks on the wait list. This causes write lock
operations to often issue multiple atomic operations.

We can instead make the list_is_singular() check first, and then
set the count accordingly, so that we issue at most 1 atomic
operation when acquiring the write lock and reduce unnecessary
cacheline contention.

Signed-off-by: Jason Low
Acked-by: Waiman Long
Acked-by: Davidlohr Bueso
---
 kernel/locking/rwsem-xadd.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 09e30c6..296d421 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -255,17 +255,28 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 }
 EXPORT_SYMBOL(rwsem_down_read_failed);
 
+/*
+ * This function must be called with the sem->wait_lock held to prevent
+ * race conditions between checking the rwsem wait list and setting the
+ * sem->count accordingly.
+ */
 static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
 {
 	/*
-	 * Try acquiring the write lock. Check count first in order
-	 * to reduce unnecessary expensive cmpxchg() operations.
+	 * Avoid trying to acquire write lock if count isn't RWSEM_WAITING_BIAS.
 	 */
-	if (count == RWSEM_WAITING_BIAS &&
-	    cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS,
-		    RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
-		if (!list_is_singular(&sem->wait_list))
-			rwsem_atomic_update(RWSEM_WAITING_BIAS, sem);
+	if (count != RWSEM_WAITING_BIAS)
+		return false;
+
+	/*
+	 * Acquire the lock by trying to set it to ACTIVE_WRITE_BIAS. If there
+	 * are other tasks on the wait list, we need to add on WAITING_BIAS.
+	 */
+	count = list_is_singular(&sem->wait_list) ?
+			RWSEM_ACTIVE_WRITE_BIAS :
+			RWSEM_ACTIVE_WRITE_BIAS + RWSEM_WAITING_BIAS;
+
+	if (cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS, count) == RWSEM_WAITING_BIAS) {
 		rwsem_set_owner(sem);
 		return true;
 	}
-- 
2.1.4
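
The "decide first, then one cmpxchg" idea from this patch can be tried outside the kernel with C11 atomics. The following is a minimal sketch, not the kernel code: struct toy_rwsem, its nr_waiters field (standing in for the list_is_singular() test on sem->wait_list) and the toy_* names are hypothetical, the bias values are assumed to follow the 64-bit layout, and the wait_lock that the real function requires is not modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: bias values modeled on the kernel's 64-bit rwsem layout. */
#define TOY_ACTIVE_MASK        0xffffffffLL
#define TOY_ACTIVE_BIAS        1LL
#define TOY_WAITING_BIAS       (-TOY_ACTIVE_MASK - 1)
#define TOY_ACTIVE_WRITE_BIAS  (TOY_WAITING_BIAS + TOY_ACTIVE_BIAS)

struct toy_rwsem {
	atomic_llong count;
	int nr_waiters;	/* hypothetical stand-in for sem->wait_list */
};

/*
 * Pick the final count value before touching the shared word, so the
 * write lock is taken with at most one atomic operation.
 */
static bool toy_try_write_lock(long long count, struct toy_rwsem *sem)
{
	long long expected = TOY_WAITING_BIAS;
	long long desired;

	/* Cheap check first: skip the cmpxchg if it cannot succeed. */
	if (count != TOY_WAITING_BIAS)
		return false;

	/* Only waiter: plain write bias. Others queued: keep WAITING_BIAS too. */
	desired = (sem->nr_waiters == 1) ?
			TOY_ACTIVE_WRITE_BIAS :
			TOY_ACTIVE_WRITE_BIAS + TOY_WAITING_BIAS;

	return atomic_compare_exchange_strong_explicit(&sem->count, &expected,
						       desired,
						       memory_order_acquire,
						       memory_order_relaxed);
}
```

Compared with the old flow (cmpxchg to ACTIVE_WRITE_BIAS, then a second atomic add when other waiters exist), the shared cacheline is written once per successful acquisition.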
[RFC][PATCH 0/7] locking/rwsem: Convert rwsem count to atomic_long_t
The first patch contains an optimization for acquiring the rwsem
write lock in the slowpath.

The rest of the series converts the rwsem count variable to an
atomic_long_t since it is used as an atomic variable. This allows
us to also remove the rwsem_atomic_{add,update} abstraction and
reduce 100+ lines of code.

 arch/alpha/include/asm/rwsem.h | 42
 arch/ia64/include/asm/rwsem.h  |  7
 arch/s390/include/asm/rwsem.h  | 37
 arch/x86/include/asm/rwsem.h   | 18
 include/asm-generic/rwsem.h    | 16
 include/linux/rwsem.h          |  6
 kernel/locking/rwsem-xadd.c    | 54
 7 files changed, 36 insertions(+), 144 deletions(-)

-- 
2.1.4
RE: [PATCH v2 0/4] ACPI 2.0: Enable TermList interpretion for table loading
Hi, Rafael

Can we queue this up in linux-next?
ASLTS recursive tests are done in ACPICA upstream and no regressions can be
seen. We need more tests around this experimental change from real users to
have the chance to learn the unknown cases.
If they reported regressions, we could stop the regressions by reverting
[PATCH 4/4]. So it should be safe to do such experiments in the Linux
upstream.

Thanks in advance.
Best regards
-Lv

> From: Zheng, Lv
> Subject: [PATCH v2 0/4] ACPI 2.0: Enable TermList interpretion for table
> loading
>
> MLC (module level code) is an ACPICA terminology describing the AML code
> out of any control method, currently only Type1Opcode (If/Else/While)
> wrapped MLC code blocks are executed by the AML interpreter after the table
> loading. But the issue which is fixed by this patchset is:
>    Not only Type1Opcode, but also Type2Opcode will be executed as MLC and
>    MLC is not executed after loading the table, but is executed right in
>    place.
>
> The following AML code is assembled into a static loading SSDT, and used
> as an instrumentation to pry into the de-facto standard AML interpreter
> behaviors:
>   Name (ECOK, Zero)
>   Scope (\)
>   {
>       DBUG ("TermList 1")
>       If (LEqual (ECOK, Zero))
>       {
>           DBUG ("TermList 2")
>           Device (MDEV)
>           {
>               DBUG ("TermList 3")
>               If (CondRefOf (MDEV))
>               {
>                   DBUG ("MDEV exists")
>               }
>               If (CondRefOf (MDEV._STA))
>               {
>                   DBUG ("MDEV._STA exists")
>               }
>               If (CondRefOf (\_SB.PCI0.EC))
>               {
>                   DBUG ("\\_SB.PCI0.EC exists")
>               }
>               Name (_HID, EisaId ("PNP"))
>               Method (_STA, 0, Serialized)
>               {
>                   DBUG ("\\_SB.MDEV._STA")
>                   Return (0x0F)
>               }
>           }
>           DBUG ("TermList 4")
>       }
>       Method (_INI, 0, Serialized)
>       {
>           DBUG ("\\_SB._INI")
>       }
>   }
>   Scope (_SB.PCI0)
>   {
>       Device (EC)
>       {
>           ...
>       }
>   }
> The DBUG function is a function to write the debugging messages into a
> SystemIo debug port.
>
> Running Windows with the BIOS providing this SSDT via RSDT, the following
> messages are obtained from the debug port:
>   TermList 1
>   TermList 2
>   TermList 3
>   \_SB.MDEV exists
>   TermList 4
>   \_SB._INI
>   ...
>
> This test reveals the de-facto grammar for the AMLCode to us:
> 1. During the table loading, MLC will be executed by the interpreter, this
>    is partially supported by the current ACPICA;
> 2. For SystemIo, not only after the _REG(1, 1) is evaluated (current ACPICA
>    interpreter limitation), but when the table is being loaded, the
>    SystemIo (the debugging port) is accessible, this is recently fixed in
>    the upstream, now all early operation regions are accessible during the
>    table loading;
> 3. Not only Type1Opcode, but also Type2Opcode will be executed as MLC and
>    MLC is not executed after loading the table, but is executed right in
>    place, the Linux upstream is not compliant to this behavior.
>
> The last compliance issue has already been clarified in the ACPI 2.0
> specification, so the compliance issue is not that Linux is not compliant
> to the de-facto standard OS, but that Linux is not compliant to ACPI 2.0.
> Definition block tables are in fact defined by the spec as TermList, which
> is no different from the control methods, thus the interpretation of the
> table should be no different from the control method evaluation:
>   AMLCode := DefBlockHeader TermList
>   DefMethod := MethodOp PkgLength NameString MethodFlags TermList
>
> Why is the ACPICA interpreter acting so differently from this definition?
> This is because there are many software entropies preventing this from
> being enabled, such entropies need to be cleaned up first in order not to
> trigger regressions for specific platforms. These entropies include:
> 1. ECDT support is broken. In fact, the original EC driver was correct, but
>    developers started to use the namespace EC instead of ECDT just because
>    several broken ECDT tables were reported on the bugzilla. They trusted
>    the namespace EC settings rather than the ECDT ones, this led to the
>    evaluation of _REG/_GPE/_CRS and namespace walk before executing the
>    module level AML opcodes. And the fixes in fact finally disable early EC
>    usages (used during table loading and early device enumeration
>    processes).
> 2. _REG evaluations are wrong. ACPICA provides APIs for OSPMs to register
>    operation region handlers. But for the early operation region accesses,
>    ACPI spec declares that the evaluations of _REG are not required, but
>    the ACPICA APIs do not avoid running _REG to meet this early
>    requirements. Code to fix this is partially
linux-next: manual merge of the net-next tree with the arm64 tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in: arch/arm64/Kconfig between commit: 8ee708792e1c ("arm64: Kconfig: remove redundant HAVE_ARCH_TRANSPARENT_HUGEPAGE definition") from the arm64 tree and commit: 606b5908 ("bpf: split HAVE_BPF_JIT into cBPF and eBPF variant") from the net-next tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/arm64/Kconfig index 8845c0d100d7,e6761ea2feec.. --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@@ -59,9 -58,7 +59,9 @@@ config ARM6 select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_TRACEHOOK + select HAVE_ARCH_TRANSPARENT_HUGEPAGE + select HAVE_ARM_SMCCC - select HAVE_BPF_JIT + select HAVE_EBPF_JIT select HAVE_C_RECORDMCOUNT select HAVE_CC_STACKPROTECTOR select HAVE_CMPXCHG_DOUBLE
linux-next: manual merge of the vfs tree with the ext4 tree
Hi all, Today's linux-next merge of the vfs tree got conflicts in: fs/ext4/ext4.h fs/ext4/indirect.c fs/ext4/inode.c between commit: 914f82a32d02 ("ext4: refactor direct IO code") from the ext4 tree and commit: c8b8e32d700f ("direct-io: eliminate the offset argument to ->direct_IO") from the vfs tree. I fixed it up (hopefully - see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc fs/ext4/ext4.h index b84aa1ca480a,72f4c9e00e97.. --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h diff --cc fs/ext4/indirect.c index bc15c2c17633,627b7e8f9ef3.. --- a/fs/ext4/indirect.c +++ b/fs/ext4/indirect.c diff --cc fs/ext4/inode.c index f9ab1e8cc416,79b298d397b4.. --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@@ -3327,13 -3334,12 +3327,13 @@@ static int ext4_end_io_dio(struct kioc * if the machine crashes during the write. 
* */ - static ssize_t ext4_direct_IO_write(struct kiocb *iocb, struct iov_iter *iter, - loff_t offset) -static ssize_t ext4_ext_direct_IO(struct kiocb *iocb, struct iov_iter *iter) ++static ssize_t ext4_direct_IO_write(struct kiocb *iocb, struct iov_iter *iter) { struct file *file = iocb->ki_filp; struct inode *inode = file->f_mapping->host; + struct ext4_inode_info *ei = EXT4_I(inode); ssize_t ret; + loff_t offset = iocb->ki_pos; size_t count = iov_iter_count(iter); int overwrite = 0; get_block_t *get_block_func = NULL; @@@ -3423,12 -3399,12 +3423,12 @@@ #ifdef CONFIG_EXT4_FS_ENCRYPTION BUG_ON(ext4_encrypted_inode(inode) && S_ISREG(inode->i_mode)); #endif - if (IS_DAX(inode)) + if (IS_DAX(inode)) { - ret = dax_do_io(iocb, inode, iter, offset, get_block_func, + ret = dax_do_io(iocb, inode, iter, get_block_func, ext4_end_io_dio, dio_flags); - else + } else ret = __blockdev_direct_IO(iocb, inode, - inode->i_sb->s_bdev, iter, offset, + inode->i_sb->s_bdev, iter, get_block_func, ext4_end_io_dio, NULL, dio_flags); @@@ -3451,82 -3428,6 +3451,82 @@@ if (overwrite) inode_lock(inode); + if (ret < 0 && final_size > inode->i_size) + ext4_truncate_failed_write(inode); + + /* Handle extending of i_size after direct IO write */ + if (orphan) { + int err; + + /* Credits for sb + inode write */ + handle = ext4_journal_start(inode, EXT4_HT_INODE, 2); + if (IS_ERR(handle)) { + /* This is really bad luck. We've written the data + * but cannot extend i_size. Bail out and pretend + * the write failed... */ + ret = PTR_ERR(handle); + if (inode->i_nlink) + ext4_orphan_del(NULL, inode); + + goto out; + } + if (inode->i_nlink) + ext4_orphan_del(handle, inode); + if (ret > 0) { + loff_t end = offset + ret; + if (end > inode->i_size) { + ei->i_disksize = end; + i_size_write(inode, end); + /* + * We're going to return a positive `ret' + * here due to non-zero-length I/O, so there's + * no way of reporting error returns from + * ext4_mark_inode_dirty() to userspace. So + * ignore it. 
+ */ + ext4_mark_inode_dirty(handle, inode); + } + } + err = ext4_journal_stop(handle); + if (ret == 0) + ret = err; + } +out: + return ret; +} + - static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter, - loff_t offset) ++static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter) +{ + int unlocked = 0; + struct inode *inode = iocb->ki_filp->f_mapping->host; ++ loff_t offset = iocb->ki_pos; + ssize_t ret; + + if (ext4_should_dioread_nolock(inode)) { + /* + * Nolock dioread
Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
On Wed, May 11, 2016 at 3:00 PM, Dean Jenkins wrote:
>
> Your observations are consistent with missing URBs from the USB host
> controller.
>
> Here is a summary of what I think is happening in your case:
>
> Good case:
> URB #1: 1514 octets of 1514 Ethernet frame (A)
> URB #2: 1514 octets of 1514 Ethernet frame (B) + 526 octets of 1514 Ethernet
> frame (C)
> URB #3: 988 octets of 1514 Ethernet frame (C)
> URB #4: 1514 octets of 1514 Ethernet frame (D)
>
> Therefore, Ethernet frame (C) is spanning URBs #2 and #3.
>
> Bad case, URB #3 is lost:
> URB #1: 1514 octets of 1514 Ethernet frame (A)
> URB #2: 1514 octets of 1514 Ethernet frame (B) + 526 octets of 1514 Ethernet
> frame (C)
> Remaining is 988
> URB #4: 1514 octets of 1514 Ethernet frame (D)
>
> But when URB #4 is analysed the 32-bit Header word is not found after 988
> octets in the URB buffer so "sync lost".
> The end of Ethernet frame (C) is missing so drop the Ethernet frame.
> Now look at the start of the URB #4 buffer and find a 32-bit header word so
> Ethernet frame (D) can be consumed.
>
> So I think the commit is acting as intended and you are suffering from lost
> URBs.

No. I went digging on this for a bit longer, and it looks like it's
just that you're calculating the offset wrong in your check.

I was wondering why without your patch we wouldn't see "Bad Header
Length" messages, since if the remaining was 988 and the skb->len was
2048 as seen in my logs, without your patch we should copy the 988
bytes out, clear remaining, and then continue processing the rest of
the skb, which calculates the header and checks the size. If we really
lost the URB, we should throw an error at that point, since really
we'd be midway through the following frame.

But we just don't see that with your patch removed.

Looking more closely, in the main loop, we do:
(where offset is zero, or set to "offset += (copy_length + 1) &
0xfffe" in the previous loop)
    rx->header = get_unaligned_le32(skb->data + offset);
    offset += sizeof(u32);

But your check calculates:
    offset = ((rx->remaining + 1) & 0xfffe) + sizeof(u32);
    rx->header = get_unaligned_le32(skb->data + offset);

Adding some debug logic to check those offset calculations used to find
rx->header, the one in your code is always too large by sizeof(u32).

So removing the extra addition in your offset calculation seems to
solve this for me. I'll send out a patch here shortly.

thanks
-john
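
The size of the discrepancy is easy to check in isolation. The sketch below is only an illustration of the two offset formulas quoted above, with hypothetical toy_* helper names. It assumes, as described in the mail, that the main loop reads the next header at the even-rounded copy length and only afterwards advances by sizeof(u32).

```c
#include <stddef.h>
#include <stdint.h>

/* Round a byte count up to an even value, as "(n + 1) & 0xfffe" does. */
static size_t toy_round_even(size_t n)
{
	return (n + 1) & 0xfffe;
}

/* Offset at which the main loop reads the next 32-bit header word. */
static size_t toy_header_offset_main_loop(size_t remaining)
{
	return toy_round_even(remaining);
}

/* Offset used by the quoted check: same rounding plus sizeof(u32). */
static size_t toy_header_offset_check(size_t remaining)
{
	return toy_round_even(remaining) + sizeof(uint32_t);
}
```

With the 988 remaining octets from the logs, the main loop would look for the header at offset 988 while the check looks at 992, which is four bytes into the frame payload.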
Re: [PATCH v2 0/8] crypto: caam - add support for LS1043A SoC
On Mon, May 16, 2016 at 03:49:27PM +, Horia Ioan Geanta Neag wrote:
>
> I assume it's too late for 4.7, however applying the patches would solve
> dependencies b/w on-going caam development.

I will be merging this after the merge window is closed.

Cheers,
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH v2 net-next] bpf: arm64: remove callee-save registers use for tmp registers
In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for
tmp registers, which are callee-saved registers. This leads to a variable
size of JIT prologue and epilogue. The latest blinding constant change
prefers a constant size of prologue and epilogue. AAPCS reserves R9 ~ R15
for temp registers which do not need to be saved/restored during a function
call. So, replace R23 and R24 with R10 and R11, and remove the tmp_used flag
to save 2 instructions for some jited BPF programs.

CC: Daniel Borkmann
Acked-by: Zi Shen Lim
Signed-off-by: Yang Shi
---
Changelog v1 --> v2:
 * Updated stack diagram
 * Added the comment from Zi for the commit log
 * Added Zi's Acked-by

Apply on top of Daniel's blinding constant patchset

 arch/arm64/net/bpf_jit_comp.c | 34 +++++-----------------------------
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index d0d5190..49ba37e 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -51,9 +51,9 @@ static const int bpf2a64[] = {
 	[BPF_REG_9] = A64_R(22),
 	/* read-only frame pointer to access stack */
 	[BPF_REG_FP] = A64_R(25),
-	/* temporary register for internal BPF JIT */
-	[TMP_REG_1] = A64_R(23),
-	[TMP_REG_2] = A64_R(24),
+	/* temporary registers for internal BPF JIT */
+	[TMP_REG_1] = A64_R(10),
+	[TMP_REG_2] = A64_R(11),
 	/* temporary register for blinding constants */
 	[BPF_REG_AX] = A64_R(9),
 };
@@ -61,7 +61,6 @@ static const int bpf2a64[] = {
 struct jit_ctx {
 	const struct bpf_prog *prog;
 	int idx;
-	int tmp_used;
 	int epilogue_offset;
 	int *offset;
 	u32 *image;
@@ -154,8 +153,6 @@ static void build_prologue(struct jit_ctx *ctx)
 	const u8 r8 = bpf2a64[BPF_REG_8];
 	const u8 r9 = bpf2a64[BPF_REG_9];
 	const u8 fp = bpf2a64[BPF_REG_FP];
-	const u8 tmp1 = bpf2a64[TMP_REG_1];
-	const u8 tmp2 = bpf2a64[TMP_REG_2];
 
 	/*
 	 * BPF prog stack layout
@@ -167,7 +164,7 @@ static void build_prologue(struct jit_ctx *ctx)
 	 *                        | ... | callee saved registers
 	 *                        +-----+
 	 *                        |     | x25/x26
-	 * BPF fp register => -80:+-----+ <= (BPF_FP)
+	 * BPF fp register => -64:+-----+ <= (BPF_FP)
 	 *                        |     |
 	 *                        | ... | BPF prog stack
 	 *                        |     |
@@ -189,8 +186,6 @@ static void build_prologue(struct jit_ctx *ctx)
 	/* Save callee-saved register */
 	emit(A64_PUSH(r6, r7, A64_SP), ctx);
 	emit(A64_PUSH(r8, r9, A64_SP), ctx);
-	if (ctx->tmp_used)
-		emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx);
 
 	/* Save fp (x25) and x26. SP requires 16 bytes alignment */
 	emit(A64_PUSH(fp, A64_R(26), A64_SP), ctx);
@@ -210,8 +205,6 @@ static void build_epilogue(struct jit_ctx *ctx)
 	const u8 r8 = bpf2a64[BPF_REG_8];
 	const u8 r9 = bpf2a64[BPF_REG_9];
 	const u8 fp = bpf2a64[BPF_REG_FP];
-	const u8 tmp1 = bpf2a64[TMP_REG_1];
-	const u8 tmp2 = bpf2a64[TMP_REG_2];
 
 	/* We're done with BPF stack */
 	emit(A64_ADD_I(1, A64_SP, A64_SP, STACK_SIZE), ctx);
@@ -220,8 +213,6 @@ static void build_epilogue(struct jit_ctx *ctx)
 	emit(A64_POP(fp, A64_R(26), A64_SP), ctx);
 
 	/* Restore callee-saved register */
-	if (ctx->tmp_used)
-		emit(A64_POP(tmp1, tmp2, A64_SP), ctx);
 	emit(A64_POP(r8, r9, A64_SP), ctx);
 	emit(A64_POP(r6, r7, A64_SP), ctx);
 
@@ -317,7 +308,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 			emit(A64_UDIV(is64, dst, dst, src), ctx);
 			break;
 		case BPF_MOD:
-			ctx->tmp_used = 1;
 			emit(A64_UDIV(is64, tmp, dst, src), ctx);
 			emit(A64_MUL(is64, tmp, tmp, src), ctx);
 			emit(A64_SUB(is64, dst, dst, tmp), ctx);
@@ -390,49 +380,41 @@ emit_bswap_uxt:
 	/* dst = dst OP imm */
 	case BPF_ALU | BPF_ADD | BPF_K:
 	case BPF_ALU64 | BPF_ADD | BPF_K:
-		ctx->tmp_used = 1;
 		emit_a64_mov_i(is64, tmp, imm, ctx);
 		emit(A64_ADD(is64, dst, dst, tmp), ctx);
 		break;
 	case BPF_ALU | BPF_SUB | BPF_K:
 	case BPF_ALU64 | BPF_SUB | BPF_K:
-		ctx->tmp_used = 1;
 		emit_a64_mov_i(is64, tmp, imm, ctx);
 		emit(A64_SUB(is64, dst, dst, tmp), ctx);
 		break;
 	case BPF_ALU | BPF_AND | BPF_K:
 	case BPF_ALU64 | BPF_AND | BPF_K:
-		ctx->tmp_used = 1;
 		emit_a64_mov_i(is64, tmp, imm, ctx);
 		emit(A64_AND(is64, dst, dst, tmp), ctx);
[PATCH v2 net-next] bpf: arm64: remove callee-save registers use for tmp registers
In the current implementation of the ARM64 eBPF JIT, R23 and R24 are used as
tmp registers, which are callee-saved registers. This leads to a variable
size of the JIT prologue and epilogue. The latest blinding constant change
prefers a constant size of prologue and epilogue. AAPCS reserves R9 ~ R15
for temp registers which do not need to be saved/restored during a function
call. So, replace R23 and R24 with R10 and R11, and remove the tmp_used flag
to save 2 instructions for some jited BPF programs.

CC: Daniel Borkmann
Acked-by: Zi Shen Lim
Signed-off-by: Yang Shi
---
Changelog v1 --> v2:
* Updated stack diagram
* Added the comment from Zi for the commit log
* Added Zi's Acked-by

Apply on top of Daniel's blinding constant patchset

 arch/arm64/net/bpf_jit_comp.c | 34 +++++-----------------------------
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index d0d5190..49ba37e 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -51,9 +51,9 @@ static const int bpf2a64[] = {
 	[BPF_REG_9] = A64_R(22),
 	/* read-only frame pointer to access stack */
 	[BPF_REG_FP] = A64_R(25),
-	/* temporary register for internal BPF JIT */
-	[TMP_REG_1] = A64_R(23),
-	[TMP_REG_2] = A64_R(24),
+	/* temporary registers for internal BPF JIT */
+	[TMP_REG_1] = A64_R(10),
+	[TMP_REG_2] = A64_R(11),
 	/* temporary register for blinding constants */
 	[BPF_REG_AX] = A64_R(9),
 };
@@ -61,7 +61,6 @@ static const int bpf2a64[] = {
 struct jit_ctx {
 	const struct bpf_prog *prog;
 	int idx;
-	int tmp_used;
 	int epilogue_offset;
 	int *offset;
 	u32 *image;
@@ -154,8 +153,6 @@ static void build_prologue(struct jit_ctx *ctx)
 	const u8 r8 = bpf2a64[BPF_REG_8];
 	const u8 r9 = bpf2a64[BPF_REG_9];
 	const u8 fp = bpf2a64[BPF_REG_FP];
-	const u8 tmp1 = bpf2a64[TMP_REG_1];
-	const u8 tmp2 = bpf2a64[TMP_REG_2];
 
 	/*
 	 * BPF prog stack layout
@@ -167,7 +164,7 @@ static void build_prologue(struct jit_ctx *ctx)
 	 *                        | ... | callee saved registers
 	 *                        +-----+
 	 *                        |     | x25/x26
-	 * BPF fp register => -80:+-----+ <= (BPF_FP)
+	 * BPF fp register => -64:+-----+ <= (BPF_FP)
 	 *                        |     |
 	 *                        | ... | BPF prog stack
 	 *                        |     |
@@ -189,8 +186,6 @@ static void build_prologue(struct jit_ctx *ctx)
 	/* Save callee-saved register */
 	emit(A64_PUSH(r6, r7, A64_SP), ctx);
 	emit(A64_PUSH(r8, r9, A64_SP), ctx);
-	if (ctx->tmp_used)
-		emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx);
 
 	/* Save fp (x25) and x26. SP requires 16 bytes alignment */
 	emit(A64_PUSH(fp, A64_R(26), A64_SP), ctx);
@@ -210,8 +205,6 @@ static void build_epilogue(struct jit_ctx *ctx)
 	const u8 r8 = bpf2a64[BPF_REG_8];
 	const u8 r9 = bpf2a64[BPF_REG_9];
 	const u8 fp = bpf2a64[BPF_REG_FP];
-	const u8 tmp1 = bpf2a64[TMP_REG_1];
-	const u8 tmp2 = bpf2a64[TMP_REG_2];
 
 	/* We're done with BPF stack */
 	emit(A64_ADD_I(1, A64_SP, A64_SP, STACK_SIZE), ctx);
@@ -220,8 +213,6 @@ static void build_epilogue(struct jit_ctx *ctx)
 	emit(A64_POP(fp, A64_R(26), A64_SP), ctx);
 
 	/* Restore callee-saved register */
-	if (ctx->tmp_used)
-		emit(A64_POP(tmp1, tmp2, A64_SP), ctx);
 	emit(A64_POP(r8, r9, A64_SP), ctx);
 	emit(A64_POP(r6, r7, A64_SP), ctx);
 
@@ -317,7 +308,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 			emit(A64_UDIV(is64, dst, dst, src), ctx);
 			break;
 		case BPF_MOD:
-			ctx->tmp_used = 1;
 			emit(A64_UDIV(is64, tmp, dst, src), ctx);
 			emit(A64_MUL(is64, tmp, tmp, src), ctx);
 			emit(A64_SUB(is64, dst, dst, tmp), ctx);
@@ -390,49 +380,41 @@ emit_bswap_uxt:
 	/* dst = dst OP imm */
 	case BPF_ALU | BPF_ADD | BPF_K:
 	case BPF_ALU64 | BPF_ADD | BPF_K:
-		ctx->tmp_used = 1;
 		emit_a64_mov_i(is64, tmp, imm, ctx);
 		emit(A64_ADD(is64, dst, dst, tmp), ctx);
 		break;
 	case BPF_ALU | BPF_SUB | BPF_K:
 	case BPF_ALU64 | BPF_SUB | BPF_K:
-		ctx->tmp_used = 1;
 		emit_a64_mov_i(is64, tmp, imm, ctx);
 		emit(A64_SUB(is64, dst, dst, tmp), ctx);
 		break;
 	case BPF_ALU | BPF_AND | BPF_K:
 	case BPF_ALU64 | BPF_AND | BPF_K:
-		ctx->tmp_used = 1;
 		emit_a64_mov_i(is64, tmp, imm, ctx);
 		emit(A64_AND(is64, dst, dst, tmp), ctx);
 		break;
 	case BPF_ALU | BPF_OR | BPF_K: