date:20180518

Re: [PATCH] remoteproc: Proxy unvote clk/regs in handover context

2018-05-18 Thread Bjorn Andersson

On Wed 25 Apr 07:50 PDT 2018, Sibi Sankar wrote:

> Introduce interrupt handler for smp2p ready interrupt and
> handle start completion. Remove the proxy votes for clocks
> and regulators in the handover interrupt context. Disable
> wdog and fatal interrupts on remoteproc device stop and
> re-enable them on remoteproc device start.

Can't the enable/disable dance be split out into a separate commit?
Making the introduction of them cleaner in the git history?

> 
> Signed-off-by: Sibi Sankar 
> ---
>  drivers/remoteproc/qcom_q6v5_pil.c | 71 +-
>  1 file changed, 60 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/remoteproc/qcom_q6v5_pil.c 
> b/drivers/remoteproc/qcom_q6v5_pil.c
> index 296eb3f8b551..7e2d04d4f2f0 100644
> --- a/drivers/remoteproc/qcom_q6v5_pil.c
> +++ b/drivers/remoteproc/qcom_q6v5_pil.c
> @@ -143,6 +143,10 @@ struct q6v5 {
>   struct qcom_smem_state *state;
>   unsigned stop_bit;
>  
> + unsigned int handover_interrupt;
> + unsigned int wdog_interrupt;
> + unsigned int fatal_interrupt;

Make these "int", and write "irq" instead of "interrupt".

> +
>   struct clk *active_clks[8];
>   struct clk *proxy_clks[4];
>   int active_clk_count;
> @@ -170,6 +174,7 @@ struct q6v5 {
>   struct qcom_rproc_ssr ssr_subdev;
>   struct qcom_sysmon *sysmon;
>   bool need_mem_protection;
> + bool unvoted_flag;
>   int mpss_perm;
>   int mba_perm;
>   int version;
> @@ -727,6 +732,7 @@ static int q6v5_start(struct rproc *rproc)
>   int xfermemop_ret;
>   int ret;
>  
> + qproc->unvoted_flag = false;
>   ret = q6v5_regulator_enable(qproc, qproc->proxy_regs,
>   qproc->proxy_reg_count);
>   if (ret) {
> @@ -793,9 +799,16 @@ static int q6v5_start(struct rproc *rproc)
>   if (ret)
>   goto reclaim_mpss;
>  
> + enable_irq(qproc->handover_interrupt);
> + enable_irq(qproc->wdog_interrupt);
> + enable_irq(qproc->fatal_interrupt);
> +
>   ret = wait_for_completion_timeout(&qproc->start_done,
> msecs_to_jiffies(5000));
>   if (ret == 0) {
> + disable_irq(qproc->handover_interrupt);
> + disable_irq(qproc->wdog_interrupt);
> + disable_irq(qproc->fatal_interrupt);
>   dev_err(qproc->dev, "start timed out\n");
>   ret = -ETIMEDOUT;
>   goto reclaim_mpss;
> @@ -809,11 +822,6 @@ static int q6v5_start(struct rproc *rproc)
>   "Failed to reclaim mba buffer system may become 
> unstable\n");
>   qproc->running = true;
>  
> - q6v5_clk_disable(qproc->dev, qproc->proxy_clks,
> -  qproc->proxy_clk_count);
> - q6v5_regulator_disable(qproc, qproc->proxy_regs,
> -qproc->proxy_reg_count);
> -
>   return 0;
>  
>  reclaim_mpss:
> @@ -892,6 +900,16 @@ static int q6v5_stop(struct rproc *rproc)
>   WARN_ON(ret);
>  
>   reset_control_assert(qproc->mss_restart);
> + disable_irq(qproc->handover_interrupt);
> + if (!qproc->unvoted_flag) {
> + q6v5_clk_disable(qproc->dev, qproc->proxy_clks,
> +  qproc->proxy_clk_count);
> + q6v5_regulator_disable(qproc, qproc->proxy_regs,
> +qproc->proxy_reg_count);
> + }

Perhaps break this out into a separate function and call it from the two
places?

> + disable_irq(qproc->wdog_interrupt);
> + disable_irq(qproc->fatal_interrupt);

Any particular reason why you didn't group the disable_irq() calls
together? Would look prettier than spreading them on each side of the
resource disable.

> +
>   q6v5_clk_disable(qproc->dev, qproc->active_clks,
>qproc->active_clk_count);
>   q6v5_regulator_disable(qproc, qproc->active_regs,
[..]
> @@ -1184,19 +1221,31 @@ static int q6v5_probe(struct platform_device *pdev)
>  
>   qproc->version = desc->version;
>   qproc->need_mem_protection = desc->need_mem_protection;
> - ret = q6v5_request_irq(qproc, pdev, "wdog", q6v5_wdog_interrupt);
> + ret = q6v5_request_irq(qproc, pdev, "wdog", q6v5_wdog_interrupt,
> +&qproc->wdog_interrupt);

I think it's time to inline this function instead. You can omit the
first error handling and rely on request_irq to fail if you pass it an
invalid irq number.

> + if (ret < 0)
> + goto free_rproc;
> + disable_irq(qproc->wdog_interrupt);

I presume this is to balance the IRQ enable/disable later?

> +

Regards,
Bjorn

Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

2018-05-18 Thread Alexey Brodkin

Hi Russel,

On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote:
> It's necessary.  Take a moment to think carefully about this:
> 
> dma_map_single(, dir)
> 
> dma_sync_single_for_cpu(, dir)
> 
> dma_sync_single_for_device(, dir)
> 
> dma_unmap_single(, dir)
> 
> In the case of a DMA-incoherent architecture, the operations done at each
> stage depend on the direction argument:
> 
> map for_cpu for_device  unmap
> TO_DEV  writeback   nonewriteback   none
> TO_CPU  invalidate  invalidate* invalidate  invalidate*
> BIDIR   writeback   invalidate  writeback   invalidate
> 
> * - only necessary if the CPU speculatively prefetches.

I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even
if CPU doesn't preferch - what if we reuse the same buffer for multiple
reads from DMA device?

> The multiple invalidations for the TO_CPU case handles different
> conditions that can result in data corruption, and for some CPUs, all
> four are necessary.

I would agree that map()/unmap() a quite a special cases and so depending
on direction we need to execute in them either for_cpu() or for_device()
call-backs depending on direction.

As for invalidation in case of for_device(TO_CPU) I still don't see
a rationale behind it. Would be interesting to see a real example where
we benefit from this.

> This is what is implemented for 32-bit ARM, depending on the CPU
> capabilities, as we have DMA incoherent devices and we have CPUs that
> speculatively prefetch data, and so may load data into the caches while
> DMA is in operation.
> 
> 
> Things get more interesting if the implementation behind the DMA API has
> to copy data between the buffer supplied to the mapping and some DMA
> accessible buffer:
> 
> map for_cpu for_device  unmap
> TO_DEV  copy to dma nonecopy to dma none
> TO_CPU  nonecopy to cpu nonecopy to cpu
> BIDIR   copy to dma copy to cpu copy to dma copy to cpu
> 
> So, in both cases, the value of the direction argument defines what you
> need to do in each call.

Interesting enough in your seond table (which describes more complicated
case indeed) you set "none" for for_device(TO_CPU) which looks logical
to me.

So IMHO that's what make sense:
>8-
map for_cpu for_device  unmap
TO_DEV  writeback   nonewriteback   none
TO_CPU  noneinvalidate  noneinvalidate*
BIDIR   writeback   invalidate  writeback   invalidate*
>8-

* is the case for prefetching CPU.

-Alexey

drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c:1001:52-53: asic_setup: first occurrence line 1004, second occurrence line 1031 (fwd)

2018-05-18 Thread Julia Lawall

The structure has two initializations of the field asic_setup.

julia

-- Forwarded message --
Date: Sat, 19 May 2018 02:35:04 +0800
From: kbuild test robot 
To: kbu...@01.org
Cc: Julia Lawall 
Subject: drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c:1001:52-53:
asic_setup: first occurrence line 1004, second occurrence line 1031

CC: kbuild-...@01.org
CC: linux-kernel@vger.kernel.org
TO: Rex Zhu 
CC: Alex Deucher 

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   3acf4e395260e3bd30a6fa29ba7eada4bf7566ca
commit: c425688520990d6cec769faaa97f4af45d361fd1 drm/amd/pp: Replace rv_* with 
smu10_*
date:   9 weeks ago
:: branch date: 20 hours ago
:: commit date: 9 weeks ago

>> drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c:1001:52-53: asic_setup: 
>> first occurrence line 1004, second occurrence line 1031

# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c425688520990d6cec769faaa97f4af45d361fd1
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git remote update linus
git checkout c425688520990d6cec769faaa97f4af45d361fd1
vim +1001 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c

b01a4f48 drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.cEric Huang 
2018-02-06  1000
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06 @1001  static const struct pp_hwmgr_func smu10_hwmgr_funcs = {
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1002  .backend_init = smu10_hwmgr_backend_init,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1003  .backend_fini = smu10_hwmgr_backend_fini,
a960d61c drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.cRex Zhu
2017-05-11 @1004  .asic_setup = NULL,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1005  .apply_state_adjust_rules = 
smu10_apply_state_adjust_rules,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1006  .force_dpm_level = smu10_dpm_force_dpm_level,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1007  .get_power_state_size = smu10_get_power_state_size,
a960d61c drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.cRex Zhu
2017-05-11  1008  .powerdown_uvd = NULL,
a960d61c drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.cRex Zhu
2017-05-11  1009  .powergate_uvd = NULL,
a960d61c drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.cRex Zhu
2017-05-11  1010  .powergate_vce = NULL,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1011  .get_mclk = smu10_dpm_get_mclk,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1012  .get_sclk = smu10_dpm_get_sclk,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1013  .patch_boot_state = smu10_dpm_patch_boot_state,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1014  .get_pp_table_entry = smu10_dpm_get_pp_table_entry,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1015  .get_num_of_pp_table_entries = 
smu10_dpm_get_num_of_pp_table_entries,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1016  .set_cpu_power_state = smu10_set_cpu_power_state,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1017  .store_cc6_data = smu10_store_cc6_data,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1018  .force_clock_level = smu10_force_clock_level,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1019  .print_clock_levels = smu10_print_clock_levels,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1020  .get_dal_power_level = smu10_get_dal_power_level,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1021  .get_performance_level = smu10_get_performance_level,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1022  .get_current_shallow_sleep_clocks = 
smu10_get_current_shallow_sleep_clocks,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1023  .get_clock_by_type_with_latency = 
smu10_get_clock_by_type_with_latency,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1024  .get_clock_by_type_with_voltage = 
smu10_get_clock_by_type_with_voltage,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018-03-06  1025  .get_max_high_clocks = smu10_get_max_high_clocks,
c4256885 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c Rex Zhu
2018

[PATCH] crypto: chtls - fix a missing-check bug

2018-05-18 Thread Wenwen Wang

In do_chtls_setsockopt(), the tls crypto info is first copied from the
poiner 'optval' in userspace and saved to 'tmp_crypto_info'. Then the
'version' of the crypto info is checked. If the version is not as expected,
i.e., TLS_1_2_VERSION, error code -ENOTSUPP is returned to indicate that
the provided crypto info is not supported yet. Then, the 'cipher_type'
field of the 'tmp_crypto_info' is also checked to see if it is
TLS_CIPHER_AES_GCM_128. If it is, the whole struct of
tls12_crypto_info_aes_gcm_128 is copied from the pointer 'optval' and then
the function chtls_setkey() is invoked to set the key.

Given that the 'optval' pointer resides in userspace, a malicious userspace
process can race to change the data pointed by 'optval' between the two
copies. For example, a user can provide a crypto info with TLS_1_2_VERSION
and TLS_CIPHER_AES_GCM_128. After the first copy, the user can modify the
'version' and the 'cipher_type' fields to any versions and/or cipher types
that are not allowed. This way, the user can bypass the checks, inject
bad data to the kernel, cause chtls_setkey() to set a wrong key or other
issues.

This patch reuses the data copied in the first try so as to ensure these
checks will not be bypassed.

Signed-off-by: Wenwen Wang 
---
 drivers/crypto/chelsio/chtls/chtls_main.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/chelsio/chtls/chtls_main.c 
b/drivers/crypto/chelsio/chtls/chtls_main.c
index 007c45c..39aab05 100644
--- a/drivers/crypto/chelsio/chtls/chtls_main.c
+++ b/drivers/crypto/chelsio/chtls/chtls_main.c
@@ -491,9 +491,13 @@ static int do_chtls_setsockopt(struct sock *sk, int 
optname,
 
switch (tmp_crypto_info.cipher_type) {
case TLS_CIPHER_AES_GCM_128: {
-   rc = copy_from_user(crypto_info, optval,
-   sizeof(struct
-  tls12_crypto_info_aes_gcm_128));
+   /* Obtain version and type from previous copy */
+   crypto_info[0] = tmp_crypto_info;
+   /* Now copy the following data */
+   rc = copy_from_user((char *)crypto_info + sizeof(*crypto_info),
+   optval + sizeof(*crypto_info),
+   sizeof(struct tls12_crypto_info_aes_gcm_128)
+   - sizeof(*crypto_info));
 
if (rc) {
rc = -EFAULT;
-- 
2.7.4

Re: [PATCH v4 3/3] fs: Add aio iopriority support for block_dev

2018-05-18 Thread Adam Manzanares



On 5/18/18 9:06 AM, Christoph Hellwig wrote:
> Looks fine, although I'd split it into a aio and block_dev patch.
> 
> Also please wire this up for the fs/iomap.c direct I/O code, it should
> be essentially the same sniplet as in the block_dev.c code.
> 

Will do.

Re: [PATCH v4 2/3] fs: Convert kiocb rw_hint from enum to u16

2018-05-18 Thread Adam Manzanares



On 5/18/18 9:05 AM, Christoph Hellwig wrote:
>> +/* ki_hint changed from enum to u16, make sure rw_hint fits into u16 */
> 
> I don't think this comment is very useful.
> 
>> +static inline u16 ki_hint_valid(enum rw_hint hint)
> 
> I'd call this ki_hint_validate.
> 
>> +{
>> +if (hint > MAX_KI_HINT)
>> +return 0;
>> +
>> +return hint;
> 
> Nit: kill the empty line.
> 

I'll clean this up in the next revision.

Re: [PATCH v7 2/2] PCI: mediatek: Using chained IRQ to setup IRQ handle

2018-05-18 Thread Bjorn Helgaas

On Fri, May 04, 2018 at 01:47:33PM +0800, honghui.zh...@mediatek.com wrote:
> From: Honghui Zhang 
> 
> Using irq_chip solution to setup IRQs in order to consist
> with IRQ framework.
> 
> Signed-off-by: Honghui Zhang 
> Acked-by: Ryder Lee 
> ---
>  drivers/pci/host/pcie-mediatek.c | 206 
> ++-
>  1 file changed, 115 insertions(+), 91 deletions(-)
> 
> diff --git a/drivers/pci/host/pcie-mediatek.c 
> b/drivers/pci/host/pcie-mediatek.c
> index c3dc549..dabf1086 100644
> --- a/drivers/pci/host/pcie-mediatek.c
> +++ b/drivers/pci/host/pcie-mediatek.c
> ...

> -static int mtk_pcie_msi_map(struct irq_domain *domain, unsigned int irq,
> - irq_hw_number_t hwirq)
> +static struct msi_domain_info mtk_msi_domain_info = {

I think this patch should be amended to include this:

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 0d0177ce436c..368b70d9371b 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -193,7 +193,7 @@ config PCIE_MEDIATEK
bool "MediaTek PCIe controller"
depends on (ARM || ARM64) && (ARCH_MEDIATEK || COMPILE_TEST)
depends on OF
-   depends on PCI
+   depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
help
  Say Y here if you want to enable PCIe controller support on


Lorenzo, if you want to fold that in and update your branch, I can pull it.
If not, I can add a patch on top, which should only break compile-testing
bisection.

Re: [PATCH] mtd: spi-nor: add support for Microchip 25LC256

2018-05-18 Thread Marek Vasut

On 05/18/2018 03:00 PM, Radu Pirea wrote:
> 
> 
> On 05/18/2018 01:03 PM, Marek Vasut wrote:
>> On 05/18/2018 11:50 AM, Radu Pirea wrote:
>>>
>>>
>>> On 05/16/2018 04:47 PM, Marek Vasut wrote:
 On 05/16/2018 12:05 PM, Radu Pirea wrote:
> On Wed, 2018-05-16 at 00:17 +0200, Marek Vasut wrote:
>> On 05/15/2018 06:22 PM, Radu Pirea wrote:
>>> On Fri, 2018-05-04 at 20:40 +0200, Boris Brezillon wrote:
 On Fri, 4 May 2018 18:54:04 +0300
 Radu Pirea  wrote:

> Added geometry description for Microchip 25LC256 memory.

 Same as for the dataflash stuff you posted a few weeks ago: I
 don't
 think this device belongs in the SPI NOR framework.
>>>
>>> Hi Boris,
>>>
>>> 25lc256 memory is similar with mr25h256, the only difference is the
>>> page size(64 vs 256). Because mr25h256 is already in SPI NOR
>>> framework
>>> I added here 25lc256.
>>
>> I think I must be reading the wrong datasheet, but can you show me
>> how
>> does it support things like READID opcode ?
>>
> Hi Marek,
>
> I read the datasheet for 25lc256 and for mr25h256 and none of them
> supports READID. Is this required for a chip to be included in spi-nor
> framework? I just followed the mr25h256 as an example.

 So I thought until you pointed out the MR25 devices.

 Does the 25LC device need erase or not ? I think the MR25s didn't,
 but I
 might be wrong.
>>>
>>> You are right. MR25s does not need erase and the same thing is true for
>>> 25LC.
>>
>> Oh. And the command set is (except for readid) comparable to SPI NOR ?
>>
 Maybe the framework could support the 25LC afterall.

>>> Yes, this was my impression too.
>>
>> I am still thinking that the AT25 driver might be a better fit for such
>> devices. Can you take a look ?
> 
> I tested the memory with at25 driver and it works with no line changed
> in driver. :)

So AT25 it is then ?

-- 
Best regards,
Marek Vasut

Re: [PATCH] audit: use existing session info function

2018-05-18 Thread Paul Moore

On Thu, May 17, 2018 at 10:01 PM, Richard Guy Briggs  wrote:
> Use the existing audit_log_session_info() function rather than
> hardcoding its functionality.
>
> Signed-off-by: Richard Guy Briggs 
> ---
>  kernel/auditfilter.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Merged into audit/next, thanks.

> diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
> index d7a807e..9e87377 100644
> --- a/kernel/auditfilter.c
> +++ b/kernel/auditfilter.c
> @@ -1089,8 +1089,6 @@ static void audit_list_rules(int seq, struct 
> sk_buff_head *q)
>  static void audit_log_rule_change(char *action, struct audit_krule *rule, 
> int res)
>  {
> struct audit_buffer *ab;
> -   uid_t loginuid = from_kuid(&init_user_ns, 
> audit_get_loginuid(current));
> -   unsigned int sessionid = audit_get_sessionid(current);
>
> if (!audit_enabled)
> return;
> @@ -1098,7 +1096,7 @@ static void audit_log_rule_change(char *action, struct 
> audit_krule *rule, int re
> ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> if (!ab)
> return;
> -   audit_log_format(ab, "auid=%u ses=%u" ,loginuid, sessionid);
> +   audit_log_session_info(ab);
> audit_log_task_context(ab);
> audit_log_format(ab, " op=%s", action);
> audit_log_key(ab, rule->filterkey);
> --
> 1.8.3.1
>



-- 
paul moore
www.paul-moore.com

Re: [PATCH INTERNAL 3/3] PCI: iproc: Disable MSI parsing in certain PAXC blocks

2018-05-18 Thread Ray Jui


Hi Lorenzo,

On 5/18/2018 6:56 AM, Lorenzo Pieralisi wrote:

On Fri, May 18, 2018 at 02:53:41PM +0530, p...@codeaurora.org wrote:

On 2018-05-17 22:51, Ray Jui wrote:

The internal MSI parsing logic in certain revisions of PAXC root
complexes does not work properly and can casue corruptions on the
writes. They need to be disabled

Signed-off-by: Ray Jui 
Reviewed-by: Scott Branden 
---
drivers/pci/host/pcie-iproc.c | 34 --
1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index 3c76c5f..b906d80 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -1144,10 +1144,22 @@ static int iproc_pcie_paxb_v2_msi_steer(struct
iproc_pcie *pcie, u64 msi_addr)
return ret;
}

-static void iproc_pcie_paxc_v2_msi_steer(struct iproc_pcie *pcie, u64
msi_addr)
+static void iproc_pcie_paxc_v2_msi_steer(struct iproc_pcie *pcie, u64
msi_addr,
+bool enable)
{
u32 val;

+   if (!enable) {
+   /*
+* Disable PAXC MSI steering. All write transfers will be
+* treated as non-MSI transfers
+*/
+   val = iproc_pcie_read_reg(pcie, IPROC_PCIE_MSI_EN_CFG);
+   val &= ~MSI_ENABLE_CFG;
+   iproc_pcie_write_reg(pcie, IPROC_PCIE_MSI_EN_CFG, val);
+   return;
+   }
+
/*
 * Program bits [43:13] of address of GITS_TRANSLATER register into
 * bits [30:0] of the MSI base address register.  In fact, in all iProc
@@ -1201,7 +1213,7 @@ static int iproc_pcie_msi_steer(struct iproc_pcie
*pcie,
return ret;
break;
case IPROC_PCIE_PAXC_V2:
-   iproc_pcie_paxc_v2_msi_steer(pcie, msi_addr);
+   iproc_pcie_paxc_v2_msi_steer(pcie, msi_addr, true);


Are you calling iproc_pcie_paxc_v2_msi_steer() anywhere else with 'false' ?
I see its getting called only from one place in current code
iproc_pcie_msi_steer().


It is called below with the false field to disable MSIs. That's probably
one reason more to create a function to enable/disable MSIs instead of
adding a parameter to iproc_pcie_paxc_v2_msi_steer().


Correct. Thanks for helping to explain.



Which brings me to the question: what happens to those MSIs writes
when MSIs are disabled according to this patch ? Are they terminated
at the root complex ?


Note the PAXC block parses MSI writes from our internally connected 
endpoints (i.e., an embedded network processor). The PAXC block examines 
these MSI writes and modifies the memory attributes (to DEVICE) of these 
data and then send them out to the AXI bus. These MSI writes will then 
be forwarded to the GIC (e.g., GICv2m, GICv3-ITS from ARM) for further 
processing. This is saying, PAXC itself does not process these MSI 
writes. They are processed by the GIC and associated interrupt will be 
generated form the GIC. PAXC's job is to parse and tag them properly so 
these MSI writes can reach the GIC, and at the same, reach the GIC at 
the right order.


On some of these problematic PAXCs, we are being forced to disable this 
PAXC internal parsing logic. In this case, we set up static mapping with 
the IOMMU to modify the memory attributes of these MSI writes. These MSI 
writes are always designated towards a specific memory address (e.g., on 
the GICv3 based system, it's the address of the translator register), 
and that's why static mapping table can be set up to work around this.




Lorenzo


break;
default:
return -EINVAL;
@@ -1427,6 +1439,24 @@ int iproc_pcie_remove(struct iproc_pcie *pcie)
}
EXPORT_SYMBOL(iproc_pcie_remove);

+/*
+ * The MSI parsing logic in certain revisions of Broadcom PAXC based root
+ * complex does not work and needs to be disabled
+ */
+static void quirk_paxc_disable_msi_parsing(struct pci_dev *pdev)
+{
+   struct iproc_pcie *pcie = iproc_data(pdev->bus);
+
+   if (pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+   iproc_pcie_paxc_v2_msi_steer(pcie, 0, false);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16f0,
+   quirk_paxc_disable_msi_parsing);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd802,
+   quirk_paxc_disable_msi_parsing);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd804,
+   quirk_paxc_disable_msi_parsing);
+
MODULE_AUTHOR("Ray Jui ");
MODULE_DESCRIPTION("Broadcom iPROC PCIe common driver");
MODULE_LICENSE("GPL v2");

Re: [PATCH] isdn: eicon: fix a missing-check bug

2018-05-18 Thread Wenwen Wang

Thanks for your suggestion, David! I will revise the patch and resubmit it.

Wenwen

On Fri, May 11, 2018 at 2:50 PM, David Miller  wrote:
> From: Wenwen Wang 
> Date: Sat,  5 May 2018 14:32:46 -0500
>
>> To avoid such issues, this patch adds a check after the second copy in the
>> function diva_xdi_write(). If the adapter number is not equal to the one
>> obtained in the first copy, (-4) will be returned to divas_write(), which
>> will then return an error code -EINVAL.
>
> Better fix is to copy the msg header once into an on-stack buffer supplied
> by diva_write() to diva_xdi_open_adapter(), which is then passed on to
> diva_xdi_write() with an adjusted src pointer and length.

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread hpa

On May 18, 2018 12:36:20 PM PDT, Nadav Amit  wrote:
>h...@zytor.com wrote:
>
>> On May 18, 2018 12:21:00 PM PDT, Linus Torvalds
> wrote:
>>> On Fri, May 18, 2018 at 12:18 PM Nadav Amit 
>wrote:
>>> 
 Gnu ASM manual says: "Each time you run as it assembles exactly one
>>> source
 program. The source program is made up of one or more files.”
>>> 
>>> Ok, that counts as documentation, although it's confusing as hell.
>Is
>>> it
>>> one source program or multiple files again? I see what they are
>trying
>>> to
>>> say, but it really could be phrased more clearly ;)
>>> 
>>>Linus
>> 
>> At least we have a solution!
>
>Thanks for all your help. I will try to do it over over the weekend.
>
>Funny, I see that this “trick” (-Wa,[filename.s]) is already being used
>to
>include arch/x86/boot/code16gcc.h so I feel “safer” to use it.
>
>Regards,
>Nadav

Oh. I don't remember when we did that... might even be my doing and then I just 
forgot about it :)
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [PATCH v6 17/28] x86/asm: use SYM_INNER_LABEL instead of GLOBAL

2018-05-18 Thread Andy Lutomirski

On Fri, May 18, 2018 at 2:17 AM Jiri Slaby  wrote:

> GLOBAL had several meanings and is going away. In this patch, convert
> all the inner function labels marked with GLOBAL to use SYM_INNER_LABEL
> instead.

> Note that retint_user needs not be global, perhaps since commit
> 2ec67971facc ("x86/entry/64/compat: Remove most of the fast system call
> machinery"), where entry_64_compat's caller was removed. So mark the
> label as LOCAL.

> -GLOBAL(entry_SYSCALL_64_after_hwframe)
> +SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)

I've missed all the context here.   I agree that GLOBAL is misleading, and
"inner label" is nice.  But this is a rather wordy macro.  Would:

INNER_LABEL_GLOBAL(name)

be better?  (With just INNER_LABEL(name) for the local version?)

Re: [PATCH v4 2/2] vfio/mdev: Re-order sysfs attribute creation

2018-05-18 Thread Kirti Wankhede



On 5/19/2018 12:40 AM, Alex Williamson wrote:
> There exists a gap at the end of mdev_device_create() where the device
> is visible to userspace, but we're not yet ready to handle removal, as
> triggered through the 'remove' attribute.  We handle this properly in
> mdev_device_remove() with an -EAGAIN return, but we can marginally
> reduce this gap by adding this attribute as a final step of our sysfs
> setup.
> 
> Signed-off-by: Alex Williamson 

Looks good.

Reviewed by: Kirti Wankhede 

> ---
>  drivers/vfio/mdev/mdev_sysfs.c |   14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
> index 802df210929b..249472f05509 100644
> --- a/drivers/vfio/mdev/mdev_sysfs.c
> +++ b/drivers/vfio/mdev/mdev_sysfs.c
> @@ -257,24 +257,24 @@ int  mdev_create_sysfs_files(struct device *dev, struct 
> mdev_type *type)
>  {
>   int ret;
>  
> - ret = sysfs_create_files(&dev->kobj, mdev_device_attrs);
> - if (ret)
> - return ret;
> -
>   ret = sysfs_create_link(type->devices_kobj, &dev->kobj, dev_name(dev));
>   if (ret)
> - goto device_link_failed;
> + return ret;
>  
>   ret = sysfs_create_link(&dev->kobj, &type->kobj, "mdev_type");
>   if (ret)
>   goto type_link_failed;
>  
> + ret = sysfs_create_files(&dev->kobj, mdev_device_attrs);
> + if (ret)
> + goto create_files_failed;
> +
>   return ret;
>  
> +create_files_failed:
> + sysfs_remove_link(&dev->kobj, "mdev_type");
>  type_link_failed:
>   sysfs_remove_link(type->devices_kobj, dev_name(dev));
> -device_link_failed:
> - sysfs_remove_files(&dev->kobj, mdev_device_attrs);
>   return ret;
>  }
>  
>

Re: [PATCH v4 0/2] vfio/mdev: Device namespace protection

2018-05-18 Thread Kirti Wankhede



On 5/19/2018 12:40 AM, Alex Williamson wrote:
> v4: Fix the 'create' racing 'remove' gap noted by Kirti by moving
> removal from mdev_list to mdev_device_release().  Fix missing
> mdev_put_parent() cases in mdev_device_create(), also noted
> by Kirti.  Added documention update regarding serialization as
> noted by Cornelia.  Added additional commit log comment about
> -EAGAIN vs -ENODEV for 'remove' racing 'create'.  Added second
> patch to re-order sysfs attributes, with this my targeted
> scripts can no longer hit the gap where -EAGAIN is regurned.
> BTW, the gap where the current code returns -ENODEV in this
> race condition is about 50% easier to hit than it exists in
> this series with patch 1 alone.
> 

Thanks for fixing this. This patch set looks good to me.

Thanks,
Kirti

> Thanks,
> Alex
> 
> ---
> 
> Alex Williamson (2):
>   vfio/mdev: Check globally for duplicate devices
>   vfio/mdev: Re-order sysfs attribute creation
> 
> 
>  Documentation/vfio-mediated-device.txt |5 ++
>  drivers/vfio/mdev/mdev_core.c  |  102 
> +++-
>  drivers/vfio/mdev/mdev_private.h   |2 -
>  drivers/vfio/mdev/mdev_sysfs.c |   14 ++--
>  4 files changed, 49 insertions(+), 74 deletions(-)
>

Re: [PATCH v4 1/2] vfio/mdev: Check globally for duplicate devices

2018-05-18 Thread Kirti Wankhede



On 5/19/2018 12:40 AM, Alex Williamson wrote:
> When we create an mdev device, we check for duplicates against the
> parent device and return -EEXIST if found, but the mdev device
> namespace is global since we'll link all devices from the bus.  We do
> catch this later in sysfs_do_create_link_sd() to return -EEXIST, but
> with it comes a kernel warning and stack trace for trying to create
> duplicate sysfs links, which makes it an undesirable response.
> 
> Therefore we should really be looking for duplicates across all mdev
> parent devices, or as implemented here, against our mdev device list.
> Using mdev_list to prevent duplicates means that we can remove
> mdev_parent.lock, but in order not to serialize mdev device creation
> and removal globally, we add mdev_device.active which allows UUIDs to
> be reserved such that we can drop the mdev_list_lock before the mdev
> device is fully in place.
> 
> Two behavioral notes; first, mdev_parent.lock had the side-effect of
> serializing mdev create and remove ops per parent device.  This was
> an implementation detail, not an intentional guarantee provided to
> the mdev vendor drivers.  Vendor drivers can trivially provide this
> serialization internally if necessary.  Second, review comments note
> the new -EAGAIN behavior when the device, and in particular the remove
> attribute, becomes visible in sysfs.  If a remove is triggered prior
> to completion of mdev_device_create() the user will see a -EAGAIN
> error.  While the errno is different, receiving an error during this
> period is not, the previous implementation returned -ENODEV for the
> same condition.  Furthermore, the consistency to the user is improved
> in the case where mdev_device_remove_ops() returns error.  Previously
> concurrent calls to mdev_device_remove() could see the device
> disappear with -ENODEV and return in the case of error.  Now a user
> would see -EAGAIN while the device is in this transitory state.
> 
> Signed-off-by: Alex Williamson 

Looks good to me.

Reviewed by: Kirti Wankhede 

> ---
>  Documentation/vfio-mediated-device.txt |5 ++
>  drivers/vfio/mdev/mdev_core.c  |  102 
> +++-
>  drivers/vfio/mdev/mdev_private.h   |2 -
>  3 files changed, 42 insertions(+), 67 deletions(-)
> 
> diff --git a/Documentation/vfio-mediated-device.txt 
> b/Documentation/vfio-mediated-device.txt
> index 1b3950346532..c3f69bcaf96e 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -145,6 +145,11 @@ The functions in the mdev_parent_ops structure are as 
> follows:
>  * create: allocate basic resources in a driver for a mediated device
>  * remove: free resources in a driver when a mediated device is destroyed
>  
> +(Note that mdev-core provides no implicit serialization of create/remove
> +callbacks per mdev parent device, per mdev type, or any other categorization.
> +Vendor drivers are expected to be fully asynchronous in this respect or
> +provide their own internal resource protection.)
> +
>  The callbacks in the mdev_parent_ops structure are as follows:
>  
>  * open: open callback of mediated device
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index 126991046eb7..0212f0ee8aea 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev)
>  }
>  EXPORT_SYMBOL(mdev_uuid);
>  
> -static int _find_mdev_device(struct device *dev, void *data)
> -{
> - struct mdev_device *mdev;
> -
> - if (!dev_is_mdev(dev))
> - return 0;
> -
> - mdev = to_mdev_device(dev);
> -
> - if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0)
> - return 1;
> -
> - return 0;
> -}
> -
> -static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid)
> -{
> - struct device *dev;
> -
> - dev = device_find_child(parent->dev, &uuid, _find_mdev_device);
> - if (dev) {
> - put_device(dev);
> - return true;
> - }
> -
> - return false;
> -}
> -
>  /* Should be called holding parent_list_lock */
>  static struct mdev_parent *__find_parent_device(struct device *dev)
>  {
> @@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const struct 
> mdev_parent_ops *ops)
>   }
>  
>   kref_init(&parent->ref);
> - mutex_init(&parent->lock);
>  
>   parent->dev = dev;
>   parent->ops = ops;
> @@ -297,6 +268,10 @@ static void mdev_device_release(struct device *dev)
>  {
>   struct mdev_device *mdev = to_mdev_device(dev);
>  
> + mutex_lock(&mdev_list_lock);
> + list_del(&mdev->next);
> + mutex_unlock(&mdev_list_lock);
> +
>   dev_dbg(&mdev->dev, "MDEV: destroying\n");
>   kfree(mdev);
>  }
> @@ -304,7 +279,7 @@ static void mdev_device_release(struct device *dev)
>  int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le 
> uuid)
>  {
>   int ret;
> -

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Nadav Amit

h...@zytor.com wrote:

> On May 18, 2018 12:21:00 PM PDT, Linus Torvalds 
>  wrote:
>> On Fri, May 18, 2018 at 12:18 PM Nadav Amit  wrote:
>> 
>>> Gnu ASM manual says: "Each time you run as it assembles exactly one
>> source
>>> program. The source program is made up of one or more files.”
>> 
>> Ok, that counts as documentation, although it's confusing as hell. Is
>> it
>> one source program or multiple files again? I see what they are trying
>> to
>> say, but it really could be phrased more clearly ;)
>> 
>>Linus
> 
> At least we have a solution!

Thanks for all your help. I will try to do it over over the weekend.

Funny, I see that this “trick” (-Wa,[filename.s]) is already being used to
include arch/x86/boot/code16gcc.h so I feel “safer” to use it.

Regards,
Nadav

Re: [PATCH] perf tests: Fix regex for record+probe_libc_inet_pton.sh

2018-05-18 Thread Arnaldo Carvalho de Melo

Em Fri, May 18, 2018 at 04:29:16PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, May 18, 2018 at 04:21:02PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Here it went from failing with:
> > 
> > [root@seventh ~]# perf test -v pton
> > 64: probe libc's inet_pton & backtrace it with ping   :
> > --- start ---
> > test child forked, pid 22590
> > ping 22607 [001] 12782.425689: probe_libc:inet_pton: (7f8686da4e40)
> > 7f8686da4e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> > FAIL: expected backtrace entry 1 
> > ".*inet_pton[[:space:]]\(/usr/lib64/libc-2.26.so|inlined\)$" got 
> > "7f8686da4e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)"
> > test child finished with -1
> >  end 
> > probe libc's inet_pton & backtrace it with ping: FAILED!
> > [root@seventh ~]# 
> > 
> > To failing with:
> > 
> > [root@seventh ~]# perf test -v pton
> > 64: probe libc's inet_pton & backtrace it with ping   :
> > --- start ---
> > test child forked, pid 28954
> > ping 28971 [002] 14277.711200: probe_libc:inet_pton: (7fc9d66e3e40)
> > 7fc9d66e3e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> > 7fc9d66b02b4 getaddrinfo+0x124 (/usr/lib64/libc-2.26.so)
> > 56075cb98f40 [unknown] (/usr/bin/ping)
> > FAIL: expected backtrace entry 3 
> > ".*\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$" got "56075cb98f40 
> > [unknown] (/usr/bin/ping)"
> > test child finished with -1
> >  end 
> > probe libc's inet_pton & backtrace it with ping: FAILED!
> > [root@seventh ~]# 
> > 
> > Trying to figure this out...
> 
> [root@seventh perf]# perf script
> ping 29170 [001] 14644.810782: probe_libc:inet_pton: (7f0bac71be40)
> 7f0bac71be40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> 7f0bac6e82b4 getaddrinfo+0x124 (/usr/lib64/libc-2.26.so)
> 5585e10adf40 [unknown] (/usr/bin/ping)
> 
> [root@seventh perf]# perf report --mmaps
> #  pid  tid ppid  comm
>  00   -1 |swapper
>  2917029170   -1 |ping
> 5585e10ab000-5585e12dd000 r-xp  
> 2771393 /usr/bin/ping
> 7f0babfcf000-7f0bac1ed000 r-xp  
> 2762913 /usr/lib64/libpthread-2.26.so
> 7f0bac1ed000-7f0bac3f1000 r-xp  
> 2753363 /usr/lib64/libdl-2.26.so
> 7f0bac3f1000-7f0bac608000 r-xp  
> 2760078 /usr/lib64/libz.so.1.2.11
> 7f0bac608000-7f0bac9be000 r-xp  
> 2753359 /usr/lib64/libc-2.26.so
> 7f0bac9be000-7f0bacd09000 r-xp  
> 2762901 /usr/lib64/libm-2.26.so
> 7f0bacd09000-7f0bacf2 r-xp  
> 2762915 /usr/lib64/libresolv-2.26.so
> 7f0bacf2-7f0bad3a8000 r-xp  
> 2764213 /usr/lib64/libcrypto.so.1.1.0h
> 7f0bad3a8000-7f0bad5dc000 r-xp  
> 2761070 /usr/lib64/libidn.so.11.6.18
> 7f0bad5dc000-7f0bad7e1000 r-xp  
> 2760189 /usr/lib64/libcap.so.2.25
> 7f0bad7e1000-7f0bada08000 r-xp  
> 2753353 /usr/lib64/ld-2.26.so
> 7ffe23d99000-7ffe23d9b000 r-xp  0 
> [vdso]
> [root@seventh perf]# 
> 
> So it is there, but I don't have debuginfo for ping, lets see if I add
> it...
> 
> [root@seventh perf]# rpm -qf `which ping`
> iputils-20161105-7.fc27.x86_64
> [root@seventh perf]# 
> [root@seventh perf]# dnf debuginfo-install iputils
> 
> [root@seventh perf]# perf script
> ping 29170 [001] 14644.810782: probe_libc:inet_pton: (7f0bac71be40)
> 7f0bac71be40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> 7f0bac6e82b4 getaddrinfo+0x124 (/usr/lib64/libc-2.26.so)
> 5585e10adf40 main+0x880 (/usr/bin/ping)
> 
> [root@seventh perf]# 
> 
> So we need to check if debuginfo is available, and if so, expect it to
> resolve that entry to main, if not, expect it to _not_ resolve and have
> [unknown]  instead...

But tis is not related to your recent patches, so I'm folding this last
one with the one adding the offsets and go on from there...

- Arnaldo

Re: [PATCH] perf tests: Fix regex for record+probe_libc_inet_pton.sh

2018-05-18 Thread Arnaldo Carvalho de Melo

Em Fri, May 18, 2018 at 04:21:02PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, May 18, 2018 at 12:54:17PM +0530, Sandipan Das escreveu:
> > This test currently fails because the regular expressions for
> > matching the output of perf script do not consider the symbol
> > offsets to be part of the output.
> > 
> > The symbol offsets are seen because of the default behaviour
> > introduced by commit 4140d2ea74b3 ("perf script: Show symbol
> > offsets by default").
> > 
> > Before applying this patch:
> > 
> >   # perf test -v "probe libc's inet_pton & backtrace it with ping"
> > 
> >   62: probe libc's inet_pton & backtrace it with ping   :
> >   --- start ---
> >   test child forked, pid 30389
> >   ping 30406 [002] 307144.280983: probe_libc:inet_pton: (7f4117adf220)
> >   7f4117adf220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)
> >   FAIL: expected backtrace entry 1 
> > ".*inet_pton[[:space:]]\(/usr/lib64/libc-2.25.so|inlined\)$" got 
> > "7f4117adf220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)"
> >   test child finished with -1
> >    end 
> >   probe libc's inet_pton & backtrace it with ping: FAILED!
> > 
> > After applying this patch:
> > 
> >   # perf test -v "probe libc's inet_pton & backtrace it with ping"
> > 
> >   62: probe libc's inet_pton & backtrace it with ping   :
> >   --- start ---
> >   test child forked, pid 30539
> >   ping 30556 [003] 307254.313217: probe_libc:inet_pton: (7fe19ab10220)
> >   7fe19ab10220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)
> >   7fe19aad5ebd getaddrinfo+0x11d (/usr/lib64/libc-2.25.so)
> >   56351e3c1c71 main+0x891 (/usr/bin/ping)
> >   test child finished with 0
> >    end 
> >   probe libc's inet_pton & backtrace it with ping: Ok
> 
> Here it went from failing with:
> 
> [root@seventh ~]# perf test -v pton
> 64: probe libc's inet_pton & backtrace it with ping   :
> --- start ---
> test child forked, pid 22590
> ping 22607 [001] 12782.425689: probe_libc:inet_pton: (7f8686da4e40)
> 7f8686da4e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> FAIL: expected backtrace entry 1 
> ".*inet_pton[[:space:]]\(/usr/lib64/libc-2.26.so|inlined\)$" got 
> "7f8686da4e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)"
> test child finished with -1
>  end 
> probe libc's inet_pton & backtrace it with ping: FAILED!
> [root@seventh ~]# 
> 
> To failing with:
> 
> [root@seventh ~]# perf test -v pton
> 64: probe libc's inet_pton & backtrace it with ping   :
> --- start ---
> test child forked, pid 28954
> ping 28971 [002] 14277.711200: probe_libc:inet_pton: (7fc9d66e3e40)
> 7fc9d66e3e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> 7fc9d66b02b4 getaddrinfo+0x124 (/usr/lib64/libc-2.26.so)
> 56075cb98f40 [unknown] (/usr/bin/ping)
> FAIL: expected backtrace entry 3 
> ".*\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$" got "56075cb98f40 
> [unknown] (/usr/bin/ping)"
> test child finished with -1
>  end 
> probe libc's inet_pton & backtrace it with ping: FAILED!
> [root@seventh ~]# 
> 
> Trying to figure this out...

[root@seventh perf]# perf script
ping 29170 [001] 14644.810782: probe_libc:inet_pton: (7f0bac71be40)
7f0bac71be40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
7f0bac6e82b4 getaddrinfo+0x124 (/usr/lib64/libc-2.26.so)
5585e10adf40 [unknown] (/usr/bin/ping)

[root@seventh perf]# perf report --mmaps
#  pid  tid ppid  comm
 00   -1 |swapper
 2917029170   -1 |ping
5585e10ab000-5585e12dd000 r-xp  2771393 
/usr/bin/ping
7f0babfcf000-7f0bac1ed000 r-xp  2762913 
/usr/lib64/libpthread-2.26.so
7f0bac1ed000-7f0bac3f1000 r-xp  2753363 
/usr/lib64/libdl-2.26.so
7f0bac3f1000-7f0bac608000 r-xp  2760078 
/usr/lib64/libz.so.1.2.11
7f0bac608000-7f0bac9be000 r-xp  2753359 
/usr/lib64/libc-2.26.so
7f0bac9be000-7f0bacd09000 r-xp  2762901 
/usr/lib64/libm-2.26.so
7f0bacd09000-7f0bacf2 r-xp  2762915 
/usr/lib64/libresolv-2.26.so
7f0bacf2-7f0bad3a8000 r-xp  2764213 
/usr/lib64/libcrypto.so.1.1.0h
7f0bad3a8000-7f0bad5dc000 r-xp  2761070 
/usr/lib64/libidn.so.11.6.18
7f0bad5dc000-7f0bad7e1000 r-xp  2760189 
/usr/lib64/libcap.so.2.25
7f0bad7e1000-7f0bada08000 r-xp  2753353 
/usr/lib64/ld-2.26.so
7ffe23d99000-7ffe23d9b000 r-xp  0 [vdso]
[root@seventh perf]# 

So it is there, but I don't have debuginfo for ping, lets see if I add
it...

[root@seventh perf]# rpm -qf `which ping`
iputils-20161105-7.fc27.x86_64
[root@seventh perf]# 
[root@seventh perf]# dnf debuginfo-install ipu

Re: [PATCH] perf tests: Fix regex for record+probe_libc_inet_pton.sh

2018-05-18 Thread Sandipan Das

Hi Arnaldo,

On 05/19/2018 12:51 AM, Arnaldo Carvalho de Melo wrote:
> Em Fri, May 18, 2018 at 12:54:17PM +0530, Sandipan Das escreveu:
>> This test currently fails because the regular expressions for
>> matching the output of perf script do not consider the symbol
>> offsets to be part of the output.
>>
>> The symbol offsets are seen because of the default behaviour
>> introduced by commit 4140d2ea74b3 ("perf script: Show symbol
>> offsets by default").
>>
>> Before applying this patch:
>>
>>   # perf test -v "probe libc's inet_pton & backtrace it with ping"
>>
>>   62: probe libc's inet_pton & backtrace it with ping   :
>>   --- start ---
>>   test child forked, pid 30389
>>   ping 30406 [002] 307144.280983: probe_libc:inet_pton: (7f4117adf220)
>>   7f4117adf220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)
>>   FAIL: expected backtrace entry 1 
>> ".*inet_pton[[:space:]]\(/usr/lib64/libc-2.25.so|inlined\)$" got 
>> "7f4117adf220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)"
>>   test child finished with -1
>>    end 
>>   probe libc's inet_pton & backtrace it with ping: FAILED!
>>
>> After applying this patch:
>>
>>   # perf test -v "probe libc's inet_pton & backtrace it with ping"
>>
>>   62: probe libc's inet_pton & backtrace it with ping   :
>>   --- start ---
>>   test child forked, pid 30539
>>   ping 30556 [003] 307254.313217: probe_libc:inet_pton: (7fe19ab10220)
>>   7fe19ab10220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)
>>   7fe19aad5ebd getaddrinfo+0x11d (/usr/lib64/libc-2.25.so)
>>   56351e3c1c71 main+0x891 (/usr/bin/ping)
>>   test child finished with 0
>>    end 
>>   probe libc's inet_pton & backtrace it with ping: Ok
> 
> Here it went from failing with:
> 
> [root@seventh ~]# perf test -v pton
> 64: probe libc's inet_pton & backtrace it with ping   :
> --- start ---
> test child forked, pid 22590
> ping 22607 [001] 12782.425689: probe_libc:inet_pton: (7f8686da4e40)
> 7f8686da4e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> FAIL: expected backtrace entry 1 
> ".*inet_pton[[:space:]]\(/usr/lib64/libc-2.26.so|inlined\)$" got 
> "7f8686da4e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)"
> test child finished with -1
>  end 
> probe libc's inet_pton & backtrace it with ping: FAILED!
> [root@seventh ~]# 
> 
> To failing with:
> 
> [root@seventh ~]# perf test -v pton
> 64: probe libc's inet_pton & backtrace it with ping   :
> --- start ---
> test child forked, pid 28954
> ping 28971 [002] 14277.711200: probe_libc:inet_pton: (7fc9d66e3e40)
> 7fc9d66e3e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
> 7fc9d66b02b4 getaddrinfo+0x124 (/usr/lib64/libc-2.26.so)
> 56075cb98f40 [unknown] (/usr/bin/ping)
> FAIL: expected backtrace entry 3 
> ".*\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$" got "56075cb98f40 
> [unknown] (/usr/bin/ping)"
> test child finished with -1
>  end 
> probe libc's inet_pton & backtrace it with ping: FAILED!
> [root@seventh ~]# 
> 
> Trying to figure this out...
> 

Looks like perf failed to resolve the symbol's name for the last entry
in the callchain. I did not consider this case. So, if this happens, we
would be better off using the original regex for the last line:

expected[3]=".*\(.*/bin/ping.*\)$"

- Sandipan

Re: [PATCH v2 1/2] i2c: core-smbus: fix a potential uninitialization bug

2018-05-18 Thread Wenwen Wang

Yes, this patch does not aim to "fix" all potential driver bugs but
adds an additional protection in case the implementation of
.master_xfer is incorrect.

>From this perspective, it is still necessary to apply this patch, as
pointed out by Peter.

Thanks,
Wenwen

On Mon, May 14, 2018 at 3:31 PM, Peter Rosin  wrote:
> On 2018-05-10 13:17, Wolfram Sang wrote:
>> On Sat, May 05, 2018 at 07:57:10AM -0500, Wenwen Wang wrote:
>>> In i2c_smbus_xfer_emulated(), there are two buffers: msgbuf0 and msgbuf1,
>>> which are used to save a series of messages, as mentioned in the comment.
>>> According to the value of the variable 'size', msgbuf0 is initialized to
>>> various values. In contrast, msgbuf1 is left uninitialized until the
>>> function i2c_transfer() is invoked. However, msgbuf1 is not always
>>> initialized on all possible execution paths (implementation) of
>>> i2c_transfer(). Thus, it is possible that msgbuf1 may still be
>>> uninitialized even after the invocation of the function i2c_transfer(),
>>> especially when the return value of ic2_transfer() is not checked properly.
>>> In the following execution, the uninitialized msgbuf1 will be used, such as
>>> for security checks. Since uninitialized values can be random and
>>> arbitrary, this will cause undefined behaviors or even check bypass. For
>>> example, it is expected that if the value of 'size' is
>>> I2C_SMBUS_BLOCK_PROC_CALL, the value of data->block[0] should not be larger
>>> than I2C_SMBUS_BLOCK_MAX. But, at the end of i2c_smbus_xfer_emulated(), the
>>> value read from msgbuf1 is assigned to data->block[0], which can
>>> potentially lead to invalid block write size, as demonstrated in the error
>>> message.
>>>
>>> This patch initializes the first byte of msgbuf1 with 0 to avoid such
>>> undefined behaviors or security issues.
>>>
>>> Signed-off-by: Wenwen Wang 
>>
>> From what I can tell, this patch is not needed anymore after patch 2 is
>> applied. Correct?
>
> AFAIU, it is only needed if there are bugs elsewhere. I.e. it's for extra
> protection. If all drivers implement .master_xfer correctly, msgbuf1 will
> be filled in and the return value will be the number of messages (i.e. 2)
> OR you get a negative return value and the msgbuf1 content will not matter.
>
> The patch does not magically fix all possible driver bugs, so in that
> sense this patch is still "needed".
>
> Also - again AFAIU - there is no known bug that actually gets caught by
> this extra check.
>
> Cheers,
> Peter

Re: [PATCH 4.16 00/55] 4.16.10-stable review

2018-05-18 Thread Naresh Kamboju

On 18 May 2018 at 13:44, Greg Kroah-Hartman  wrote:
> This is the start of the stable review cycle for the 4.16.10 release.
> There are 55 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun May 20 08:14:42 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.10-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.16.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm and x86_64.

Summary


kernel: 4.16.10-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.16.y
git commit: de63ee0383b7163aa4305ab30fd66d3e99ebfebe
git describe: v4.16.9-56-gde63ee0383b7
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.16-oe/build/v4.16.9-56-gde63ee0383b7


No regressions (compared to build v4.16.9-55-g2b695e770deb)

Boards, architectures and test suites:
-

dragonboard-410c - arm64
* boot - pass: 20, fail: 1,
* kselftest - pass: 40, fail: 1, skip: 26
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 14,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1017, skip: 133
* ltp-timers-tests - pass: 13,

hi6220-hikey - arm64
* boot - pass: 20,
* kselftest - pass: 45, skip: 22
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10, skip: 4
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1016, skip: 134
* ltp-timers-tests - pass: 13,

juno-r2 - arm64
* boot - pass: 21,
* kselftest - pass: 44, skip: 24
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10, skip: 4
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1017, skip: 133
* ltp-timers-tests - pass: 13,

qemu_arm
* boot - pass: 20,
* kselftest - pass: 37, skip: 31
* libhugetlbfs - pass: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 63, skip: 18
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 58, skip: 5
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 7, skip: 7
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1055, fail: 3, skip: 92
* ltp-timers-tests - pass: 13,

qemu_arm64
* boot - pass: 20,
* kselftest - pass: 41, skip: 29
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 8, skip: 6
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 993, fail: 2, skip: 155
* ltp-timers-tests - pass: 13,

qemu_x86_64
* boot - pass: 20,
* kselftest - pass: 50, skip: 3

Re: [PATCHv2][SMB3] Add kernel trace support

2018-05-18 Thread Ralph Böhme

On Thu, May 17, 2018 at 09:36:36PM -0500, Steve French via samba-technical 
wrote:
> Patch updated with additional tracepoint locations and some formatting
> improvements. There are some obvious additional tracepoints that could
> be added, but this should be a reasonable group to start with.
> 
> From edc02d6f9dc24963d510c7ef59067428d3b082d3 Mon Sep 17 00:00:00 2001
> From: Steve French 
> Date: Thu, 17 May 2018 21:16:55 -0500
> Subject: [PATCH] smb3: Add ftrace tracepoints for improved SMB3 debugging
> 
> Although dmesg logs and wireshark network traces can be
> helpful, being able to dynamically enable/disable tracepoints
> (in this case via the kernel ftrace mechanism) can also be
> helpful in more quickly debugging problems, and more
> selectively tracing the events related to the bug report.
> 
> This patch adds 12 ftrace tracepoints to cifs.ko for SMB3 events
> in some obvious locations.  Subsequent patches will add more
> as needed.
> 
> Example use:
>trace-cmd record -e cifs
>
>trace-cmd show

pardon my ignorance, but are these tracepoints usable with other tracing
frameworks like Systemtap?

Last time I checked, Systemtap looked like *the* tool. Is there a generic trace
point infrastructure that tracing tools can consume, so we're not tied to
ftrace?

Thanks!
-slow

-- 
Ralph Boehme, Samba Team   https://samba.org/
Samba Developer, SerNet GmbH   https://sernet.de/en/samba/
GPG Key Fingerprint:   FAE2 C608 8A24 2520 51C5
   59E4 AA1E 9B71 2639 9E46

Re: [PATCH v2] selftests/cgroup: memory controller self-tests

2018-05-18 Thread Shuah Khan

On 05/11/2018 02:22 PM, Shuah Khan wrote:
> On 05/11/2018 02:02 PM, Tejun Heo wrote:
>> On Fri, May 11, 2018 at 01:33:26PM -0600, Shuah Khan wrote:
>>> Yeah. I see that. You have a switch for the KSFT_ values. Since there is no
>>> dependency on the cgroup tree, I would recommend having this patch go 
>>> through
>>> kselftest tree which is the normal process for tests anyway.
>>>
>>> This version is good and I can apply this to linux-kselftest next. I ran a
>>> quick test and the Skip case looks good.
>>>
>>> TAP version 13
>>> selftests: cgroup: test_memcontrol
>>> 
>>> 1..0 # Skipped: memory controller isn't available
>>> not ok 1..1 selftests: cgroup: test_memcontrol [SKIP]
>>>
>>>
>>> Tejun! Please send me your Ack.
>>
>> Sure, please feel free to add my ack.  I'll revert the original patch
>> from cgroup tree.
>>
>> Thanks.
>>
> 
> Thanks. I will apply this to linux-kselftest next for 4.18-rc1 with
> you Ack.
> 
> thanks,
> -- Shuah
> 

Hi Roman,

Could you please send me the fix for

[bug,report] selftests: cgroup: add memory controller self-tests

It is cc'ed to kselftest mailing list and not to me. Please send
it directly for me to apply it.

thanks,
-- Shuah

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread hpa

On May 18, 2018 12:21:00 PM PDT, Linus Torvalds  
wrote:
>On Fri, May 18, 2018 at 12:18 PM Nadav Amit  wrote:
>
>> Gnu ASM manual says: "Each time you run as it assembles exactly one
>source
>> program. The source program is made up of one or more files.”
>
>Ok, that counts as documentation, although it's confusing as hell. Is
>it
>one source program or multiple files again? I see what they are trying
>to
>say, but it really could be phrased more clearly ;)
>
> Linus

At least we have a solution!
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [PATCH] kernel: sys: fix potential Spectre v1

2018-05-18 Thread Gustavo A. R. Silva




On 05/18/2018 02:04 PM, Gustavo A. R. Silva wrote:



On 05/15/2018 05:57 PM, Dan Williams wrote:
On Tue, May 15, 2018 at 3:29 PM, Thomas Gleixner  
wrote:

On Tue, 15 May 2018, Andrew Morton wrote:
On Mon, 14 May 2018 22:00:38 -0500 "Gustavo A. R. Silva" 
 wrote:



resource can be controlled by user-space, hence leading to a
potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential
spectre issue 'get_current()->signal->rlim' (local cap)
kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre 
issue

'get_current()->signal->rlim' (local cap)

Fix this by sanitizing *resource* before using it to index
current->signal->rlim

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].


hm.  Not my area, but I'm always willing to learn ;)


--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -69,6 +69,9 @@
  #include 
  #include 

+/* Hardening for Spectre-v1 */
+#include 
+
  #include "uid16.h"

  #ifndef SET_UNALIGN_CTL
@@ -1451,6 +1454,7 @@ SYSCALL_DEFINE2(old_getrlimit, unsigned int, 
resource,

 if (resource >= RLIM_NLIMITS)
 return -EINVAL;

+   resource = array_index_nospec(resource, RLIM_NLIMITS);
 task_lock(current->group_leader);
 x = current->signal->rlim[resource];


Can the speculation proceed past the task_lock()?  Or is the policy to
ignore such happy happenstances even if they are available?


Locks are not in the way of speculation. Speculation has almost no 
limits

except serializing instructions. At least they respect the magic AND
limitation in array_index_nospec().


I'd say it another way, because they don't respect the magic AND, we
just make the result in the speculation path safe. So, it's controlled
speculation.



Dan,

What do you think about adding the following function to the nospec API:

diff --git a/include/linux/nospec.h b/include/linux/nospec.h
index e791ebc..81e9a77 100644
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -55,4 +55,17 @@ static inline unsigned long 
array_index_mask_nospec(unsigned long index,

     \
     (typeof(_i)) (_i & _mask);  \
  })
+
+
+#ifndef sanitize_index_nospec
+inline bool sanitize_index_nospec(unsigned long index,
+ unsigned long size)
+{
+   if (index >= size)
+   return false;
+   index = array_index_nospec(index, size);
+
+   return true;
+}
+#endif
  #endif /* _LINUX_NOSPEC_H */



Oops, it seems I sent the wrong patch. The function would look like this:

#ifndef sanitize_index_nospec
inline bool sanitize_index_nospec(unsigned long *index,
  unsigned long size)
{
if (*index >= size)
return false;
*index = array_index_nospec(*index, size);

return true;
}
#endif

Thanks
--
Gustavo

Re: [PATCH] perf tests: Fix regex for record+probe_libc_inet_pton.sh

2018-05-18 Thread Arnaldo Carvalho de Melo

Em Fri, May 18, 2018 at 12:54:17PM +0530, Sandipan Das escreveu:
> This test currently fails because the regular expressions for
> matching the output of perf script do not consider the symbol
> offsets to be part of the output.
> 
> The symbol offsets are seen because of the default behaviour
> introduced by commit 4140d2ea74b3 ("perf script: Show symbol
> offsets by default").
> 
> Before applying this patch:
> 
>   # perf test -v "probe libc's inet_pton & backtrace it with ping"
> 
>   62: probe libc's inet_pton & backtrace it with ping   :
>   --- start ---
>   test child forked, pid 30389
>   ping 30406 [002] 307144.280983: probe_libc:inet_pton: (7f4117adf220)
>   7f4117adf220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)
>   FAIL: expected backtrace entry 1 
> ".*inet_pton[[:space:]]\(/usr/lib64/libc-2.25.so|inlined\)$" got 
> "7f4117adf220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)"
>   test child finished with -1
>    end 
>   probe libc's inet_pton & backtrace it with ping: FAILED!
> 
> After applying this patch:
> 
>   # perf test -v "probe libc's inet_pton & backtrace it with ping"
> 
>   62: probe libc's inet_pton & backtrace it with ping   :
>   --- start ---
>   test child forked, pid 30539
>   ping 30556 [003] 307254.313217: probe_libc:inet_pton: (7fe19ab10220)
>   7fe19ab10220 __GI___inet_pton+0x0 (/usr/lib64/libc-2.25.so)
>   7fe19aad5ebd getaddrinfo+0x11d (/usr/lib64/libc-2.25.so)
>   56351e3c1c71 main+0x891 (/usr/bin/ping)
>   test child finished with 0
>    end 
>   probe libc's inet_pton & backtrace it with ping: Ok

Here it went from failing with:

[root@seventh ~]# perf test -v pton
64: probe libc's inet_pton & backtrace it with ping   :
--- start ---
test child forked, pid 22590
ping 22607 [001] 12782.425689: probe_libc:inet_pton: (7f8686da4e40)
7f8686da4e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
FAIL: expected backtrace entry 1 
".*inet_pton[[:space:]]\(/usr/lib64/libc-2.26.so|inlined\)$" got "7f8686da4e40 
__GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)"
test child finished with -1
 end 
probe libc's inet_pton & backtrace it with ping: FAILED!
[root@seventh ~]# 

To failing with:

[root@seventh ~]# perf test -v pton
64: probe libc's inet_pton & backtrace it with ping   :
--- start ---
test child forked, pid 28954
ping 28971 [002] 14277.711200: probe_libc:inet_pton: (7fc9d66e3e40)
7fc9d66e3e40 __GI___inet_pton+0x0 (/usr/lib64/libc-2.26.so)
7fc9d66b02b4 getaddrinfo+0x124 (/usr/lib64/libc-2.26.so)
56075cb98f40 [unknown] (/usr/bin/ping)
FAIL: expected backtrace entry 3 
".*\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$" got "56075cb98f40 [unknown] 
(/usr/bin/ping)"
test child finished with -1
 end 
probe libc's inet_pton & backtrace it with ping: FAILED!
[root@seventh ~]# 

Trying to figure this out...

- Arnaldo

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Linus Torvalds

On Fri, May 18, 2018 at 12:18 PM Nadav Amit  wrote:

> Gnu ASM manual says: "Each time you run as it assembles exactly one source
> program. The source program is made up of one or more files.”

Ok, that counts as documentation, although it's confusing as hell. Is it
one source program or multiple files again? I see what they are trying to
say, but it really could be phrased more clearly ;)

 Linus

Re: [PATCH] selftest: intel_pstate: debug support message from aperf.c and return value fix.

2018-05-18 Thread Shuah Khan

On 05/18/2018 08:05 AM, Jeffrin Jose T wrote:
> Additional message along with an error message which is more
> verbose for debug support from aperf.c and updated with the
> new return value "KSFT_SKIP".
> 
> Signed-off-by: Jeffrin Jose T 
> ---
>  tools/testing/selftests/intel_pstate/aperf.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/intel_pstate/aperf.c 
> b/tools/testing/selftests/intel_pstate/aperf.c
> index d21edea9c560..f6cd03a87493 100644
> --- a/tools/testing/selftests/intel_pstate/aperf.c
> +++ b/tools/testing/selftests/intel_pstate/aperf.c
> @@ -9,6 +9,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include "../kselftest.h"
>  
>  void usage(char *name) {
>   printf ("Usage: %s cpunum\n", name);
> @@ -41,8 +43,8 @@ int main(int argc, char **argv) {
>   fd = open(msr_file_name, O_RDONLY);
>  
>   if (fd == -1) {
> - perror("Failed to open");
> - return 1;
> + printf("/dev/cpu/%d/msr: %s\n", cpu, strerror(errno));
> + return KSFT_SKIP;
>   }
>  
>   CPU_ZERO(&cpuset);
> 

Thanks. I will queue this up for 4.18-rc1

thanks,
-- Shuah

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Nadav Amit

Linus Torvalds  wrote:

> On Fri, May 18, 2018 at 12:02 PM Nadav Amit  wrote:
> 
>> I can add a -Wa,[filename.s] switch. It works, but sort of undocumented.
> 
> Oh, if it assembles things together, then that sounds optimal.
> 
> And yes, like hpa says, we should make sure that behavior is acknowledged
> by the GNU as people, so that they then don't come back and say "hey, now
> we assemble things as separate units".
> 
> That said, the "separate units" model really doesn't make much sense now
> that I think about it, because 'as' supports only one output file. So I
> guess the as people wouldn't have any issues with just accepting the
> "concatenate all input files" syntax.

Gnu ASM manual says: "Each time you run as it assembles exactly one source
program. The source program is made up of one or more files.”

[PATCH v4 2/2] vfio/mdev: Re-order sysfs attribute creation

2018-05-18 Thread Alex Williamson

There exists a gap at the end of mdev_device_create() where the device
is visible to userspace, but we're not yet ready to handle removal, as
triggered through the 'remove' attribute.  We handle this properly in
mdev_device_remove() with an -EAGAIN return, but we can marginally
reduce this gap by adding this attribute as a final step of our sysfs
setup.

Signed-off-by: Alex Williamson 
---
 drivers/vfio/mdev/mdev_sysfs.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
index 802df210929b..249472f05509 100644
--- a/drivers/vfio/mdev/mdev_sysfs.c
+++ b/drivers/vfio/mdev/mdev_sysfs.c
@@ -257,24 +257,24 @@ int  mdev_create_sysfs_files(struct device *dev, struct 
mdev_type *type)
 {
int ret;
 
-   ret = sysfs_create_files(&dev->kobj, mdev_device_attrs);
-   if (ret)
-   return ret;
-
ret = sysfs_create_link(type->devices_kobj, &dev->kobj, dev_name(dev));
if (ret)
-   goto device_link_failed;
+   return ret;
 
ret = sysfs_create_link(&dev->kobj, &type->kobj, "mdev_type");
if (ret)
goto type_link_failed;
 
+   ret = sysfs_create_files(&dev->kobj, mdev_device_attrs);
+   if (ret)
+   goto create_files_failed;
+
return ret;
 
+create_files_failed:
+   sysfs_remove_link(&dev->kobj, "mdev_type");
 type_link_failed:
sysfs_remove_link(type->devices_kobj, dev_name(dev));
-device_link_failed:
-   sysfs_remove_files(&dev->kobj, mdev_device_attrs);
return ret;
 }

[PATCH v4 1/2] vfio/mdev: Check globally for duplicate devices

2018-05-18 Thread Alex Williamson

When we create an mdev device, we check for duplicates against the
parent device and return -EEXIST if found, but the mdev device
namespace is global since we'll link all devices from the bus.  We do
catch this later in sysfs_do_create_link_sd() to return -EEXIST, but
with it comes a kernel warning and stack trace for trying to create
duplicate sysfs links, which makes it an undesirable response.

Therefore we should really be looking for duplicates across all mdev
parent devices, or as implemented here, against our mdev device list.
Using mdev_list to prevent duplicates means that we can remove
mdev_parent.lock, but in order not to serialize mdev device creation
and removal globally, we add mdev_device.active which allows UUIDs to
be reserved such that we can drop the mdev_list_lock before the mdev
device is fully in place.

Two behavioral notes; first, mdev_parent.lock had the side-effect of
serializing mdev create and remove ops per parent device.  This was
an implementation detail, not an intentional guarantee provided to
the mdev vendor drivers.  Vendor drivers can trivially provide this
serialization internally if necessary.  Second, review comments note
the new -EAGAIN behavior when the device, and in particular the remove
attribute, becomes visible in sysfs.  If a remove is triggered prior
to completion of mdev_device_create() the user will see a -EAGAIN
error.  While the errno is different, receiving an error during this
period is not, the previous implementation returned -ENODEV for the
same condition.  Furthermore, the consistency to the user is improved
in the case where mdev_device_remove_ops() returns error.  Previously
concurrent calls to mdev_device_remove() could see the device
disappear with -ENODEV and return in the case of error.  Now a user
would see -EAGAIN while the device is in this transitory state.

Signed-off-by: Alex Williamson 
---
 Documentation/vfio-mediated-device.txt |5 ++
 drivers/vfio/mdev/mdev_core.c  |  102 +++-
 drivers/vfio/mdev/mdev_private.h   |2 -
 3 files changed, 42 insertions(+), 67 deletions(-)

diff --git a/Documentation/vfio-mediated-device.txt 
b/Documentation/vfio-mediated-device.txt
index 1b3950346532..c3f69bcaf96e 100644
--- a/Documentation/vfio-mediated-device.txt
+++ b/Documentation/vfio-mediated-device.txt
@@ -145,6 +145,11 @@ The functions in the mdev_parent_ops structure are as 
follows:
 * create: allocate basic resources in a driver for a mediated device
 * remove: free resources in a driver when a mediated device is destroyed
 
+(Note that mdev-core provides no implicit serialization of create/remove
+callbacks per mdev parent device, per mdev type, or any other categorization.
+Vendor drivers are expected to be fully asynchronous in this respect or
+provide their own internal resource protection.)
+
 The callbacks in the mdev_parent_ops structure are as follows:
 
 * open: open callback of mediated device
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index 126991046eb7..0212f0ee8aea 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev)
 }
 EXPORT_SYMBOL(mdev_uuid);
 
-static int _find_mdev_device(struct device *dev, void *data)
-{
-   struct mdev_device *mdev;
-
-   if (!dev_is_mdev(dev))
-   return 0;
-
-   mdev = to_mdev_device(dev);
-
-   if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0)
-   return 1;
-
-   return 0;
-}
-
-static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid)
-{
-   struct device *dev;
-
-   dev = device_find_child(parent->dev, &uuid, _find_mdev_device);
-   if (dev) {
-   put_device(dev);
-   return true;
-   }
-
-   return false;
-}
-
 /* Should be called holding parent_list_lock */
 static struct mdev_parent *__find_parent_device(struct device *dev)
 {
@@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const struct 
mdev_parent_ops *ops)
}
 
kref_init(&parent->ref);
-   mutex_init(&parent->lock);
 
parent->dev = dev;
parent->ops = ops;
@@ -297,6 +268,10 @@ static void mdev_device_release(struct device *dev)
 {
struct mdev_device *mdev = to_mdev_device(dev);
 
+   mutex_lock(&mdev_list_lock);
+   list_del(&mdev->next);
+   mutex_unlock(&mdev_list_lock);
+
dev_dbg(&mdev->dev, "MDEV: destroying\n");
kfree(mdev);
 }
@@ -304,7 +279,7 @@ static void mdev_device_release(struct device *dev)
 int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
 {
int ret;
-   struct mdev_device *mdev;
+   struct mdev_device *mdev, *tmp;
struct mdev_parent *parent;
struct mdev_type *type = to_mdev_type(kobj);
 
@@ -312,21 +287,28 @@ int mdev_device_create(struct kobject *kobj, struct 
device *dev, uuid_le uuid)
if (!parent)

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Linus Torvalds

On Fri, May 18, 2018 at 12:02 PM Nadav Amit  wrote:

> I can add a -Wa,[filename.s] switch. It works, but sort of undocumented.

Oh, if it assembles things together, then that sounds optimal.

And yes, like hpa says, we should make sure that behavior is acknowledged
by the GNU as people, so that they then don't come back and say "hey, now
we assemble things as separate units".

That said, the "separate units" model really doesn't make much sense now
that I think about it, because 'as' supports only one output file. So I
guess the as people wouldn't have any issues with just accepting the
"concatenate all input files" syntax.

  Linus

Re: [PATCH 10/10] Dynamic fault injection

2018-05-18 Thread Kent Overstreet

On Fri, May 18, 2018 at 01:05:20PM -0600, Andreas Dilger wrote:
> On May 18, 2018, at 1:49 AM, Kent Overstreet  
> wrote:
> > 
> > Signed-off-by: Kent Overstreet 
> 
> I agree with Christoph that even if there was some explanation in the cover
> letter, there should be something at least as good in the patch itself.  The
> cover letter is not saved, but the commit stays around forever, and should
> explain how this should be added to code, and how to use it from userspace.
> 
> 
> That said, I think this is a useful functionality.  We have something similar
> in Lustre (OBD_FAIL_CHECK() and friends) that is necessary for being able to
> test a distributed filesystem, which is just a CPP macro with an unlikely()
> branch, while this looks more sophisticated.  This looks like it has some
> added functionality like having more than one fault enabled at a time.
> If this lands we could likely switch our code over to using this.

This is pretty much what I was looking for, I just wanted to know if this patch
was interesting enough to anyone that I should spend more time on it or just
drop it :) Agreed on documentation. I think it's also worth factoring out the
functionality for the elf section trick that dynamic debug uses too.

> Some things that are missing from this patch that is in our code:
> 
> - in addition to the basic "enabled" and "oneshot" mechanisms, we have:
>   - timeout: sleep for N msec to simulate network/disk/locking delays
>   - race: wait with one thread until a second thread hits matching check
> 
> We also have a "fail_val" that allows making the check conditional (e.g.
> only operation on server "N" should fail, only RPC opcode "N", etc).

Those all sound like good ideas... fail_val especially, I think with that we'd
have all the functionality the existing fault injection framework has (which is
way to heavyweight to actually get used, imo)

Re: [PATCH 4.14 00/45] 4.14.42-stable review

2018-05-18 Thread Naresh Kamboju

On 18 May 2018 at 13:45, Greg Kroah-Hartman  wrote:
> This is the start of the stable review cycle for the 4.14.42 release.
> There are 45 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun May 20 08:15:14 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.42-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.14.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm and x86_64.

Summary


kernel: 4.14.42-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.14.y
git commit: 879085a4c525940c6c3b08d6a2f49fc064bf02a8
git describe: v4.14.41-46-g879085a4c525
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.41-46-g879085a4c525


No regressions (compared to build v4.14.41-44-g1134b348cdb5)

Boards, architectures and test suites:
-

dragonboard-410c
* boot - pass: 20
* kselftest - skip: 27, pass: 41
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 6, pass: 57
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 1, pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 14
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 134, pass: 1016
* ltp-timers-tests - pass: 13

hi6220-hikey - arm64
* boot - pass: 20
* kselftest - skip: 23, pass: 44
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 6, pass: 57
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 1, pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 4, pass: 10
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 135, pass: 1015
* ltp-timers-tests - pass: 13

juno-r2 - arm64
* boot - pass: 20
* kselftest - skip: 25, pass: 43
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 6, pass: 57
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 4, pass: 10
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 134, pass: 1016
* ltp-timers-tests - pass: 13

qemu_arm
* boot - pass: 21
* kselftest - fail: 2, skip: 29, pass: 37
* libhugetlbfs - pass: 2
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 18, pass: 63
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 5, pass: 58
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 1, pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 7, pass: 7
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - fail: 3, skip: 92, pass: 1055
* ltp-timers-tests - pass: 13

qemu_arm64
* boot - pass: 21
* kselftest - skip: 29, pass: 41
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 6, pass: 57
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 6, pass: 8
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - fail: 2, skip: 156, pass: 992
* ltp-timers-tests - pass: 13

qemu_x86_64
* boot - pass: 20
* kselftest - skip: 30, pass: 50
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-te

[PATCH v4 0/2] vfio/mdev: Device namespace protection

2018-05-18 Thread Alex Williamson

v4: Fix the 'create' racing 'remove' gap noted by Kirti by moving
removal from mdev_list to mdev_device_release().  Fix missing
mdev_put_parent() cases in mdev_device_create(), also noted
by Kirti.  Added documention update regarding serialization as
noted by Cornelia.  Added additional commit log comment about
-EAGAIN vs -ENODEV for 'remove' racing 'create'.  Added second
patch to re-order sysfs attributes, with this my targeted
scripts can no longer hit the gap where -EAGAIN is regurned.
BTW, the gap where the current code returns -ENODEV in this
race condition is about 50% easier to hit than it exists in
this series with patch 1 alone.

Thanks,
Alex

---

Alex Williamson (2):
  vfio/mdev: Check globally for duplicate devices
  vfio/mdev: Re-order sysfs attribute creation


 Documentation/vfio-mediated-device.txt |5 ++
 drivers/vfio/mdev/mdev_core.c  |  102 +++-
 drivers/vfio/mdev/mdev_private.h   |2 -
 drivers/vfio/mdev/mdev_sysfs.c |   14 ++--
 4 files changed, 49 insertions(+), 74 deletions(-)

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread hpa

On May 18, 2018 12:02:50 PM PDT, Nadav Amit  wrote:
>h...@zytor.com wrote:
>
>> On May 18, 2018 11:50:12 AM PDT, Linus Torvalds
> wrote:
>>> On Fri, May 18, 2018 at 11:34 AM  wrote:
>>> 
 On May 18, 2018 11:25:32 AM PDT, Linus Torvalds <
>>> torva...@linux-foundation.org> wrote:
>>> 
 Unfortunately gcc doesn't guarantee that global assembly inlines
>will
>>> appear at the top of the file.
>>> 
>>> Yeah. It really would be better to do the "asm version of -inline".
>>> 
>>> We already do something like that for real *.S files on some
>>> architectures
>>> (because their assembly really wants it, eg
>>> 
>>> arch/arm/Makefile:
>>> KBUILD_AFLAGS +=$(CFLAGS_ABI) $(AFLAGS_ISA) $(arch-y) $(tune-y)
>>> -include
>>> asm/unified.h -msoft-float
>>> 
>>> but I do want to point out that KBUILD_AFLAGS is *not* used for
>>> compiler-generated assembly, only for actual *.S files.
>>> 
>>> Sadly, I don't actually know any way to make gcc call the 'as' phase
>>> with
>>> particular options. We can use "-Wa,xyzzy" to pass in xyzzy to the
>>> assembler, but there is no "-include" option for GNU as afaik.
>>> 
>>> Can you perhaps define a macro symbol for "--defsym"? Probably not.
>>> 
>>>  Linus
>> 
>> I looked at this thing a long time ago; it's not there, and the best
>would probably be to get global asm() working properly in gcc.
>
>I can add a -Wa,[filename.s] switch. It works, but sort of
>undocumented.

Ah, I thought that would have assembled each file separately. We can request it 
be documented, that's usually much easier than getting a new feature 
implemented.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [PATCH 10/10] Dynamic fault injection

2018-05-18 Thread Andreas Dilger

On May 18, 2018, at 1:49 AM, Kent Overstreet  wrote:
> 
> Signed-off-by: Kent Overstreet 

I agree with Christoph that even if there was some explanation in the cover
letter, there should be something at least as good in the patch itself.  The
cover letter is not saved, but the commit stays around forever, and should
explain how this should be added to code, and how to use it from userspace.


That said, I think this is a useful functionality.  We have something similar
in Lustre (OBD_FAIL_CHECK() and friends) that is necessary for being able to
test a distributed filesystem, which is just a CPP macro with an unlikely()
branch, while this looks more sophisticated.  This looks like it has some
added functionality like having more than one fault enabled at a time.
If this lands we could likely switch our code over to using this.


Some things that are missing from this patch that is in our code:

- in addition to the basic "enabled" and "oneshot" mechanisms, we have:
  - timeout: sleep for N msec to simulate network/disk/locking delays
  - race: wait with one thread until a second thread hits matching check

We also have a "fail_val" that allows making the check conditional (e.g.
only operation on server "N" should fail, only RPC opcode "N", etc).

Cheers, Andreas

> ---
> include/asm-generic/vmlinux.lds.h |   4 +
> include/linux/dynamic_fault.h | 117 +
> lib/Kconfig.debug |   5 +
> lib/Makefile  |   2 +
> lib/dynamic_fault.c   | 760 ++
> 5 files changed, 888 insertions(+)
> create mode 100644 include/linux/dynamic_fault.h
> create mode 100644 lib/dynamic_fault.c
> 
> diff --git a/include/asm-generic/vmlinux.lds.h 
> b/include/asm-generic/vmlinux.lds.h
> index 1ab0e520d6..a4c9dfcbbd 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -246,6 +246,10 @@
>   VMLINUX_SYMBOL(__start___verbose) = .;  \
>   KEEP(*(__verbose))  \
>   VMLINUX_SYMBOL(__stop___verbose) = .;   \
> + . = ALIGN(8);   \
> + VMLINUX_SYMBOL(__start___faults) = .;   \
> + *(__faults) \
> + VMLINUX_SYMBOL(__stop___faults) = .;\
>   LIKELY_PROFILE()\
>   BRANCH_PROFILE()\
>   TRACE_PRINTKS() \
> diff --git a/include/linux/dynamic_fault.h b/include/linux/dynamic_fault.h
> new file mode 100644
> index 00..6e7bb56ae8
> --- /dev/null
> +++ b/include/linux/dynamic_fault.h
> @@ -0,0 +1,117 @@
> +#ifndef _DYNAMIC_FAULT_H
> +#define _DYNAMIC_FAULT_H
> +
> +#include 
> +#include 
> +#include 
> +
> +enum dfault_enabled {
> + DFAULT_DISABLED,
> + DFAULT_ENABLED,
> + DFAULT_ONESHOT,
> +};
> +
> +union dfault_state {
> + struct {
> + unsignedenabled:2;
> + unsignedcount:30;
> + };
> +
> + struct {
> + unsignedv;
> + };
> +};
> +
> +/*
> + * An instance of this structure is created in a special
> + * ELF section at every dynamic fault callsite.  At runtime,
> + * the special section is treated as an array of these.
> + */
> +struct _dfault {
> + const char  *modname;
> + const char  *function;
> + const char  *filename;
> + const char  *class;
> +
> + const u16   line;
> +
> + unsignedfrequency;
> + union dfault_state  state;
> +
> + struct static_key   enabled;
> +} __aligned(8);
> +
> +
> +#ifdef CONFIG_DYNAMIC_FAULT
> +
> +int dfault_add_module(struct _dfault *tab, unsigned int n, const char *mod);
> +int dfault_remove_module(char *mod_name);
> +bool __dynamic_fault_enabled(struct _dfault *);
> +
> +#define dynamic_fault(_class)
> \
> +({   \
> + static struct _dfault descriptor\
> + __used __aligned(8) __attribute__((section("__faults"))) = {\
> + .modname= KBUILD_MODNAME,   \
> + .function   = __func__, \
> + .filename   = __FILE__, \
> + .line   = __LINE__, \
> + .class  = _class,   \
> + };  \
> + \
> + static_key_false(&descriptor.enabled) &&\

Re: [PATCH] kernel: sys: fix potential Spectre v1

2018-05-18 Thread Gustavo A. R. Silva




On 05/15/2018 05:57 PM, Dan Williams wrote:

On Tue, May 15, 2018 at 3:29 PM, Thomas Gleixner  wrote:

On Tue, 15 May 2018, Andrew Morton wrote:

On Mon, 14 May 2018 22:00:38 -0500 "Gustavo A. R. Silva" 
 wrote:


resource can be controlled by user-space, hence leading to a
potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential
spectre issue 'get_current()->signal->rlim' (local cap)
kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre issue
'get_current()->signal->rlim' (local cap)

Fix this by sanitizing *resource* before using it to index
current->signal->rlim

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].


hm.  Not my area, but I'm always willing to learn ;)


--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -69,6 +69,9 @@
  #include 
  #include 

+/* Hardening for Spectre-v1 */
+#include 
+
  #include "uid16.h"

  #ifndef SET_UNALIGN_CTL
@@ -1451,6 +1454,7 @@ SYSCALL_DEFINE2(old_getrlimit, unsigned int, resource,
 if (resource >= RLIM_NLIMITS)
 return -EINVAL;

+   resource = array_index_nospec(resource, RLIM_NLIMITS);
 task_lock(current->group_leader);
 x = current->signal->rlim[resource];


Can the speculation proceed past the task_lock()?  Or is the policy to
ignore such happy happenstances even if they are available?


Locks are not in the way of speculation. Speculation has almost no limits
except serializing instructions. At least they respect the magic AND
limitation in array_index_nospec().


I'd say it another way, because they don't respect the magic AND, we
just make the result in the speculation path safe. So, it's controlled
speculation.



Dan,

What do you think about adding the following function to the nospec API:

diff --git a/include/linux/nospec.h b/include/linux/nospec.h
index e791ebc..81e9a77 100644
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -55,4 +55,17 @@ static inline unsigned long 
array_index_mask_nospec(unsigned long index,

\
(typeof(_i)) (_i & _mask);  \
 })
+
+
+#ifndef sanitize_index_nospec
+inline bool sanitize_index_nospec(unsigned long index,
+ unsigned long size)
+{
+   if (index >= size)
+   return false;
+   index = array_index_nospec(index, size);
+
+   return true;
+}
+#endif
 #endif /* _LINUX_NOSPEC_H */


And here is an example of its use:

diff --git a/drivers/media/i2c/tvp7002.c b/drivers/media/i2c/tvp7002.c
index 4599b7e..27b39c0 100644
--- a/drivers/media/i2c/tvp7002.c
+++ b/drivers/media/i2c/tvp7002.c
@@ -35,6 +35,8 @@
 #include 
 #include 

+#include 
+
 #include "tvp7002_reg.h"

 MODULE_DESCRIPTION("TI TVP7002 Video and Graphics Digitizer driver");
@@ -784,7 +786,7 @@ static int tvp7002_enum_dv_timings(struct 
v4l2_subdev *sd,

return -EINVAL;

/* Check requested format index is within range */
-   if (timings->index >= NUM_TIMINGS)
+   if (!sanitize_index_nospec(timings->index, NUM_TIMINGS))
return -EINVAL;

timings->timings = tvp7002_timings[timings->index].timings;

This patter is very common. So, it may be a good idea to unify both 
bounds checking and the serialization of instructions into a single 
function.


What do you think?

Thanks
--
Gustavo

Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices

2018-05-18 Thread Kirti Wankhede



On 5/18/2018 11:00 PM, Alex Williamson wrote:
> On Fri, 18 May 2018 12:34:03 +0530
> Kirti Wankhede  wrote:
> 
>> On 5/18/2018 3:07 AM, Alex Williamson wrote:
>>> On Fri, 18 May 2018 01:56:50 +0530
>>> Kirti Wankhede  wrote:
>>>   
 On 5/17/2018 9:50 PM, Alex Williamson wrote:  
> On Thu, 17 May 2018 21:25:22 +0530
> Kirti Wankhede  wrote:
> 
>> On 5/17/2018 1:39 PM, Cornelia Huck wrote:
>>> On Wed, 16 May 2018 21:30:19 -0600
>>> Alex Williamson  wrote:
>>>   
 When we create an mdev device, we check for duplicates against the
 parent device and return -EEXIST if found, but the mdev device
 namespace is global since we'll link all devices from the bus.  We do
 catch this later in sysfs_do_create_link_sd() to return -EEXIST, but
 with it comes a kernel warning and stack trace for trying to create
 duplicate sysfs links, which makes it an undesirable response.

 Therefore we should really be looking for duplicates across all mdev
 parent devices, or as implemented here, against our mdev device list.
 Using mdev_list to prevent duplicates means that we can remove
 mdev_parent.lock, but in order not to serialize mdev device creation
 and removal globally, we add mdev_device.active which allows UUIDs to
 be reserved such that we can drop the mdev_list_lock before the mdev
 device is fully in place.

 NB. there was never intended to be any serialization guarantee
 provided by the mdev core with respect to creation and removal of mdev
 devices, mdev_parent.lock provided this only as a side-effect of the
 implementation for locking the namespace per parent.  That
 serialization is now removed.  
>>>   
>>
>> mdev_parent.lock is to serialize create and remove of that mdev device,
>> that handles race condition that Cornelia mentioned below.
>
> Previously it was stated:
>
> On Thu, 17 May 2018 01:01:40 +0530
> Kirti Wankhede  wrote:
>> Here lock is not for create/remove routines of vendor driver, its about
>> mdev device creation and device registration, which is a common code
>> path, and so is part of mdev core module.
>
> So the race condition was handled previously, but as a side-effect of
> protecting the namespace, aiui.  I'm trying to state above that the
> serialization of create/remove was never intended as a guarantee
> provided to mdev vendor drivers.  I don't see that there's a need to
> protect "mdev device creation and device registration" beyond conflicts
> in the UUID namespace, which is done here.  Are there others?
> 

 Sorry not being elaborative in my earlier response to
  
> If we can
> show that vendor drivers handle the create/remove paths themselves,
> perhaps we can refine the locking granularity.

 mdev_device_create() function does :
 - create mdev device
 - register device
 - call vendor driver->create
 - create sysfs files.

 mdev_device_remove() removes sysfs files, unregister device and delete
 device.

 There is common code in mdev_device_create() and mdev_device_remove()
 independent of what vendor driver does in its create()/remove()
 callback. Moving this code to each vendor driver to handle create/remove
 themselves doesn't make sense to me.  
>>>
>>> I don't see where anyone is suggesting that, I'm not.
>>>
 mdev_parent.lock here does take care of race conditions that could occur
 during mdev device creation and deletion in this common code path.  
>>>
>>> Exactly what races in the common code path is mdev_parent.lock
>>> preventing?  mdev_device_create() calls:
>>>
>>> device_register()
>>> mdev_device_create_ops()
>>>   parent->ops->create()
>>>   sysfs_create_groups()
>>> mdev_create_sysfs_files()
>>>   sysfs_create_files()
>>>   sysfs_create_link()
>>>   sysfs_create_link()
>>>
>>> mdev_parent.lock is certainly not serializing all calls across the
>>> entire kernel to device_register and sysfs_create_{groups,files,link}
>>> so what is it protecting other than serializing parent->ops->create()?
>>> Locks protect data, not code.  The data we're protecting is the shared
>>> mdev_list, there is no shared data once mdev_device_create() has its
>>> mdev_device with uuid reservation placed into that list.
>>>   
> 
> Thank you for enumerating these points below.
> 
>> This lock prevents race condition that could occur due to sysfs entries
>> 1. between write on 'create' and 'remove' sysfs file of mdev device
>>   As per current code without lock, mdev_create_sysfs_files() creates
>> 'remove' sysfs, but before adding this mdev device to mdev_list, if
>> 'remove' is called, that would return -ENODEV even if the device is seen
>> in sysfs
> 
> mdev_parent.lock doesn't play a factor here

Re: [PATCH 4.9 00/33] 4.9.101-stable review

2018-05-18 Thread Naresh Kamboju

On 18 May 2018 at 13:45, Greg Kroah-Hartman  wrote:
> This is the start of the stable review cycle for the 4.9.101 release.
> There are 33 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun May 20 08:15:20 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.101-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm and x86_64.

Summary


kernel: 4.9.101-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.9.y
git commit: b0935d42e12ff5b9efbdded322b1bb7187705bd3
git describe: v4.9.100-35-gb0935d42e12f
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.100-35-gb0935d42e12f


No regressions (compared to build v4.9.100-34-g16c619d3e043)

Boards, architectures and test suites:
-

dragonboard-410c
* boot - pass: 20,
* kselftest - pass: 34, skip: 32
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 14,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1015, skip: 135
* ltp-timers-tests - pass: 13,

hi6220-hikey - arm64
* boot - pass: 20,
* kselftest - pass: 35, skip: 28
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10, skip: 4
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1014, skip: 136
* ltp-timers-tests - pass: 13,

juno-r2 - arm64
* boot - pass: 20,
* kselftest - pass: 37, skip: 29
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10, skip: 4
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1015, skip: 135
* ltp-timers-tests - pass: 13,

qemu_arm
* boot - pass: 20,
* kselftest - pass: 34, skip: 32
* libhugetlbfs - pass: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 63, skip: 18
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 58, skip: 5
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 7, skip: 7
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1056, fail: 2, skip: 92
* ltp-timers-tests - pass: 13,

qemu_arm64
* boot - pass: 20, fail: 1,
* kselftest - pass: 35, skip: 33
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 8, skip: 6
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 991, fail: 2, skip: 157
* ltp-timers-tests - pass: 13,

qemu_x86_64
* boot - pass: 20,
* kselftest - pass: 47, skip: 33
* libhugetlbfs

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Nadav Amit

h...@zytor.com wrote:

> On May 18, 2018 11:50:12 AM PDT, Linus Torvalds 
>  wrote:
>> On Fri, May 18, 2018 at 11:34 AM  wrote:
>> 
>>> On May 18, 2018 11:25:32 AM PDT, Linus Torvalds <
>> torva...@linux-foundation.org> wrote:
>> 
>>> Unfortunately gcc doesn't guarantee that global assembly inlines will
>> appear at the top of the file.
>> 
>> Yeah. It really would be better to do the "asm version of -inline".
>> 
>> We already do something like that for real *.S files on some
>> architectures
>> (because their assembly really wants it, eg
>> 
>> arch/arm/Makefile:
>> KBUILD_AFLAGS +=$(CFLAGS_ABI) $(AFLAGS_ISA) $(arch-y) $(tune-y)
>> -include
>> asm/unified.h -msoft-float
>> 
>> but I do want to point out that KBUILD_AFLAGS is *not* used for
>> compiler-generated assembly, only for actual *.S files.
>> 
>> Sadly, I don't actually know any way to make gcc call the 'as' phase
>> with
>> particular options. We can use "-Wa,xyzzy" to pass in xyzzy to the
>> assembler, but there is no "-include" option for GNU as afaik.
>> 
>> Can you perhaps define a macro symbol for "--defsym"? Probably not.
>> 
>>  Linus
> 
> I looked at this thing a long time ago; it's not there, and the best would 
> probably be to get global asm() working properly in gcc.

I can add a -Wa,[filename.s] switch. It works, but sort of undocumented.

Re: Revert "dmaengine: pl330: add DMA_PAUSE feature"

2018-05-18 Thread Frank Mori Hess

On Thu, May 17, 2018 at 1:22 PM, Lars-Peter Clausen  wrote:
> On 05/17/2018 06:20 PM, Frank Mori Hess wrote:
> The problem is not so much on the software side but even more so on the
> hardware side. Not all hardware even supports aborting a transfer with no
> data loss because there is no precise measurement of how much data has been
> transferred. The residue that is reported is always the lower bound, this
> much has been transferred but it might actually have been more.

I'd just like to point out, if the pl330 driver actually did report a
upper bound on the residue (lower bound on bytes transferred) that
would also blow up Marek's samsung serial driver use case.  Instead of
being in a situation where data loss might occur rarely due to a race
condition, they would be in a situation where data loss occurs every
time they stop a transfer.  Not that such a residue reporting would be
incorrect though, the pl330 driver doesn't even advertise burst level
granularity with its residue reporting.

RE: [RFC PATCH 09/09] Introduce cache=rdma moutning option

2018-05-18 Thread Long Li

> Subject: Re: [RFC PATCH 09/09] Introduce cache=rdma moutning option
> 
> On Thu, May 17, 2018 at 05:22:14PM -0700, Long Li wrote:
> > From: Long Li 
> >
> > When cache=rdma is enabled on mount options, CIFS do not allocate
> > internal data buffer pages for I/O, data is read/writen directly to user
> memory via RDMA.
> 
> I don't think this should be an option.  For direct I/O without signing or
> encryption CIFS should always use get_user_pages, with or without RDMA.

Yes this should be done for all transport. If there are no objections, I'll 
send patches to change this.

Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn

On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
 From: Eric Dumazet 
 Date: Fri, 18 May 2018 08:30:43 -0700

> We probably need to revert Willem patch 
> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)

 Is it really valid to reach ip_recv_err with an ipv6 socket?
>>>
>>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>>> atomic operation, so that the socket is neither fully ipv4 nor fully
>>> ipv6 by the time it reaches ip_recv_error.
>>>
>>>   sk->sk_socket->ops = &inet_dgram_ops;
>>>   < HERE >
>>>   sk->sk_family = PF_INET;
>>>
>>> Even calling inet_recv_error to demux would not necessarily help.
>>>
>>> Safest would be to look up by skb->protocol, similar to what
>>> ipv6_recv_error does to handle v4-mapped-v6.
>>>
>>> Or to make that function safe with PF_INET and swap the order
>>> of the above two operations.
>>>
>>> All sound needlessly complicated for this rare socket option, but
>>> I don't have a better idea yet. Dropping on the floor is not nice,
>>> either.
>>
>> Ensuring that ip_recv_error correctly handles packets from either
>> socket and removing the warning should indeed be good.
>>
>> It is robust against v4-mapped packets from an AF_INET6 socket,
>> but see caveat on reconnect below.
>>
>> The code between ipv6_recv_error for v4-mapped addresses and
>> ip_recv_error is essentially the same, the main difference being
>> whether to return network headers as sockaddr_in with SOL_IP
>> or sockaddr_in6 with SOL_IPV6.
>>
>> There are very few other locations in the stack that explicitly test
>> sk_family in this way and thus would be vulnerable to races with
>> IPV6_ADDRFORM.
>>
>> I'm not sure whether it is possible for a udpv6 socket to queue a
>> real ipv6 packet on the error queue, disconnect, connect to an
>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>> on a true ipv6 packet. That would return buggy data, e.g., in
>> msg_name.
>
> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
> error queue is empty, and then take its lock for the duration of the
> operation.

Actually, no reason to hold the lock. This setsockopt holds the socket
lock, which connect would need, too. So testing that the queue
is empty after testing that it is connected to a v4 address is
sufficient to ensure that no ipv6 packets are queued for reception.

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 4d780c7f0130..a975d6311341 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
int level, int optname,

if (ipv6_only_sock(sk) ||
!ipv6_addr_v4mapped(&sk->sk_v6_daddr)) {
retv = -EADDRNOTAVAIL;
break;
}

+   if (!skb_queue_empty(&sk->sk_error_queue)) {
+   retv = -EBUSY;
+   break;
+   }
+
fl6_free_socklist(sk);
__ipv6_sock_mc_close(sk);

After this it should be safe to remove the warning in ip_recv_error.

Re: [PATCH] riscv: Fix the bug in memory access fixup code

2018-05-18 Thread Palmer Dabbelt


On Mon, 07 May 2018 19:59:33 PDT (-0700), alan...@andestech.com wrote:

A piece of fixup code is currently shared by __copy_user and
__clear_user.  It first disables the access to user-space memory
and then returns the "n" argument, which represents #(bytes not processed).
However,__copy_user's "n" is in register a2, while __clear_user's in a1,
and thus it causes errors for programs like setdomainname02 testcase in LTP.

This patch fixes this issue by separating their fixup code and returning
the right value for the kernel to handle a relative fault properly.

Signed-off-by: Alan Kao 
Cc: Greentime Hu 
Cc: Zong Li 
Cc: Vincent Chen 
---
 arch/riscv/lib/uaccess.S | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
index 58fb2877c865..0173ea296baa 100644
--- a/arch/riscv/lib/uaccess.S
+++ b/arch/riscv/lib/uaccess.S
@@ -84,7 +84,7 @@ ENTRY(__clear_user)
bgeu t0, t1, 2f
bltu a0, t0, 4f
 1:
-   fixup REG_S, zero, (a0), 10f
+   fixup REG_S, zero, (a0), 11f
addi a0, a0, SZREG
bltu a0, t1, 1b
 2:
@@ -96,12 +96,12 @@ ENTRY(__clear_user)
li a0, 0
ret
 4: /* Edge case: unalignment */
-   fixup sb, zero, (a0), 10f
+   fixup sb, zero, (a0), 11f
addi a0, a0, 1
bltu a0, t0, 4b
j 1b
 5: /* Edge case: remainder */
-   fixup sb, zero, (a0), 10f
+   fixup sb, zero, (a0), 11f
addi a0, a0, 1
bltu a0, a3, 5b
j 3b
@@ -109,9 +109,14 @@ ENDPROC(__clear_user)

.section .fixup,"ax"
.balign 4
+   /* Fixup code for __copy_user(10) and __clear_user(11) */
 10:
/* Disable access to user memory */
csrs sstatus, t6
-   sub a0, a3, a0
+   mv a0, a2
+   ret
+11:
+   csrs sstatus, t6
+   mv a0, a1
ret
.previous


Thanks!

Re: Revert "dmaengine: pl330: add DMA_PAUSE feature"

2018-05-18 Thread Frank Mori Hess

On Fri, May 18, 2018 at 12:03 AM, Vinod  wrote:
>
> You are simply mixing things up!

It certainly feels like I'm mixed up.  If I have to resolve this, I'd
like to be a little less mixed up before I submit more patches which
are going to inevitably result in subtly broken code suddenly becoming
prominently and unignorably broken code.  Unfortunately I get the
impression I'm exhausting your patience to answer my questions, and
I've failed to fully communicate what the question is.

> On Pause we don't expect data loss, as user can
> resume the transfer. This means as you rightly guessed, the DMA HW should not 
> drop
> any data, nor should SW.
>
> Now if you want to read residue at this point it is perfectly valid. But if 
> you
> decide to terminate the channel (yes it is terminate_all API), we abort and 
> don't
> have context to report back!

I understand the residue can't be read after terminate, that's why
reading the residue is step 2 in pause/residue/terminate.  My question
was whether the entire sequence pause/residue/terminate taken as a
whole can or cannot lose data.  Saying that individual steps can or
can't lose data is not enough, context is required.  The key point is
whether pause flushes in-flight data to its destination or not.  If it
does, and our residue is accurate, the terminate cannot cause data
loss.  If pause doesn't flush, an additional step of flush_sync as
Lars suggested is required.  So pause/flush_sync/residue/terminate
would be the safe sequence that cannot lose data.

> As Lars rightly pointed out, residue calculation are very tricky, DMA fifo may
> have data, some data may be in device FIFO, so residue is always from DMA 
> point
> of view and may differ from device view (more or less depending upon 
> direction)
>
> Now if you require to add more features for your usecase, please do feel free 
> to
> send a patch. The framework can always be improved, we haven't solved world
> hunger yet!

World hunger?  I don't see how asking questions about a dma engine's
data integrity guarantees is either unreasonable or out of scope.

-- 
Frank

[PATCH v2] schedutil: Allow cpufreq requests to be made even when kthread kicked

2018-05-18 Thread Joel Fernandes (Google.)

From: "Joel Fernandes (Google)" 

Currently there is a chance of a schedutil cpufreq update request to be
dropped if there is a pending update request. This pending request can
be delayed if there is a scheduling delay of the irq_work and the wake
up of the schedutil governor kthread.

A very bad scenario is when a schedutil request was already just made,
such as to reduce the CPU frequency, then a newer request to increase
CPU frequency (even sched deadline urgent frequency increase requests)
can be dropped, even though the rate limits suggest that its Ok to
process a request. This is because of the way the work_in_progress flag
is used.

This patch improves the situation by allowing new requests to happen
even though the old one is still being processed. Note that in this
approach, if an irq_work was already issued, we just update next_freq
and don't bother to queue another request so there's no extra work being
done to make this happen.

I had brought up this issue at the OSPM conference and Claudio had a
discussion RFC with an alternate approach [1]. I prefer the approach as
done in the patch below since it doesn't need any new flags and doesn't
cause any other extra overhead.

[1] https://patchwork.kernel.org/patch/10384261/

LGTMed-by: Viresh Kumar 
LGTMed-by: Juri Lelli 
CC: Viresh Kumar 
CC: Rafael J. Wysocki 
CC: Peter Zijlstra 
CC: Ingo Molnar 
CC: Patrick Bellasi 
CC: Juri Lelli 
Cc: Luca Abeni 
CC: Joel Fernandes 
CC: Todd Kjos 
CC: clau...@evidence.eu.com
CC: kernel-t...@android.com
CC: linux...@vger.kernel.org
Signed-off-by: Joel Fernandes (Google) 
---
v1 -> v2: Minor style related changes.

 kernel/sched/cpufreq_schedutil.c | 34 
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index e13df951aca7..5c482ec38610 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy 
*sg_policy, u64 time)
!cpufreq_can_do_remote_dvfs(sg_policy->policy))
return false;
 
-   if (sg_policy->work_in_progress)
-   return false;
-
if (unlikely(sg_policy->need_freq_update)) {
sg_policy->need_freq_update = false;
/*
@@ -128,7 +125,7 @@ static void sugov_update_commit(struct sugov_policy 
*sg_policy, u64 time,
 
policy->cur = next_freq;
trace_cpu_frequency(next_freq, smp_processor_id());
-   } else {
+   } else if (!sg_policy->work_in_progress) {
sg_policy->work_in_progress = true;
irq_work_queue(&sg_policy->irq_work);
}
@@ -291,6 +288,13 @@ static void sugov_update_single(struct update_util_data 
*hook, u64 time,
 
ignore_dl_rate_limit(sg_cpu, sg_policy);
 
+   /*
+* For slow-switch systems, single policy requests can't run at the
+* moment if update is in progress, unless we acquire update_lock.
+*/
+   if (sg_policy->work_in_progress)
+   return;
+
if (!sugov_should_update_freq(sg_policy, time))
return;
 
@@ -382,13 +386,27 @@ sugov_update_shared(struct update_util_data *hook, u64 
time, unsigned int flags)
 static void sugov_work(struct kthread_work *work)
 {
struct sugov_policy *sg_policy = container_of(work, struct 
sugov_policy, work);
+   unsigned int freq;
+   unsigned long flags;
+
+   /*
+* Hold sg_policy->update_lock shortly to handle the case where:
+* incase sg_policy->next_freq is read here, and then updated by
+* sugov_update_shared just before work_in_progress is set to false
+* here, we may miss queueing the new update.
+*
+* Note: If a work was queued after the update_lock is released,
+* sugov_work will just be called again by kthread_work code; and the
+* request will be proceed before the sugov thread sleeps.
+*/
+   raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
+   freq = sg_policy->next_freq;
+   sg_policy->work_in_progress = false;
+   raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
 
mutex_lock(&sg_policy->work_lock);
-   __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
-   CPUFREQ_RELATION_L);
+   __cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
mutex_unlock(&sg_policy->work_lock);
-
-   sg_policy->work_in_progress = false;
 }
 
 static void sugov_irq_work(struct irq_work *irq_work)
-- 
2.17.0.441.gb46fe60e1d-goog

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread hpa

On May 18, 2018 11:50:12 AM PDT, Linus Torvalds  
wrote:
>On Fri, May 18, 2018 at 11:34 AM  wrote:
>
>> On May 18, 2018 11:25:32 AM PDT, Linus Torvalds <
>torva...@linux-foundation.org> wrote:
>
>> Unfortunately gcc doesn't guarantee that global assembly inlines will
>appear at the top of the file.
>
>Yeah. It really would be better to do the "asm version of -inline".
>
>We already do something like that for real *.S files on some
>architectures
>(because their assembly really wants it, eg
>
>  arch/arm/Makefile:
>KBUILD_AFLAGS +=$(CFLAGS_ABI) $(AFLAGS_ISA) $(arch-y) $(tune-y)
>-include
>asm/unified.h -msoft-float
>
>but I do want to point out that KBUILD_AFLAGS is *not* used for
>compiler-generated assembly, only for actual *.S files.
>
>Sadly, I don't actually know any way to make gcc call the 'as' phase
>with
>particular options. We can use "-Wa,xyzzy" to pass in xyzzy to the
>assembler, but there is no "-include" option for GNU as afaik.
>
>Can you perhaps define a macro symbol for "--defsym"? Probably not.
>
>   Linus

I looked at this thing a long time ago; it's not there, and the best would 
probably be to get global asm() working properly in gcc.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Linus Torvalds

On Fri, May 18, 2018 at 11:34 AM  wrote:

> On May 18, 2018 11:25:32 AM PDT, Linus Torvalds <
torva...@linux-foundation.org> wrote:

> Unfortunately gcc doesn't guarantee that global assembly inlines will
appear at the top of the file.

Yeah. It really would be better to do the "asm version of -inline".

We already do something like that for real *.S files on some architectures
(because their assembly really wants it, eg

  arch/arm/Makefile:
KBUILD_AFLAGS +=$(CFLAGS_ABI) $(AFLAGS_ISA) $(arch-y) $(tune-y) -include
asm/unified.h -msoft-float

but I do want to point out that KBUILD_AFLAGS is *not* used for
compiler-generated assembly, only for actual *.S files.

Sadly, I don't actually know any way to make gcc call the 'as' phase with
particular options. We can use "-Wa,xyzzy" to pass in xyzzy to the
assembler, but there is no "-include" option for GNU as afaik.

Can you perhaps define a macro symbol for "--defsym"? Probably not.

   Linus

[PULL REQUEST] i2c for 4.17

2018-05-18 Thread Wolfram Sang

Linus,

I2C has a bunch of driver bugfixes and a MAINTAINERS addition.

Please pull.

Thanks,

   Wolfram


The following changes since commit 75bc37fefc4471e718ba8e651aa74673d4e0a9eb:

  Linux 4.17-rc4 (2018-05-06 16:57:38 -1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git 
i2c/for-current-fixed

for you to fetch changes up to 22aac3eb0c465dd9ea7f06ee1ed8ad933890f2a3:

  MAINTAINERS: add entry for STM32 I2C driver (2018-05-17 16:02:19 +0200)


Alexander Monakov (1):
  i2c: designware: fix poll-after-enable regression

Bartosz Golaszewski (1):
  eeprom: at24: fix retrieving the at24_chip_data structure

Hans de Goede (2):
  i2c: core: ACPI: Improve OpRegion read errors
  i2c: core: ACPI: Log device not acking errors at dbg loglevel

Peter Rosin (3):
  i2c: pmcmsp: return message count on master_xfer success
  i2c: pmcmsp: fix error return from master_xfer
  i2c: viperboard: return message count on master_xfer success

Pierre-Yves MORDRET (1):
  MAINTAINERS: add entry for STM32 I2C driver

Wolfram Sang (1):
  Merge tag 'at24-4.17-rc5-fixes-for-wolfram' of 
git://git.kernel.org/.../brgl/linux into i2c/for-current


with much appreciated quality assurance from

Ben Gardner (1):
  (Test) i2c: designware: fix poll-after-enable regression

Mika Westerberg (2):
  (Rev.) i2c: core: ACPI: Log device not acking errors at dbg loglevel
  (Rev.) i2c: core: ACPI: Improve OpRegion read errors

 MAINTAINERS|  6 ++
 drivers/i2c/busses/i2c-designware-master.c |  5 -
 drivers/i2c/busses/i2c-pmcmsp.c|  4 ++--
 drivers/i2c/busses/i2c-viperboard.c|  2 +-
 drivers/i2c/i2c-core-acpi.c| 13 ++---
 drivers/misc/eeprom/at24.c |  2 +-
 6 files changed, 24 insertions(+), 8 deletions(-)


signature.asc
Description: PGP signature

Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn

On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
>>> From: Eric Dumazet 
>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>
 We probably need to revert Willem patch 
 (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>
>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>
>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>> atomic operation, so that the socket is neither fully ipv4 nor fully
>> ipv6 by the time it reaches ip_recv_error.
>>
>>   sk->sk_socket->ops = &inet_dgram_ops;
>>   < HERE >
>>   sk->sk_family = PF_INET;
>>
>> Even calling inet_recv_error to demux would not necessarily help.
>>
>> Safest would be to look up by skb->protocol, similar to what
>> ipv6_recv_error does to handle v4-mapped-v6.
>>
>> Or to make that function safe with PF_INET and swap the order
>> of the above two operations.
>>
>> All sound needlessly complicated for this rare socket option, but
>> I don't have a better idea yet. Dropping on the floor is not nice,
>> either.
>
> Ensuring that ip_recv_error correctly handles packets from either
> socket and removing the warning should indeed be good.
>
> It is robust against v4-mapped packets from an AF_INET6 socket,
> but see caveat on reconnect below.
>
> The code between ipv6_recv_error for v4-mapped addresses and
> ip_recv_error is essentially the same, the main difference being
> whether to return network headers as sockaddr_in with SOL_IP
> or sockaddr_in6 with SOL_IPV6.
>
> There are very few other locations in the stack that explicitly test
> sk_family in this way and thus would be vulnerable to races with
> IPV6_ADDRFORM.
>
> I'm not sure whether it is possible for a udpv6 socket to queue a
> real ipv6 packet on the error queue, disconnect, connect to an
> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
> on a true ipv6 packet. That would return buggy data, e.g., in
> msg_name.

In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
error queue is empty, and then take its lock for the duration of the
operation.

Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn

On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
>> From: Eric Dumazet 
>> Date: Fri, 18 May 2018 08:30:43 -0700
>>
>>> We probably need to revert Willem patch 
>>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>
>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>
> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
> atomic operation, so that the socket is neither fully ipv4 nor fully
> ipv6 by the time it reaches ip_recv_error.
>
>   sk->sk_socket->ops = &inet_dgram_ops;
>   < HERE >
>   sk->sk_family = PF_INET;
>
> Even calling inet_recv_error to demux would not necessarily help.
>
> Safest would be to look up by skb->protocol, similar to what
> ipv6_recv_error does to handle v4-mapped-v6.
>
> Or to make that function safe with PF_INET and swap the order
> of the above two operations.
>
> All sound needlessly complicated for this rare socket option, but
> I don't have a better idea yet. Dropping on the floor is not nice,
> either.

Ensuring that ip_recv_error correctly handles packets from either
socket and removing the warning should indeed be good.

It is robust against v4-mapped packets from an AF_INET6 socket,
but see caveat on reconnect below.

The code between ipv6_recv_error for v4-mapped addresses and
ip_recv_error is essentially the same, the main difference being
whether to return network headers as sockaddr_in with SOL_IP
or sockaddr_in6 with SOL_IPV6.

There are very few other locations in the stack that explicitly test
sk_family in this way and thus would be vulnerable to races with
IPV6_ADDRFORM.

I'm not sure whether it is possible for a udpv6 socket to queue a
real ipv6 packet on the error queue, disconnect, connect to an
ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
on a true ipv6 packet. That would return buggy data, e.g., in
msg_name.

Re: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more

2018-05-18 Thread Palmer Dabbelt

On Sun, 06 May 2018 07:57:27 PDT (-0700), mi...@kernel.org wrote:
>
> * Andrea Parri  wrote:
>
>> Hi Ingo,
>>
>> > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
>> > From: Ingo Molnar 
>> > Date: Sat, 5 May 2018 10:23:23 +0200
>> > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h 
>> > some more
>> >
>> > Before:
>> >
>> >  #ifndef atomic_fetch_dec_relaxed
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(v) atomic_fetch_sub(1, (v))
>> >  #  define atomic_fetch_dec_relaxed(v) 
>> > atomic_fetch_sub_relaxed(1, (v))
>> >  #  define atomic_fetch_dec_acquire(v) 
>> > atomic_fetch_sub_acquire(1, (v))
>> >  #  define atomic_fetch_dec_release(v) 
>> > atomic_fetch_sub_release(1, (v))
>> >  # else
>> >  #  define atomic_fetch_dec_relaxedatomic_fetch_dec
>> >  #  define atomic_fetch_dec_acquireatomic_fetch_dec
>> >  #  define atomic_fetch_dec_releaseatomic_fetch_dec
>> >  # endif
>> >  #else
>> >  # ifndef atomic_fetch_dec_acquire
>> >  #  define atomic_fetch_dec_acquire(...)   
>> > __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  # ifndef atomic_fetch_dec_release
>> >  #  define atomic_fetch_dec_release(...)   
>> > __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(...)   
>> > __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  #endif
>> >
>> > After:
>> >
>> >  #ifndef atomic_fetch_dec_relaxed
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(v) atomic_fetch_sub(1, (v))
>> >  #  define atomic_fetch_dec_relaxed(v) 
>> > atomic_fetch_sub_relaxed(1, (v))
>> >  #  define atomic_fetch_dec_acquire(v) 
>> > atomic_fetch_sub_acquire(1, (v))
>> >  #  define atomic_fetch_dec_release(v) 
>> > atomic_fetch_sub_release(1, (v))
>> >  # else
>> >  #  define atomic_fetch_dec_relaxedatomic_fetch_dec
>> >  #  define atomic_fetch_dec_acquireatomic_fetch_dec
>> >  #  define atomic_fetch_dec_releaseatomic_fetch_dec
>> >  # endif
>> >  #else
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(...)   
>> > __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>> >  #  define atomic_fetch_dec_acquire(...)   
>> > __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>> >  #  define atomic_fetch_dec_release(...)   
>> > __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  #endif
>> >
>> > The idea is that because we already group these APIs by certain defines
>> > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
>> > branches - we can do the same in the secondary branch as well.
>> >
>> > ( Also remove some unnecessarily duplicate comments, as the API
>> >   group defines are now pretty much self-documenting. )
>> >
>> > No change in functionality.
>> >
>> > Cc: Peter Zijlstra 
>> > Cc: Linus Torvalds 
>> > Cc: Andrew Morton 
>> > Cc: Thomas Gleixner 
>> > Cc: Paul E. McKenney 
>> > Cc: Will Deacon 
>> > Cc: linux-kernel@vger.kernel.org
>> > Signed-off-by: Ingo Molnar 
>>
>> This breaks compilation on RISC-V. (For some of its atomics, the arch
>> currently defines the _relaxed and the full variants and it relies on
>> the generic definitions for the _acquire and the _release variants.)
>
> I don't have cross-compilation for RISC-V, which is a relatively new arch.
> (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?)

Arnd added RISC-V to the cross compiler list a month or two ago when he updated 
them all.  I use the "make.cross" script from the Intel test robot, which will 
fetch the cross compilers for you.  It looks like I made a Git Hub pull request 
to update the script for RISC-V, it fetches from kernel.org


https://github.com/palmer-dabbelt/lkp-tests/blob/e14f4236ccd0572f4b87ffd480fecefee412dedc/sbin/make.cross
http://cdn.kernel.org/pub/tools/crosstool/files/bin/

http://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/7.3.0/x86_64-gcc-7.3.0-nolibc_riscv64-linux.tar.gz

> Could you please send a patch that defines those variants against Linus's 
> tree,
> like the PowerPC patch that does something similar:
>
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to 
> asm/cmpxchg.h and define the full set of cmpxchg APIs
>
> ?
>
> ... and I'll integrate it into the proper place to make it all bisectable, 
> etc.

Sorry, I got buried in email again.  Did this get merged, or is there a current 
version of the patch set I should look at?

Tasks RCU vs Preempt RCU

2018-05-18 Thread Joel Fernandes

Hi,

I was thinking about tasks-RCU and why its needed. Since preempt-RCU allows
tasks to be preempted in read-sections, can we not just reuse that mechanism
for the trampolines since we track all preempted tasks so we would wait on
all tasks preempted within a trampoline?

I am trying to understand what will _not_ work if we did that.. I'm guessing
the answer is that that would mean the trampoline has to be wrapped with
rcu_read_{lock,unlock} which may add some overhead, but please let me know
if I'm missing something else..

The advantage I guess is possible elimination of an RCU variant, and also
possibly eliminating the tasks RCU thread that monitors.. Anyway I was
thinking more in terms of the effort of reduction of the RCU flavors etc and
reducing complexity ideas.

thanks!

- Joel

Re: seq_putc is preferable to seq_puts

2018-05-18 Thread Joe Perches

On Fri, 2018-05-18 at 10:50 -0700, Joe Perches wrote:
> and a trivial script to convert them all treewide
> 
> $ git grep -P --name-only 
> '\b(?:seq_puts|seq_printf)\s*\(\s*[^,]+,\s*"(?:\\.|.)"\s*\)\s*;' | \
>   xargs perl -i -e 'local $/; while (<>) { 
> s/\b(?:seq_puts|seq_printf)\s*\(\s*([^,]+),\s*"(\\.|.)"\s*\)\s*;/seq_putc(\1, 
> '"'\2'"')/g; print;}'

Hmm, pity this was missing a trailing semicolon in the substitution

$ git grep -P --name-only 
'\b(?:seq_puts|seq_printf)\s*\(\s*[^,]+,\s*"(?:\\.|.)"\s*\)\s*;' | \
  xargs perl -i -e 'local $/; while (<>) { 
s/\b(?:seq_puts|seq_printf)\s*\(\s*([^,]+),\s*"(\\.|.)"\s*\)\s*;/seq_putc(\1, 
'"'\2'"');/g; print;}'

Re: [PATCH v11 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-18 Thread hpa

On May 18, 2018 11:00:05 AM PDT, Bart Van Assche  wrote:
>The next patch in this series introduces a call to cmpxchg64()
>in the block layer core for those architectures on which this
>functionality is available. Make it possible to test whether
>cmpxchg64() is available by introducing CONFIG_ARCH_HAVE_CMPXCHG64.
>
>Signed-off-by: Bart Van Assche 
>Cc: Catalin Marinas 
>Cc: Will Deacon 
>Cc: Tony Luck 
>Cc: Fenghua Yu 
>Cc: Geert Uytterhoeven 
>Cc: "James E.J. Bottomley" 
>Cc: Helge Deller 
>Cc: Benjamin Herrenschmidt 
>Cc: Paul Mackerras 
>Cc: Michael Ellerman 
>Cc: Martin Schwidefsky 
>Cc: Heiko Carstens 
>Cc: David S. Miller 
>Cc: Thomas Gleixner 
>Cc: Ingo Molnar 
>Cc: H. Peter Anvin 
>Cc: Chris Zankel 
>Cc: Max Filippov 
>Cc: Arnd Bergmann 
>Cc: Jonathan Corbet 
>---
>.../features/locking/cmpxchg64/arch-support.txt| 33
>++
> arch/Kconfig   |  4 +++
> arch/arm/Kconfig   |  1 +
> arch/ia64/Kconfig  |  1 +
> arch/m68k/Kconfig  |  1 +
> arch/mips/Kconfig  |  1 +
> arch/parisc/Kconfig|  1 +
> arch/riscv/Kconfig |  1 +
> arch/sparc/Kconfig |  1 +
> arch/x86/Kconfig   |  1 +
> arch/xtensa/Kconfig|  1 +
> 11 files changed, 46 insertions(+)
>create mode 100644
>Documentation/features/locking/cmpxchg64/arch-support.txt
>
>diff --git a/Documentation/features/locking/cmpxchg64/arch-support.txt
>b/Documentation/features/locking/cmpxchg64/arch-support.txt
>new file mode 100644
>index ..84bfef7242b2
>--- /dev/null
>+++ b/Documentation/features/locking/cmpxchg64/arch-support.txt
>@@ -0,0 +1,33 @@
>+#
>+# Feature name:  cmpxchg64
>+# Kconfig:   ARCH_HAVE_CMPXCHG64
>+# description:   arch supports the cmpxchg64() API
>+#
>+---
>+| arch |status|
>+---
>+|   alpha: |  ok  |
>+| arc: |  ..  |
>+| arm: |  ok  |
>+|   arm64: |  ok  |
>+| c6x: |  ..  |
>+|   h8300: |  ..  |
>+| hexagon: |  ..  |
>+|ia64: |  ok  |
>+|m68k: |  ok  |
>+|  microblaze: |  ..  |
>+|mips: |  ok  |
>+|   nds32: |  ..  |
>+|   nios2: |  ..  |
>+|openrisc: |  ..  |
>+|  parisc: |  ok  |
>+| powerpc: |  ok  |
>+|   riscv: |  ok  |
>+|s390: |  ok  |
>+|  sh: |  ..  |
>+|   sparc: |  ok  |
>+|  um: |  ..  |
>+|   unicore32: |  ..  |
>+| x86: |  ok  |
>+|  xtensa: |  ok  |
>+---
>diff --git a/arch/Kconfig b/arch/Kconfig
>index 8e0d665c8d53..9840b2577af1 100644
>--- a/arch/Kconfig
>+++ b/arch/Kconfig
>@@ -358,6 +358,10 @@ config HAVE_ALIGNED_STRUCT_PAGE
> on a struct page for better performance. However selecting this
> might increase the size of a struct page by a word.
> 
>+config ARCH_HAVE_CMPXCHG64
>+  bool
>+  default y if 64BIT
>+
> config HAVE_CMPXCHG_LOCAL
>   bool
> 
>diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>index a7f8e7f4b88f..02c75697176e 100644
>--- a/arch/arm/Kconfig
>+++ b/arch/arm/Kconfig
>@@ -13,6 +13,7 @@ config ARM
>   select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
>   select ARCH_HAS_STRICT_MODULE_RWX if MMU
>   select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>+  select ARCH_HAVE_CMPXCHG64 if !THUMB2_KERNEL
>   select ARCH_HAVE_CUSTOM_GPIO_H
>   select ARCH_HAS_GCOV_PROFILE_ALL
>   select ARCH_MIGHT_HAVE_PC_PARPORT
>diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
>index bbe12a038d21..31c49e1482e2 100644
>--- a/arch/ia64/Kconfig
>+++ b/arch/ia64/Kconfig
>@@ -41,6 +41,7 @@ config IA64
>   select GENERIC_PENDING_IRQ if SMP
>   select GENERIC_IRQ_SHOW
>   select GENERIC_IRQ_LEGACY
>+  select ARCH_HAVE_CMPXCHG64
>   select ARCH_HAVE_NMI_SAFE_CMPXCHG
>   select GENERIC_IOMAP
>   select GENERIC_SMP_IDLE_THREAD
>diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
>index 785612b576f7..7b87cda3bbed 100644
>--- a/arch/m68k/Kconfig
>+++ b/arch/m68k/Kconfig
>@@ -11,6 +11,7 @@ config M68K
>   select GENERIC_ATOMIC64
>   select HAVE_UID16
>   select VIRT_TO_BUS
>+  select ARCH_HAVE_CMPXCHG64
>   select ARCH_HAVE_NMI_SAFE_CMPXCHG if RMW_INSNS
>   select GENERIC_CPU_DEVICES
>   select GENERIC_IOMAP
>diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
>index 225c95da23ce..088bca0fd9f2 100644
>--- a/arch/mips/Kconfig
>+++ b/arch/mips/Kconfig
>@@ -7,6 +7,7 @@ config MIPS
>   select ARCH_DISCARD_MEMBLOCK
>   select ARCH_HAS_ELF_RANDOMIZE
>   select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>+  select

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread hpa

On May 18, 2018 11:25:32 AM PDT, Linus Torvalds  
wrote:
>On Fri, May 18, 2018 at 10:24 AM Nadav Amit  wrote:
>
>> Will it be ok just to use a global inline asm to set an “.include”
>directive
>> that gas would later process? (I can probably wrap it in a C macro so
>it
>> won’t be too disgusting)
>
>Maybe. I'd almost prefer it to be the automatic kind of thing that we
>already do for C files using "-include".
>
>  Linus

Unfortunately gcc doesn't guarantee that global assembly inlines will appear at 
the top of the file.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

[PATCH] arm64: dts: stingray: use NUM_SATA to configure number of sata ports

2018-05-18 Thread Scott Branden

Move remaining sata configuration to stingray-sata.dtsi and enable
ports based on NUM_SATA defined.
Now, all that needs to be done is define NUM_SATA per board.

Signed-off-by: Scott Branden 
---
 .../boot/dts/broadcom/stingray/bcm958742-base.dtsi | 64 
 .../boot/dts/broadcom/stingray/bcm958742k.dts  |  2 +
 .../boot/dts/broadcom/stingray/bcm958742t.dts  |  2 +
 .../boot/dts/broadcom/stingray/stingray-sata.dtsi  | 68 ++
 4 files changed, 72 insertions(+), 64 deletions(-)

diff --git a/arch/arm64/boot/dts/broadcom/stingray/bcm958742-base.dtsi 
b/arch/arm64/boot/dts/broadcom/stingray/bcm958742-base.dtsi
index 8862ec9..cacc25e 100644
--- a/arch/arm64/boot/dts/broadcom/stingray/bcm958742-base.dtsi
+++ b/arch/arm64/boot/dts/broadcom/stingray/bcm958742-base.dtsi
@@ -72,70 +72,6 @@
  <0x0008 0x8000 0x1 0x8000>; /* 6G @ 34G */
 };
 
-&sata0 {
-   status = "okay";
-};
-
-&sata_phy0{
-   status = "okay";
-};
-
-&sata1 {
-   status = "okay";
-};
-
-&sata_phy1{
-   status = "okay";
-};
-
-&sata2 {
-   status = "okay";
-};
-
-&sata_phy2{
-   status = "okay";
-};
-
-&sata3 {
-   status = "okay";
-};
-
-&sata_phy3{
-   status = "okay";
-};
-
-&sata4 {
-   status = "okay";
-};
-
-&sata_phy4{
-   status = "okay";
-};
-
-&sata5 {
-   status = "okay";
-};
-
-&sata_phy5{
-   status = "okay";
-};
-
-&sata6 {
-   status = "okay";
-};
-
-&sata_phy6{
-   status = "okay";
-};
-
-&sata7 {
-   status = "okay";
-};
-
-&sata_phy7{
-   status = "okay";
-};
-
 &mdio_mux_iproc {
mdio@10 {
gphy0: eth-phy@10 {
diff --git a/arch/arm64/boot/dts/broadcom/stingray/bcm958742k.dts 
b/arch/arm64/boot/dts/broadcom/stingray/bcm958742k.dts
index 77efa28..a515346 100644
--- a/arch/arm64/boot/dts/broadcom/stingray/bcm958742k.dts
+++ b/arch/arm64/boot/dts/broadcom/stingray/bcm958742k.dts
@@ -32,6 +32,8 @@
 
 /dts-v1/;
 
+#define NUM_SATA   8
+
 #include "bcm958742-base.dtsi"
 
 / {
diff --git a/arch/arm64/boot/dts/broadcom/stingray/bcm958742t.dts 
b/arch/arm64/boot/dts/broadcom/stingray/bcm958742t.dts
index 5084b03..6a4d19e 100644
--- a/arch/arm64/boot/dts/broadcom/stingray/bcm958742t.dts
+++ b/arch/arm64/boot/dts/broadcom/stingray/bcm958742t.dts
@@ -32,6 +32,8 @@
 
 /dts-v1/;
 
+#define NUM_SATA   8
+
 #include "bcm958742-base.dtsi"
 
 / {
diff --git a/arch/arm64/boot/dts/broadcom/stingray/stingray-sata.dtsi 
b/arch/arm64/boot/dts/broadcom/stingray/stingray-sata.dtsi
index 8c68e0c..7f6d176 100644
--- a/arch/arm64/boot/dts/broadcom/stingray/stingray-sata.dtsi
+++ b/arch/arm64/boot/dts/broadcom/stingray/stingray-sata.dtsi
@@ -43,7 +43,11 @@
interrupts = ;
#address-cells = <1>;
#size-cells = <0>;
+#if (NUM_SATA > 0)
+   status = "okay";
+#else
status = "disabled";
+#endif
 
sata0_port0: sata-port@0 {
reg = <0>;
@@ -58,7 +62,11 @@
reg-names = "phy";
#address-cells = <1>;
#size-cells = <0>;
+#if (NUM_SATA > 0)
+   status = "okay";
+#else
status = "disabled";
+#endif
 
sata0_phy0: sata-phy@0 {
reg = <0>;
@@ -73,7 +81,11 @@
interrupts = ;
#address-cells = <1>;
#size-cells = <0>;
+#if (NUM_SATA > 1)
+   status = "okay";
+#else
status = "disabled";
+#endif
 
sata1_port0: sata-port@0 {
reg = <0>;
@@ -88,7 +100,11 @@
reg-names = "phy";
#address-cells = <1>;
#size-cells = <0>;
+#if (NUM_SATA > 1)
+   status = "okay";
+#else
status = "disabled";
+#endif
 
sata1_phy0: sata-phy@0 {
reg = <0>;
@@ -103,7 +119,11 @@
interrupts = ;
#address-cells = <1>;
#size-cells = <0>;
+#if (NUM_SATA > 2)
+   status = "okay";
+#else
status = "disabled";
+#endif
 
sata2_port0: sata-port@0 {
reg = <0>;
@@ -118,7 +138,11 @@
reg-names = "phy";
#address-cells = <1>;
#size-cells = <0>;
+#if (NUM_SATA > 2)
+   status = "okay";
+#else
status = "disabled";
+#endif
 
sata2_phy0: sata-phy@0 {
reg = <0>;
@@ -133,7 +157,11 @@
interrupts = ;
#address-cells = <1>;

Re: [PATCH] arm64: kvm: use -fno-jump-tables with clang

2018-05-18 Thread Nick Desaulniers

On Fri, May 18, 2018 at 11:13 AM Marc Zyngier  wrote:
> What I'd really like is to apply that patch knowing that:

> - you have checked that with a released version of the compiler, you
> don't observe any absolute address in any of the objects that are going
> to be executed at EL2 on a mainline kernel,

To verify, we should disassemble objects from arch/arm64/kvm/hyp/*.o and
make sure we don't see absolute addresses?  I can work with Sami to get a
sense of what the before and after of this patch looks like in disassembly,
then verify those changes are pervasive.

> - you have successfully run guests with a mainline kernel,

I believe Andrey has already done this.  If he can verify (maybe during
working hours next week), then maybe we can add his Tested-by to this
patches commit message?

> - it works for a reasonable set of common kernel configurations
> (defconfig and some of the most useful debug options),

It's easy for us to test our kernel configs for Android, ChromeOS, and
defconfig.  I'd be curious to know the shortlist of "most useful debug
options" just to be a better kernel developer, personally.

> - I can reproduce your findings with the same released compiler.

Lets wait for Andrey to confirm his test setup.  On the Android side, I
think you should be able to get by with a released version, but I'd be
curious to hear from Andrey.

> Is that the case? I don't think any of the above is completely outlandish.

These are all reasonable. Thanks for the feedback.
-- 
Thanks,
~Nick Desaulniers

Re: [PATCH 00/10] RFC: assorted bcachefs patches

2018-05-18 Thread Kent Overstreet

On Fri, May 18, 2018 at 02:03:25PM -0400, Josef Bacik wrote:
> There's nothing stopping us from doing that, it just uses a kprobe to override
> the function with our helper, so we could conceivably put it anywhere in the
> function.  The reason I limited it to individual functions was because it was
> easier than trying to figure out the side-effects of stopping mid-function.  
> If
> I needed to fail mid-function I just added a helper where I needed it and 
> failed
> that instead.  I imagine safety is going to be of larger concern if we allow 
> bpf
> scripts to randomly return anywhere inside a function, even if the function is
> marked as allowing error injection.  Thanks,

Ahh no, that's not what I want... here's an example:

https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/btree_cache.c#n674

Here we've got to do this thing which can race - which is fine, we just need to
check for and handle the race, on line 709 - but actually exercising that with a
test is difficult since it requires a heavily multithreaded workload with btree
nodes getting evicted to see it happen, so - it pretends the race happened if
race_fault() returns true. The race_fault() invocation shows up in debugfs,
where userspace can tell it to fire.

the way it works is dynamic_fault() is a macro that expands to a static struct
dfault_descriptor, stuck in a particular linker section so the dynamic fault
code can find them and stick them in debugfs (which is also the way dynamic
debug works).

#define dynamic_fault(_class)   \
({  \
static struct _dfault descriptor\
__used __aligned(8) __attribute__((section("__faults"))) = {\
.modname= KBUILD_MODNAME,   \
.function   = __func__, \
.filename   = __FILE__, \
.line   = __LINE__, \
.class  = _class,   \
};  \
\
static_key_false(&descriptor.enabled) &&\
__dynamic_fault_enabled(&descriptor);   \
})

Honestly it still seems like the cleanest and safest way of doing it to me...

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Linus Torvalds

On Fri, May 18, 2018 at 10:24 AM Nadav Amit  wrote:

> Will it be ok just to use a global inline asm to set an “.include”
directive
> that gas would later process? (I can probably wrap it in a C macro so it
> won’t be too disgusting)

Maybe. I'd almost prefer it to be the automatic kind of thing that we
already do for C files using "-include".

  Linus

Re: [PATCH] perf tests: Fix regex for record+probe_libc_inet_pton.sh

2018-05-18 Thread Arnaldo Carvalho de Melo

Em Fri, May 18, 2018 at 11:14:45AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, May 18, 2018 at 01:09:48PM +0200, Thomas-Mich Richter escreveu:
> > On 05/18/2018 12:29 PM, Sandipan Das wrote:
> > > Can you please apply these two patches as well and then re-test?
> 
> > > [1] https://lkml.org/lkml/2018/5/17/112
> > > [2] https://lkml.org/lkml/2018/5/17/113
> 
> > Ahhh, yes that helped. Must have missed it. Thanks for the pointer.
> 
> Cool, can I take that as a Tested-by: Thomas?

Actually I'll fold this patch into the patch that made this test break,
so that we keep the tree bisectable wrt that specific 'perf test' entry.

- Arnaldo

[PATCH] selftests: bpf: config: enable NET_SCH_INGRESS for xdp_meta.sh

2018-05-18 Thread Anders Roxell

When running bpf's selftest test_xdp_meta.sh it fails:
./test_xdp_meta.sh
Error: Specified qdisc not found.
selftests: test_xdp_meta [FAILED]

Need to enable CONFIG_NET_SCH_INGRESS and CONFIG_NET_CLS_ACT to get the
test to pass.

Fixes: 22c8852624fc ("bpf: improve selftests and add tests for meta pointer")
Signed-off-by: Anders Roxell 
---
 tools/testing/selftests/bpf/config | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/bpf/config 
b/tools/testing/selftests/bpf/config
index 983dd25d49f4..1eefe211a4a8 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -5,3 +5,5 @@ CONFIG_BPF_EVENTS=y
 CONFIG_TEST_BPF=m
 CONFIG_CGROUP_BPF=y
 CONFIG_NETDEVSIM=m
+CONFIG_NET_CLS_ACT=y
+CONFIG_NET_SCH_INGRESS=y
-- 
2.17.0

Re: [PATCH] crypto: nx: fix spelling mistake: "seqeunce" -> "sequence"

2018-05-18 Thread Herbert Xu

On Wed, May 09, 2018 at 10:16:36AM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in CSB_ERR error message text
> 
> Signed-off-by: Colin Ian King 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH] crypto: reorder paes test lexicographically

2018-05-18 Thread Herbert Xu

On Fri, May 11, 2018 at 09:04:06AM +0100, Gilad Ben-Yossef wrote:
> Due to a snafu "paes" testmgr tests were not ordered
> lexicographically, which led to boot time warnings.
> Reorder the tests as needed.
> 
> Fixes: a794d8d ("crypto: ccree - enable support for hardware keys")
> Reported-by: Abdul Haleem 
> Signed-off-by: Gilad Ben-Yossef 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH] hwrng: n2: fix spelling mistake: "restesting" -> "retesting"

2018-05-18 Thread Herbert Xu

Colin King  wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in dev_err error message
> 
> Signed-off-by: Colin Ian King 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: Sometimes unusable i2c-hid devices in 4.17-rcX

2018-05-18 Thread Benjamin Tissoires

On Thu, May 17, 2018 at 4:44 PM,   wrote:
>> -Original Message-
>> From: Benjamin Tissoires [mailto:benjamin.tissoi...@gmail.com]
>> Sent: Thursday, May 17, 2018 9:28 AM
>> To: Limonciello, Mario
>> Cc: linux-input; linux-kernel@vger.kernel.org
>> Subject: Re: Sometimes unusable i2c-hid devices in 4.17-rcX
>>
>> Hi Mario,
>>
>> On Wed, May 16, 2018 at 10:00 PM,   wrote:
>> > Hi All,
>> >
>> > I've been running 4.16-rc7 on an XPS 9365 for some time and recently moved 
>> > up
>> to 4.17-rc5.
>> > Immediately I noticed that i2c-hid devices (both touchscreen and touchpad) 
>> > were
>> not working.
>> > Also when shutting the system down or rebooting it would just hang. (magic 
>> > sysrq
>> still worked).
>> >
>> > I figured it was an easy to identify regression so I started a bisect but 
>> > it came up
>> with garbage
>> > that ended in selftests shortly after 4.17-rc2.  I realized that's because 
>> > is still will
>> fail on 4.17-rc2
>> > occasionally, seemingly after trying something newer and warm rebooting.
>> > So it seems like it's "worse" after 4.17-rc2 (doesn't work at all) but semi
>> reproducible on 4.17-rc2.
>> >
>> > Not sure if I'm chasing some initialization race, but wanted to see if 
>> > anyone else
>> was running into this
>> > or has some ideas?
>>
>> I am reliably running a v4.17-rc3 with a merge on Jiri's tree on the 9360.
>>
>> I doubt it's related to the event processing as I am not encountering
>> those issues.
>>
>> It *could* be related to the interrupts not being properly raised.
>>
>> Could you monitor /proc/interrupts and check if the ones associated
>> with your i2c-hid devices are increasing when you are using them?
>> Also, does the device emits raw HID events? (you can use hid-recorder
>> to check on the hidraw nodes.)
>

Sorry, I couldn't get to it today. Monday is a public holiday here, so
I'll check on this Tuesday.

> I checked both, /proc/interrupts isn't incrementing at all with the 
> DLL077A:01 device.
> Hid-recorder is showing output from the raw HID node.

I don't really understand how the hidraw node can send data while the
interrupts are not raised.

Could you share the output of hid-recorder?

>
> Same thing for the touchscreen, no incrementing for it on the 
> i2c_designware.0 device.
>
> Something notable however;
> When in this bad state hid-recorder didn't show /dev/hidraw1 for the 
> touchscreen (which
> Happens to be a Wacom touch screen).
> It only showed /dev/hidraw0 for the touchpad.

This explains why the touchscreen doesn't increment the interrupts.
Something I missed in the first email is that the hidraw0 node
disappear for the wacom device as the touchpad gets the hidraw0 name.

Could you provide the output  of a working kernel configuration of:
sudo hid-recorder /dev/hidraw*

This should provide me the whole logs at the same time of all the
hidraw nodes, and will allow me to reproduce the combination of
wacom/hid-multitouch you are experiencing.

Cheers,
Benjamin

>
>
>>
>> Cheers,
>> Benjamin
>>
>> >
>> > #dmesg | grep 'i2c\|hid' doesn't show any obvious errors when in this 
>> > state of
>> non functional hid stuff.
>> > [2.398649] i2c /dev entries driver
>> > [2.881651] hidraw: raw HID events driver (C) Jiri Kosina
>> > [3.683583] ish-hid {33AECD58-B679-4E54-9BD9-A04D34F0C226}: [hid-ish]:
>> enum_devices_done OK, num_hid_devices=5
>> > [3.701259] hid-generic 001F:8086:22D8.0001: hidraw0:  HID
>> v2.00 Device [hid-ishtp 8086:22D8] on
>> > [3.702204] hid-generic 001F:8086:22D8.0002: hidraw1:  HID
>> v2.00 Device [hid-ishtp 8086:22D8] on
>> > [3.703063] hid-generic 001F:8086:22D8.0003: hidraw2:  HID
>> v2.00 Device [hid-ishtp 8086:22D8] on
>> > [3.704276] hid-generic 001F:8086:22D8.0004: hidraw3:  HID
>> v2.00 Device [hid-ishtp 8086:22D8] on
>> > [3.704557] hid-generic 001F:8086:22D8.0005: hidraw4:  HID
>> v2.00 Device [hid-ishtp 8086:22D8] on
>> > [3.750710] psmouse serio1: synaptics: Your touchpad (PNP: DLL077a 
>> > PNP0f13)
>> says it can support a different bus. If i2c-hid and hid-rmi are not used, 
>> you might
>> want to try setting psmouse.synaptics_intertouch to 1 and report this to 
>> linux-
>> in...@vger.kernel.org.
>> > [7.030446] acpi INT33D5:00: intel-hid: created platform device
>> > [7.199178] i2c_hid i2c-WCOM482F:00: i2c-WCOM482F:00 supply vdd not
>> found, using dummy regulator
>> > [7.246638] input: WCOM482F:00 056A:482F as
>> /devices/pci:00/:00:15.0/i2c_designware.0/i2c-6/i2c-
>> WCOM482F:00/0018:056A:482F.0006/input/input11
>> > [7.246873] hid-generic 0018:056A:482F.0006: input,hidraw0: I2C HID 
>> > v1.00
>> Mouse [WCOM482F:00 056A:482F] on i2c-WCOM482F:00
>> > [7.275279] i2c_hid i2c-DLL077A:01: i2c-DLL077A:01 supply vdd not found,
>> using dummy regulator
>> > [7.304107] input: DLL077A:01 06CB:76AF as
>> /devices/pci:00/:00:15.1/i2c_designware.1/i2c-7/i2c-
>> DLL077A:01/0018:06CB:76AF.0007/input/input14
>> > [7.304212] hid-generic 0018

Re: [PATCH 07/14] powerpc: Add support for restartable sequences

2018-05-18 Thread Mathieu Desnoyers

- On May 17, 2018, at 7:50 PM, Boqun Feng boqun.f...@gmail.com wrote:
[...]
>> > I think you're right. So we have to introduce callsite to rseq_syscall()
>> > in syscall path, something like:
>> > 
>> > diff --git a/arch/powerpc/kernel/entry_64.S 
>> > b/arch/powerpc/kernel/entry_64.S
>> > index 51695608c68b..a25734a96640 100644
>> > --- a/arch/powerpc/kernel/entry_64.S
>> > +++ b/arch/powerpc/kernel/entry_64.S
>> > @@ -222,6 +222,9 @@ system_call_exit:
>> >mtmsrd  r11,1
>> > #endif /* CONFIG_PPC_BOOK3E */
>> > 
>> > +  addir3,r1,STACK_FRAME_OVERHEAD
>> > +  bl  rseq_syscall
>> > +
>> >ld  r9,TI_FLAGS(r12)
>> >li  r11,-MAX_ERRNO
>> >andi.
>> >
>> > r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
>> > 

By the way, I think this is not the right spot to call rseq_syscall, because
interrupts are disabled. I think we should move this hunk right after 
system_call_exit.

Would you like to implement and test an updated patch adding those calls for 
ppc 32 and 64 ?

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Re: [PATCHv2][SMB3] Add kernel trace support

2018-05-18 Thread Steve French

On Fri, May 18, 2018 at 1:00 AM, ronnie sahlberg
 wrote:
> Fair enough.
>
> Onto a second point then, for future patches.
>
> There are a lot of places where we do not (yet) pass a tcon as argument.
> Can we make a policy decision to mandate that every tracepoint MUST
> log tid, sid?
>
> And thus, if you need a new tracepoint in a function where tcon is not
> available you then first
> have to do the plumbing to pass the tcon all the way down to the tracepoint?

I think this can be relaxed in a few cases (not just session setup),
but is a good
general rule.  If only the parent function needs to be changed it is
easy decision,
but if changes more than two layers, perhaps not.

> Ideally we should pass tcon into any and every leaf function for
> tracepoints as well as debug logging.

We also have to
- log the tid and session id somewhere in human readable form
- perhaps allow an IOCtl to retrieve the UNC name from the tid, and similarly
something human readable for the session id (server and username)
(or add the tid and session id to what is displayed in
/proc/fs/cifs/DebugData so
we can extract it from the DebugData output)

> On Fri, May 18, 2018 at 4:19 PM, Steve French  wrote:
>> On Thu, May 17, 2018 at 10:28 PM, Ronnie Sahlberg  
>> wrote:
>>> Very nice.
>>>
>>> Acked-by: Ronnie Sahlberg 
>>>
>>> Possibly change the output from
>>> pid=6633 tid=0x0 sid=0x0 cmd=0 mid=0
>>> to
>>> cmd=0 mid=0 pid=6633 tid=0x0 sid=0x0
>>>
>>> just to make it easier for human-searching. I think the cmd will be useful 
>>> much more often than pid/tid/sid
>>> and this would make it easier to look for as all cmd= entries will be 
>>> aligned to the same column.
>>
>> My instinct is to preserve the consistency by beginning with the the
>> fields that will be in 90% of the commands: tree id and session id
>> (tid and sid), which would cause pid to move after sid or after cmd,
>> but I would prefer to wait on reordering fields and fixing alignment
>> till we add another set of tracepoints (e.g. in FreeXid, and in
>> SMB2_open and in the caller of negprot/sessionsetup) - we should then
>> have a better idea what formatting would make it slightly more
>> consistent and readable.



-- 
Thanks,

Steve

Re: [PATCH] arm64: kvm: use -fno-jump-tables with clang

2018-05-18 Thread Marc Zyngier

On 18/05/18 18:56, Nick Desaulniers wrote:
> + Andrey
> 
> On Fri, May 18, 2018 at 10:45 AM Marc Zyngier  wrote:
>> On 18/05/18 18:40, Nick Desaulniers wrote:
>>> On Fri, May 18, 2018 at 10:30 AM Marc Zyngier 
> wrote:
 I'm going to ask the question I've asked before when this patch cropped
 up (must be the 4th time now):
> 
> The previous threads for context:
> https://patchwork.kernel.org/patch/10060381/
> https://lkml.org/lkml/2018/3/16/434
> 
 Is it guaranteed that this is the only case where LLVM/clang is going
> to
 generate absolute addresses instead of using relative addressing?
>>>
>>> It seems like if there's requirements that only relative addressing be
>>> used, then the compiler should be told explicitly about this
> restriction,
>>> no?
> 
>> Certainly. What's the rune?
> 
> It seems like -fno-jump-tables covers all known issues and unblocks people
> from doing further work.  It sounds like you'd like some kind of stronger
> guarantee? Wont those cases still "crop up" as far as needing to annotate
> either the code, or build scripts?

What I'd really like is to apply that patch knowing that:

- you have checked that with a released version of the compiler, you
don't observe any absolute address in any of the objects that are going
to be executed at EL2 on a mainline kernel,

- you have successfully run guests with a mainline kernel,

- it works for a reasonable set of common kernel configurations
(defconfig and some of the most useful debug options),

- I can reproduce your findings with the same released compiler.

Is that the case? I don't think any of the above is completely outlandish.

M.
-- 
Jazz is not dead. It just smells funny...

Re: [PATCH 00/10] RFC: assorted bcachefs patches

2018-05-18 Thread Josef Bacik

On Fri, May 18, 2018 at 01:49:12PM -0400, Kent Overstreet wrote:
> On Fri, May 18, 2018 at 01:45:36PM -0400, Josef Bacik wrote:
> > On Fri, May 18, 2018 at 03:48:58AM -0400, Kent Overstreet wrote:
> > > These are all the remaining patches in my bcachefs tree that touch stuff 
> > > outside
> > > fs/bcachefs. Not all of them are suitable for inclusion as is, I wanted 
> > > to get
> > > some discussion first.
> > > 
> > >  * pagecache add lock
> > > 
> > > This is the only one that touches existing code in nontrivial ways.  The 
> > > problem
> > > it's solving is that there is no existing general mechanism for shooting 
> > > down
> > > pages in the page and keeping them removed, which is a real problem if 
> > > you're
> > > doing anything that modifies file data and isn't buffered writes.
> > > 
> > > Historically, the only problematic case has been direct IO, and people 
> > > have been
> > > willing to say "well, if you mix buffered and direct IO you get what you
> > > deserve", and that's probably not unreasonable. But now we have fallocate 
> > > insert
> > > range and collapse range, and those are broken in ways I frankly don't 
> > > want to
> > > think about if they can't ensure consistency with the page cache.
> > > 
> > > Also, the mechanism truncate uses (i_size and sacrificing a goat) has
> > > historically been rather fragile, IMO it might be a good think if we 
> > > switched it
> > > to a more general rigorous mechanism.
> > > 
> > > I need this solved for bcachefs because without this mechanism, the page 
> > > cache
> > > inconsistencies lead to various assertions popping (primarily when we 
> > > didn't
> > > think we need to get a disk reservation going by page cache state, but 
> > > then do
> > > the actual write and disk space accounting says oops, we did need one). 
> > > And
> > > having to reason about what can happen without a locking mechanism for 
> > > this is
> > > not something I care to spend brain cycles on.
> > > 
> > > That said, my patch is kind of ugly, and it requires filesystem changes 
> > > for
> > > other filesystems to take advantage of it. And unfortunately, since one 
> > > of the
> > > code paths that needs locking is readahead, I don't see any realistic way 
> > > of
> > > implementing the locking within just bcachefs code.
> > > 
> > > So I'm hoping someone has an idea for something cleaner (I think I recall
> > > Matthew Wilcox saying he had an idea for how to use xarray to solve 
> > > this), but
> > > if not I'll polish up my pagecache add lock patch and see what I can do 
> > > to make
> > > it less ugly, and hopefully other people find it palatable or at least 
> > > useful.
> > > 
> > >  * lglocks
> > > 
> > > They were removed by Peter Zijlstra when the last in kernel user was 
> > > removed,
> > > but I've found them useful. His commit message seems to imply he doesn't 
> > > think
> > > people should be using them, but I'm not sure why. They are a bit niche 
> > > though,
> > > I can move them to fs/bcachefs if people would prefer. 
> > > 
> > >  * Generic radix trees
> > > 
> > > This is a very simple radix tree implementation that can store types of
> > > arbitrary size, not just pointers/unsigned long. It could probably replace
> > > flex arrays.
> > > 
> > >  * Dynamic fault injection
> > > 
> > 
> > I've not looked at this at all so this may not cover your usecase, but I
> > implemeted a bpf_override_return() to do focused error injection a year 
> > ago.  I
> > have this script
> > 
> > https://github.com/josefbacik/debug-scripts/blob/master/inject-error.py
> > 
> > that does it generically, all you have to do is tag the function you want 
> > to be
> > error injectable with ALLOW_ERROR_INJECTION() and then you get all these 
> > nice
> > things like a debugfs interface to trigger them or use the above script to
> > trigger specific errors and such.  Thanks,
> 
> That sounds pretty cool...
> 
> What about being able to add a random fault injection point in the middle of 
> an
> existing function? Being able to stick race_fault() in random places was a
> pretty big win in terms of getting good code coverage out of realistic tests.

There's nothing stopping us from doing that, it just uses a kprobe to override
the function with our helper, so we could conceivably put it anywhere in the
function.  The reason I limited it to individual functions was because it was
easier than trying to figure out the side-effects of stopping mid-function.  If
I needed to fail mid-function I just added a helper where I needed it and failed
that instead.  I imagine safety is going to be of larger concern if we allow bpf
scripts to randomly return anywhere inside a function, even if the function is
marked as allowing error injection.  Thanks,

Josef

Re: [PATCHv2] drm/i2c: tda998x: Remove VLA usage

2018-05-18 Thread Kees Cook

On Tue, Apr 10, 2018 at 6:03 PM, Laura Abbott  wrote:
> There's an ongoing effort to remove VLAs[1] from the kernel to eventually
> turn on -Wvla. The vla in reg_write_range is based on the length of data
> passed. The one use of a non-constant size for this range is bounded by
> the size buffer passed to hdmi_infoframe_pack which is a fixed size.
> Switch to this upper bound.
>
> [1] https://lkml.org/lkml/2018/3/7/621
>
> Signed-off-by: Laura Abbott 

Reviewed-by: Kees Cook 

Same question for this patch: who's best to take this?

Thanks!

-Kees

> ---
> v2: Switch to make the buffer size more transparent and add a bounds
> check.
> ---
>  drivers/gpu/drm/i2c/tda998x_drv.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i2c/tda998x_drv.c 
> b/drivers/gpu/drm/i2c/tda998x_drv.c
> index 9e67a7b4e3a4..c8b6029b7839 100644
> --- a/drivers/gpu/drm/i2c/tda998x_drv.c
> +++ b/drivers/gpu/drm/i2c/tda998x_drv.c
> @@ -466,13 +466,22 @@ reg_read_range(struct tda998x_priv *priv, u16 reg, char 
> *buf, int cnt)
> return ret;
>  }
>
> +#define MAX_WRITE_RANGE_BUF 32
> +
>  static void
>  reg_write_range(struct tda998x_priv *priv, u16 reg, u8 *p, int cnt)
>  {
> struct i2c_client *client = priv->hdmi;
> -   u8 buf[cnt+1];
> +   /* This is the maximum size of the buffer passed in */
> +   u8 buf[MAX_WRITE_RANGE_BUF + 1];
> int ret;
>
> +   if (cnt > MAX_WRITE_RANGE_BUF) {
> +   dev_err(&client->dev, "Fixed write buffer too small (%d)\n",
> +   MAX_WRITE_RANGE_BUF);
> +   return;
> +   }
> +
> buf[0] = REG2ADDR(reg);
> memcpy(&buf[1], p, cnt);
>
> @@ -679,7 +688,7 @@ static void
>  tda998x_write_if(struct tda998x_priv *priv, u8 bit, u16 addr,
>  union hdmi_infoframe *frame)
>  {
> -   u8 buf[32];
> +   u8 buf[MAX_WRITE_RANGE_BUF];
> ssize_t len;
>
> len = hdmi_infoframe_pack(frame, buf, sizeof(buf));
> --
> 2.14.3
>



-- 
Kees Cook
Pixel Security

[PATCH v11 0/2] blk-mq: Rework blk-mq timeout handling again

2018-05-18 Thread Bart Van Assche

Hello Jens,

This patch series reworks blk-mq timeout handling by introducing a state
machine per request. Please consider this patch series for inclusion in the
upstream kernel.

Bart.

Changes compared to v10:
- In patch 1/2, added "default y if 64BIT" to the "config ARCH_HAVE_CMPXCHG64"
  entry in arch/Kconfig. Left out the "select ARCH_HAVE_CMPXCHG64" statements
  that became superfluous due to this change (alpha, arm64, powerpc and s390).
- Also in patch 1/2, only select ARCH_HAVE_CMPXCHG64 if X86_CMPXCHG64 has been
  selected.
- In patch 2/2, moved blk_mq_change_rq_state() from blk-mq.h to blk-mq.c.
- Added a comment header above __blk_mq_requeue_request() and
  blk_mq_requeue_request().
- Documented the MQ_RQ_* state transitions in block/blk-mq.h.
- Left out the fourth argument of blk_mq_rq_set_deadline().

Changes compared to v9:
- Addressed multiple comments related to patch 1/2: added
  CONFIG_ARCH_HAVE_CMPXCHG64 for riscv, modified
  features/locking/cmpxchg64/arch-support.txt as requested and made the
  order of the symbols in the arch/*/Kconfig alphabetical where possible.

Changes compared to v8:
- Split into two patches.
- Moved the spin_lock_init() call from blk_mq_rq_ctx_init() into
  blk_mq_init_request().
- Fixed the deadline set by blk_add_timer().
- Surrounded the das_lock member with #ifndef CONFIG_ARCH_HAVE_CMPXCHG64 /
  #endif.

Changes compared to v7:
- Fixed the generation number mechanism. Note: with this patch applied the
  behavior of the block layer does not depend on the generation number.
- Added more 32-bit architectures to the list of architectures on which
  cmpxchg64() should not be used.

Changes compared to v6:
- Used a union instead of bit manipulations to store multiple values into
  a single 64-bit field.
- Reduced the size of the timeout field from 64 to 32 bits.
- Made sure that the block layer still builds with this patch applied
  for the sh and mips architectures.
- Fixed two sparse warnings that were introduced by this patch in the
  WRITE_ONCE() calls.

Changes compared to v5:
- Restored the synchronize_rcu() call between marking a request for timeout
  handling and the actual timeout handling to avoid that timeout handling
  starts while .queue_rq() is still in progress if the timeout is very short.
- Only use cmpxchg() if another context could attempt to change the request
  state concurrently. Use WRITE_ONCE() otherwise.

Changes compared to v4:
- Addressed multiple review comments from Christoph. The most important are
  that atomic_long_cmpxchg() has been changed into cmpxchg() and also that
  there is now a nice and clean split between the legacy and blk-mq versions
  of blk_add_timer().
- Changed the patch name and modified the patch description because there is
  disagreement about whether or not the v4.16 blk-mq core can complete a
  single request twice. Kept the "Cc: stable" tag because of
  https://bugzilla.kernel.org/show_bug.cgi?id=199077.

Changes compared to v3 (see also 
https://www.mail-archive.com/linux-block@vger.kernel.org/msg20073.html):
- Removed the spinlock again that was introduced to protect the request state.
  v4 uses atomic_long_cmpxchg() instead.
- Split __deadline into two variables - one for the legacy block layer and one
  for blk-mq.

Changes compared to v2 
(https://www.mail-archive.com/linux-block@vger.kernel.org/msg18338.html):
- Rebased and retested on top of kernel v4.16.

Changes compared to v1 
(https://www.mail-archive.com/linux-block@vger.kernel.org/msg18089.html):
- Removed the gstate and aborted_gstate members of struct request and used
  the __deadline member to encode both the generation and state information.

Bart Van Assche (2):
  arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64
  blk-mq: Rework blk-mq timeout handling again

 .../features/locking/cmpxchg64/arch-support.txt|  33 +++
 arch/Kconfig   |   4 +
 arch/arm/Kconfig   |   1 +
 arch/ia64/Kconfig  |   1 +
 arch/m68k/Kconfig  |   1 +
 arch/mips/Kconfig  |   1 +
 arch/parisc/Kconfig|   1 +
 arch/riscv/Kconfig |   1 +
 arch/sparc/Kconfig |   1 +
 arch/x86/Kconfig   |   1 +
 arch/xtensa/Kconfig|   1 +
 block/blk-core.c   |   6 -
 block/blk-mq-debugfs.c |   1 -
 block/blk-mq.c | 238 ++---
 block/blk-mq.h |  64 +++---
 block/blk-timeout.c| 133 
 block/blk.h|  11 +-
 include/linux/blkdev.h |  47 ++--
 18 files changed, 308 insertions(+), 238 deletions(-)
 create mode 100644 Documentation/featu

[PATCH v11 1/2] arch/*: Add CONFIG_ARCH_HAVE_CMPXCHG64

2018-05-18 Thread Bart Van Assche

The next patch in this series introduces a call to cmpxchg64()
in the block layer core for those architectures on which this
functionality is available. Make it possible to test whether
cmpxchg64() is available by introducing CONFIG_ARCH_HAVE_CMPXCHG64.

Signed-off-by: Bart Van Assche 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Geert Uytterhoeven 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: David S. Miller 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Chris Zankel 
Cc: Max Filippov 
Cc: Arnd Bergmann 
Cc: Jonathan Corbet 
---
 .../features/locking/cmpxchg64/arch-support.txt| 33 ++
 arch/Kconfig   |  4 +++
 arch/arm/Kconfig   |  1 +
 arch/ia64/Kconfig  |  1 +
 arch/m68k/Kconfig  |  1 +
 arch/mips/Kconfig  |  1 +
 arch/parisc/Kconfig|  1 +
 arch/riscv/Kconfig |  1 +
 arch/sparc/Kconfig |  1 +
 arch/x86/Kconfig   |  1 +
 arch/xtensa/Kconfig|  1 +
 11 files changed, 46 insertions(+)
 create mode 100644 Documentation/features/locking/cmpxchg64/arch-support.txt

diff --git a/Documentation/features/locking/cmpxchg64/arch-support.txt 
b/Documentation/features/locking/cmpxchg64/arch-support.txt
new file mode 100644
index ..84bfef7242b2
--- /dev/null
+++ b/Documentation/features/locking/cmpxchg64/arch-support.txt
@@ -0,0 +1,33 @@
+#
+# Feature name:  cmpxchg64
+# Kconfig:   ARCH_HAVE_CMPXCHG64
+# description:   arch supports the cmpxchg64() API
+#
+---
+| arch |status|
+---
+|   alpha: |  ok  |
+| arc: |  ..  |
+| arm: |  ok  |
+|   arm64: |  ok  |
+| c6x: |  ..  |
+|   h8300: |  ..  |
+| hexagon: |  ..  |
+|ia64: |  ok  |
+|m68k: |  ok  |
+|  microblaze: |  ..  |
+|mips: |  ok  |
+|   nds32: |  ..  |
+|   nios2: |  ..  |
+|openrisc: |  ..  |
+|  parisc: |  ok  |
+| powerpc: |  ok  |
+|   riscv: |  ok  |
+|s390: |  ok  |
+|  sh: |  ..  |
+|   sparc: |  ok  |
+|  um: |  ..  |
+|   unicore32: |  ..  |
+| x86: |  ok  |
+|  xtensa: |  ok  |
+---
diff --git a/arch/Kconfig b/arch/Kconfig
index 8e0d665c8d53..9840b2577af1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -358,6 +358,10 @@ config HAVE_ALIGNED_STRUCT_PAGE
  on a struct page for better performance. However selecting this
  might increase the size of a struct page by a word.
 
+config ARCH_HAVE_CMPXCHG64
+   bool
+   default y if 64BIT
+
 config HAVE_CMPXCHG_LOCAL
bool
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a7f8e7f4b88f..02c75697176e 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -13,6 +13,7 @@ config ARM
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_STRICT_MODULE_RWX if MMU
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
+   select ARCH_HAVE_CMPXCHG64 if !THUMB2_KERNEL
select ARCH_HAVE_CUSTOM_GPIO_H
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_MIGHT_HAVE_PC_PARPORT
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index bbe12a038d21..31c49e1482e2 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -41,6 +41,7 @@ config IA64
select GENERIC_PENDING_IRQ if SMP
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_LEGACY
+   select ARCH_HAVE_CMPXCHG64
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select GENERIC_IOMAP
select GENERIC_SMP_IDLE_THREAD
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 785612b576f7..7b87cda3bbed 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -11,6 +11,7 @@ config M68K
select GENERIC_ATOMIC64
select HAVE_UID16
select VIRT_TO_BUS
+   select ARCH_HAVE_CMPXCHG64
select ARCH_HAVE_NMI_SAFE_CMPXCHG if RMW_INSNS
select GENERIC_CPU_DEVICES
select GENERIC_IOMAP
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 225c95da23ce..088bca0fd9f2 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -7,6 +7,7 @@ config MIPS
select ARCH_DISCARD_MEMBLOCK
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
+   select ARCH_HAVE_CMPXCHG64 if 64BIT
select ARCH_SUPPORTS_UPROBES
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
diff --git a/a

[PATCH v11 2/2] blk-mq: Rework blk-mq timeout handling again

2018-05-18 Thread Bart Van Assche

Recently the blk-mq timeout handling code was reworked. See also Tejun
Heo, "[PATCHSET v4] blk-mq: reimplement timeout handling", 08 Jan 2018
(https://www.mail-archive.com/linux-block@vger.kernel.org/msg16985.html).
This patch reworks the blk-mq timeout handling code again. The timeout
handling code is simplified by introducing a state machine per request.
This change avoids that the blk-mq timeout handling code ignores
completions that occur after blk_mq_check_expired() has been called and
before blk_mq_rq_timed_out() has been called.

Fix this race as follows:
- Replace the __deadline member of struct request by a new member
  called das that contains the generation number, state and deadline.
  Only 32 bits are used for the deadline field such that all three
  fields occupy only 64 bits. This change reduces the maximum supported
  request timeout value from (2**63/HZ) to (2**31/HZ).
- Remove all request member variables that became superfluous due to
  this change: gstate, gstate_seq and aborted_gstate_sync.
- Remove the request state information that became superfluous due to
  this patch, namely RQF_MQ_TIMEOUT_EXPIRED.
- Remove the code that became superfluous due to this change, namely
  the RCU lock and unlock statements in blk_mq_complete_request() and
  also the synchronize_rcu() call in the timeout handler.

Notes:
- A spinlock is used to protect atomic changes of rq->das on those
  architectures that do not provide a cmpxchg64() implementation.
- Atomic instructions are only used to update the request state if
  a concurrent request state change could be in progress.
- blk_add_timer() has been split into two functions - one for the
  legacy block layer and one for blk-mq.

Signed-off-by: Bart Van Assche 
Cc: Tejun Heo 
Cc: Christoph Hellwig 
Cc: Jianchao Wang 
Cc: Ming Lei 
Cc: Sebastian Ott 
Cc: Sagi Grimberg 
Cc: Israel Rukshin ,
Cc: Max Gurtovoy 
---
 block/blk-core.c   |   6 --
 block/blk-mq-debugfs.c |   1 -
 block/blk-mq.c | 238 ++---
 block/blk-mq.h |  64 ++---
 block/blk-timeout.c| 133 ++-
 block/blk.h|  11 +--
 include/linux/blkdev.h |  47 +-
 7 files changed, 262 insertions(+), 238 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 43370faee935..cee03cad99f2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -198,12 +198,6 @@ void blk_rq_init(struct request_queue *q, struct request 
*rq)
rq->internal_tag = -1;
rq->start_time_ns = ktime_get_ns();
rq->part = NULL;
-   seqcount_init(&rq->gstate_seq);
-   u64_stats_init(&rq->aborted_gstate_sync);
-   /*
-* See comment of blk_mq_init_request
-*/
-   WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
 }
 EXPORT_SYMBOL(blk_rq_init);
 
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 3080e18cb859..ffa622366922 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -344,7 +344,6 @@ static const char *const rqf_name[] = {
RQF_NAME(STATS),
RQF_NAME(SPECIAL_PAYLOAD),
RQF_NAME(ZONE_WRITE_LOCKED),
-   RQF_NAME(MQ_TIMEOUT_EXPIRED),
RQF_NAME(MQ_POLL_SLEPT),
 };
 #undef RQF_NAME
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4cbfd784e837..e7dfa6ed7a44 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -318,7 +318,7 @@ static struct request *blk_mq_rq_ctx_init(struct 
blk_mq_alloc_data *data,
rq->special = NULL;
/* tag was already set */
rq->extra_len = 0;
-   rq->__deadline = 0;
+   WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
 
INIT_LIST_HEAD(&rq->timeout_list);
rq->timeout = 0;
@@ -465,6 +465,39 @@ struct request *blk_mq_alloc_request_hctx(struct 
request_queue *q,
 }
 EXPORT_SYMBOL_GPL(blk_mq_alloc_request_hctx);
 
+/**
+ * blk_mq_change_rq_state - atomically test and set request state
+ * @rq: Request pointer.
+ * @old_state: Old request state.
+ * @new_state: New request state.
+ *
+ * Returns %true if and only if the old state was @old and if the state has
+ * been changed into @new.
+ */
+static bool blk_mq_change_rq_state(struct request *rq,
+  enum mq_rq_state old_state,
+  enum mq_rq_state new_state)
+{
+   union blk_deadline_and_state das = READ_ONCE(rq->das);
+   union blk_deadline_and_state old_val = das;
+   union blk_deadline_and_state new_val = das;
+
+   WARN_ON_ONCE(new_state == MQ_RQ_IN_FLIGHT);
+
+   old_val.state = old_state;
+   new_val.state = new_state;
+   /*
+* For transitions from state in-flight to another state cmpxchg64()
+* must be used. For other state transitions it is safe to use
+* WRITE_ONCE().
+*/
+   if (old_state != MQ_RQ_IN_FLIGHT) {
+   WRITE_ONCE(rq->das.val, new_val.val);
+   return true;
+   }
+   return blk_mq_set_rq_state(rq, old_val, new_val);

Hello,

2018-05-18 Thread Mrs. Annabelle Edo.

Hello,

Good day, How are you today and your family, i hope you are in good
health, My name is Mrs. Annabelle, i saw your email on Web. Media, I
have something to discuss with you, please reply me for more
information,


Mrs. Annabelle Ed

Re: [PATCH] drm/gma500: Remove VLA

2018-05-18 Thread Kees Cook

On Mon, Apr 9, 2018 at 2:06 PM, Laura Abbott  wrote:
>
> There's an ongoing effort to remove VLAs[1] from the kernel to eventually
> turn on -Wvla. Switch to a reasonable upper bound for the VLAs in
> the gma500 driver.
>
> [1] https://lkml.org/lkml/2018/3/7/621
>
> Signed-off-by: Laura Abbott 

Reviewed-by: Kees Cook 

Daniel, can this go via you, or what's the best path for this patch?

Thanks!

-Kees

> ---
> This was a little hard to figure out but I think 32 should be a
> comfortable upper bound based on all the structures I saw. Of course I
> can't test it.
> ---
>  drivers/gpu/drm/gma500/psb_intel_sdvo.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/gma500/psb_intel_sdvo.c 
> b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
> index 84507912be84..3d4fa9f6b94c 100644
> --- a/drivers/gpu/drm/gma500/psb_intel_sdvo.c
> +++ b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
> @@ -429,13 +429,20 @@ static const char *cmd_status_names[] = {
> "Scaling not supported"
>  };
>
> +#define MAX_ARG_LEN 32
> +
>  static bool psb_intel_sdvo_write_cmd(struct psb_intel_sdvo *psb_intel_sdvo, 
> u8 cmd,
>  const void *args, int args_len)
>  {
> -   u8 buf[args_len*2 + 2], status;
> -   struct i2c_msg msgs[args_len + 3];
> +   u8 buf[MAX_ARG_LEN*2 + 2], status;
> +   struct i2c_msg msgs[MAX_ARG_LEN + 3];
> int i, ret;
>
> +   if (args_len > MAX_ARG_LEN) {
> +   DRM_ERROR("Need to increase arg length\n");
> +   return false;
> +   }
> +
> psb_intel_sdvo_debug_write(psb_intel_sdvo, cmd, args, args_len);
>
> for (i = 0; i < args_len; i++) {
> --
> 2.14.3
>



-- 
Kees Cook
Pixel Security

Re: [RFC PATCH 01/11] vfs: push __sync_blockdev calls down into sync_fs routines

2018-05-18 Thread Jeff Layton

On Fri, 2018-05-18 at 08:56 -0700, Christoph Hellwig wrote:
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1097,7 +1097,7 @@ xfs_fs_sync_fs(
> >  * Doing anything during the async pass would be counterproductive.
> >  */
> > if (!wait)
> > -   return 0;
> > +   goto out;
> >  
> > xfs_log_force(mp, XFS_LOG_SYNC);
> > if (laptop_mode) {
> > @@ -1108,8 +1108,8 @@ xfs_fs_sync_fs(
> >  */
> > flush_delayed_work(&mp->m_log->l_work);
> > }
> > -
> > -   return 0;
> > +out:
> > +   return __sync_blockdev(sb->s_bdev, wait);
> 
> XFS never uses the block device mapping for anything, so this is
> not needed.
> 

Thanks, I wasn't sure about xfs. I'll drop this hunk.

FWIW, I think pushing this call down into the sync_fs routines is still
probably the right thing to do, regardless of the state of the later
patches.

> > +/*
> > + * Many legacy filesystems don't have a sync_fs op. For them, we just flush
> > + * the block device (if there is one).
> > + */
> > +static inline int call_sync_fs(struct super_block *sb, int wait)
> > +{
> > +   if (sb->s_op->sync_fs)
> > +   return sb->s_op->sync_fs(sb, wait);
> > +   return __sync_blockdev(sb->s_bdev, wait);
> > +}
> 
> The proper name for this would be vfs_sync_fs.  And I don't think it
> warrants an inline.

I patterned the name after the call_mmap (and now-defunct call_fsync)
helpers. I'll rename it and change it to be non-inlined.

Thanks,
-- 
Jeff Layton

Re: [PATCH] arm64: kvm: use -fno-jump-tables with clang

2018-05-18 Thread Nick Desaulniers

+ Andrey

On Fri, May 18, 2018 at 10:45 AM Marc Zyngier  wrote:
> On 18/05/18 18:40, Nick Desaulniers wrote:
> > On Fri, May 18, 2018 at 10:30 AM Marc Zyngier 
wrote:
> >> I'm going to ask the question I've asked before when this patch cropped
> >> up (must be the 4th time now):

The previous threads for context:
https://patchwork.kernel.org/patch/10060381/
https://lkml.org/lkml/2018/3/16/434

> >> Is it guaranteed that this is the only case where LLVM/clang is going
to
> >> generate absolute addresses instead of using relative addressing?
> >
> > It seems like if there's requirements that only relative addressing be
> > used, then the compiler should be told explicitly about this
restriction,
> > no?

> Certainly. What's the rune?

It seems like -fno-jump-tables covers all known issues and unblocks people
from doing further work.  It sounds like you'd like some kind of stronger
guarantee? Wont those cases still "crop up" as far as needing to annotate
either the code, or build scripts?
-- 
Thanks,
~Nick Desaulniers

Re: [PATCH v2 3/9] security: define security_kernel_read_blob() wrapper

2018-05-18 Thread Mimi Zohar

On Sat, 2018-05-19 at 03:13 +1000, James Morris wrote:
> On Thu, 17 May 2018, Eric W. Biederman wrote:
> 
> > Nacked-by: "Eric W. Biederman" 
> > 
> > Nack on this sharing nonsense.  These two interfaces do not share any
> > code in their implementations other than the if statement to distinguish
> > between the two cases.
> 
> Hmm, it's not even doing that.
> 
> There's already an if(!file && read_id == X) { } check and this is another 
> one being added.
> 
> > If we want comprehensible and maintainable code in the security modules
> > we need to split these two pieces of functionality apart.
> 
> All ima_read is doing in both the old and new case is checking if there's 
> no file then if it's a certain operation, returning an error.
> 
> To echo Eric and Casey's suggestions, how about changing the name of the 
> hook to security_kernel_read_data() ?

Thanks, James.  Somehow I missed this option.  Renaming the existing
hook, would be the easiest solution.  Eric, are you in agreement with
James' naming suggestion/solution?

> Then ima_read_file() can be changed to ima_read_data(), and then instead 
> of two if (!file && read_id == X) checks, have:
> 
>   if (!file) {
>   switch (read_id) {
>   }
>   }
> 
> 
> 
>

Re: [PATCH] kvm: rename HINTS_DEDICATED to KVM_HINTS_REALTIME

2018-05-18 Thread Eduardo Habkost

On Fri, May 18, 2018 at 07:18:57PM +0200, Paolo Bonzini wrote:
> On 18/05/2018 19:13, Eduardo Habkost wrote:
> >> As much as we'd like to be helpful and validate input, you need a real
> >> time host too. I'm not sure how we'd find out - I suggest we do not
> >> bother for now.
> > I'm worried that people will start enabling the flag in all kinds
> > of scenarios where the guarantees can't be kept, and make the
> > meaning of the flag in practice completely different from its
> > documented meaning.
> 
> I don't think we should try to detect anything.  As far as QEMU is
> concerned, it's mostly garbage in, garbage out when it comes to invalid
> configurations.  It's just a bit, and using it in invalid configurations
> is okay if you're doing it (for example) for debugging.

In this case, I'd like the requirements and recommendations to be
included in QEMU documentation.  Especially to point out the most
obvious and more likely mistakes (like not ensuring memory is
pinned at all, or letting the vCPU threads be interrupted).

So, is there a known list of steps required to configure a host
to enable kvm-hints-realtime safely, already?  I'd like the
documentation to be better than "you should fiddle with the CPU
affinity on your system and also ensure memory will be pinned;
good luck".

-- 
Eduardo

htmldocs: include/linux/skbuff.h:850: warning: Function parameter or member 'ip_defrag_offset' not described in 'sk_buff'

2018-05-18 Thread kbuild test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   3acf4e395260e3bd30a6fa29ba7eada4bf7566ca
commit: bf66337140c64c27fa37222b7abca7e49d63fb57 inet: frags: get rid of 
ipfrag_skb_cb/FRAG_CB
date:   7 weeks ago
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   include/net/mac80211.h:2083: warning: bad line: >
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'rx_stats_avg' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'rx_stats_avg.signal' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'rx_stats_avg.chain_signal' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.filtered' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.retry_failed' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.retry_count' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.lost_packets' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.last_tdls_pkt_time' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'msdu_retries' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'msdu_failed' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.last_ack' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.last_ack_signal' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'status_stats.ack_signal_filled' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'tx_stats.packets' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'tx_stats.bytes' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 
'tx_stats.last_rate' not described in 'sta_info'
   net/mac80211/sta_info.h:586: warning: Function parameter or member 'msdu' 
not described in 'sta_info'
   include/linux/dma-buf.h:307: warning: Function parameter or member 
'cb_excl.cb' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 
'cb_excl.poll' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 
'cb_excl.active' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 
'cb_shared.cb' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 
'cb_shared.poll' not described in 'dma_buf'
   include/linux/dma-buf.h:307: warning: Function parameter or member 
'cb_shared.active' not described in 'dma_buf'
   include/linux/dma-fence-array.h:54: warning: Function parameter or member 
'work' not described in 'dma_fence_array'
   include/linux/gpio/driver.h:142: warning: Function parameter or member 
'request_key' not described in 'gpio_irq_chip'
   include/linux/iio/iio.h:270: warning: Function parameter or member 
'scan_type.sign' not described in 'iio_chan_spec'
   include/linux/iio/iio.h:270

Re: [PATCH 2/6] x86: bug: prevent gcc distortions

2018-05-18 Thread Joe Perches

On Fri, 2018-05-18 at 14:22 +, Nadav Amit wrote:
> It is hard to make the code readable in C, readable in the generated asm,
> and to follow the coding style imposed by checkpatch (e.g., no space between
> the newline and the asm argument before it).

Ignore checkpatch when you know better.

[PATCHv8] gpio: Remove VLA from gpiolib

2018-05-18 Thread Laura Abbott

The new challenge is to remove VLAs from the kernel
(see https://lkml.org/lkml/2018/3/7/621) to eventually
turn on -Wvla.

Using a kmalloc array is the easy way to fix this but kmalloc is still
more expensive than stack allocation. Introduce a fast path with a
fixed size stack array to cover most chip with gpios below some fixed
amount. The slow path dynamically allocates an array to cover those
chips with a large number of gpios.

Reviewed-by: Geert Uytterhoeven 
Reviewed-by: Phil Reid 
Reviewed-and-tested-by: Lukas Wunner 
Signed-off-by: Lukas Wunner 
Signed-off-by: Laura Abbott 
---
v8: Adjust lower bound
---
 drivers/gpio/Kconfig  | 12 ++
 drivers/gpio/gpiolib.c| 76 +++
 drivers/gpio/gpiolib.h|  2 +-
 include/linux/gpio/consumer.h | 10 +++--
 4 files changed, 78 insertions(+), 22 deletions(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index b960f6f35abd..71c0ab46f216 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -22,6 +22,18 @@ menuconfig GPIOLIB
 
 if GPIOLIB
 
+config GPIOLIB_FASTPATH_LIMIT
+   int "Maximum number of GPIOs for fast path"
+   range 32 512
+   default 512
+   help
+  This adjusts the point at which certain APIs will switch from
+  using a stack allocated buffer to a dynamically allocated buffer.
+
+  You shouldn't need to change this unless you really need to
+  optimize either stack space or performance. Change this carefully
+  since setting an incorrect value could cause stack corruption.
+
 config OF_GPIO
def_bool y
depends on OF
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index d8ccb500872f..d29916cd3b74 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -61,6 +61,11 @@ static struct bus_type gpio_bus_type = {
.name = "gpio",
 };
 
+/*
+ * Number of GPIOs to use for the fast path in set array
+ */
+#define FASTPATH_NGPIO CONFIG_GPIOLIB_FASTPATH_LIMIT
+
 /* gpio_lock prevents conflicts during gpio_desc[] table updates.
  * While any GPIO is requested, its gpio_chip is not removable;
  * each GPIO's "requested" flag serves as a lock and refcount.
@@ -450,12 +455,11 @@ static long linehandle_ioctl(struct file *filep, unsigned 
int cmd,
vals[i] = !!ghd.values[i];
 
/* Reuse the array setting function */
-   gpiod_set_array_value_complex(false,
+   return gpiod_set_array_value_complex(false,
  true,
  lh->numdescs,
  lh->descs,
  vals);
-   return 0;
}
return -EINVAL;
 }
@@ -1244,6 +1248,10 @@ int gpiochip_add_data_with_key(struct gpio_chip *chip, 
void *data,
goto err_free_descs;
}
 
+   if (chip->ngpio > FASTPATH_NGPIO)
+   chip_warn(chip, "line cnt %u is greater than fast path cnt 
%u\n",
+   chip->ngpio, FASTPATH_NGPIO);
+
gdev->label = kstrdup_const(chip->label ?: "unknown", GFP_KERNEL);
if (!gdev->label) {
status = -ENOMEM;
@@ -2719,16 +2727,28 @@ int gpiod_get_array_value_complex(bool raw, bool 
can_sleep,
 
while (i < array_size) {
struct gpio_chip *chip = desc_array[i]->gdev->chip;
-   unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
-   unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
+   unsigned long fastpath[2 * BITS_TO_LONGS(FASTPATH_NGPIO)];
+   unsigned long *mask, *bits;
int first, j, ret;
 
+   if (likely(chip->ngpio <= FASTPATH_NGPIO)) {
+   mask = fastpath;
+   } else {
+   mask = kmalloc_array(2 * BITS_TO_LONGS(chip->ngpio),
+  sizeof(*mask),
+  can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+   if (!mask)
+   return -ENOMEM;
+   }
+
+   bits = mask + BITS_TO_LONGS(chip->ngpio);
+   memset(mask, 0, BITS_TO_LONGS(chip->ngpio) * sizeof(*mask));
+
if (!can_sleep)
WARN_ON(chip->can_sleep);
 
/* collect all inputs belonging to the same chip */
first = i;
-   memset(mask, 0, sizeof(mask));
do {
const struct gpio_desc *desc = desc_array[i];
int hwgpio = gpio_chip_hwgpio(desc);
@@ -2739,8 +2759,11 @@ int gpiod_get_array_value_complex(bool raw, bool 
can_sleep,
 (desc_array[i]->gdev->chip == chip));
 
ret = gpio_chip_get_multiple(chip, mask, bits);
-   if (ret)
+   if (ret) {
+   if (mask != fastpat

Re: [PATCH 1/4] arcnet: com20020: Add com20020 io mapped version

2018-05-18 Thread David Miller

From: Andrea Greco 
Date: Fri, 18 May 2018 14:18:41 +0200

> In com20020.c found this:
> /* FIXME: do this some other way! */
> if (!dev->dev_addr[0])
> dev->dev_addr[0] = arcnet_inb(ioaddr, 8);
> 
> NODE-ID, must be univoque, for all arcnet network.
> My previews idea was take random value but, this could create a
> collision over network.
> 
> A possible solution is:
> In case of collision com20020 set a bit in status register.
> Then peak a new NODE-ID and repeat this while correct NODE-ID is found.
> 
> Other ideas is pass it via DTS.
> But suppose have 2 same product in same network, same address same problem.
> For this reason i prefer left standard driver behavior.
> 
> Other ideas for solve this ?

Is there no way to obtain a unique value from the device?

If having a unique ID to talk on the ARCNET is so critical, there must
be some way to properly allocation and use a unique ID.

I guess this must be a general problem with this driver already.

You still need to address the issue of 'dev' being leaked on probe
error paths.

Thank you.

Re: [PATCH] objtool: Detect assembly code falling through to INT3 padding

2018-05-18 Thread Alexey Dobriyan

On Fri, May 18, 2018 at 09:18:14AM +0200, Ingo Molnar wrote:
> The concept of built-in kernel tooling working at the machine code level is 
> just 
> so powerful - we should have added our own KCC compiler 20 years ago.

...for two very serious reasons

* C as a language moves very slowly, last help from the comittee were
  C99 intializers which are OK, but, say, memory model was explictly
  rejected. However the project expands and becomes more complex much
  faster than C working group sets up meetings. Compiler authors help
  with extensions but ultimately can not be relied on (see "inline" saga).

  Recently everyone was celebrating new and improved min() and max()
  macros admiring creativity and knowledge of intricate language details
  (me too, don't get this wrong).

  Now this is how it can be done in a language which is not stupid:

constexpr int min(int a, int b)
{
return a < b ? a : b;
}

  That's literally all. And you can also do

template
void min(T a, char b) = delete;

template
void min(char a, T b) = delete;

  because "char" is char.

  Having control over compiler things like that can be addded more
  quickly.

* insulating the project from the whims of compiler authors who every
  once in a while use "undefined behaviour" or other kinds of language
  lawyering to do strange things.

  Other serious projects do this too. Database people use O_DIRECT
  to insulate themselves from kernel people for the very same reasons.

Re: [PATCH] usb-storage: Add quirks to make G-Technology "G-Drive" work

2018-05-18 Thread Alexander Kappner

> Was this tested with uas or usb-storage?
On both. dd works either way.  

> Are you certain Oliver's new code was executed?  If you put 
> US_FL_NO_WP_DETECT only in unusual_devs.h and not in ususual_uas.h then 
> it would not affect the uas driver.
Yes, the code ran.

Further debugging shows that the code that causes the device to hang is in 
drivers/scsi/sd.c:2698. So the reason why usb-storage works and UAS does 
not is because the device setting both skip_ms_page_3f=1 and 
skip_ms_page_8=1 is required. The US_FL_NO_WP_DETECT flag does the former, 
and usb-storage (but not UAS) by default does the latter 
(drivers/usb/storage/scsiglue.c:198):

 /*
 * A number of devices have problems with MODE SENSE for
 * page x08, so we will skip it.
 */
sdev->skip_ms_page_8 = 1;

Forcing both skip_ms_page_3f and skip_ms_page_8 to 1 results in UAS and 
usb-storage working. Oliver's code only set skip_ms_page_3f. 

Do we want a patch to introduce a quirk flag for skip_ms_page_8,  similar to 
the US_FL_NO_WP_DETECT quirk bit for skip_ms_page_3f? This may even resolve 
the issues with other devices in unusual_uas.h that currently have all UAS 
support disabled. I'd be happy to write the patch, but I'm not sure we want 
to reserve a quirk bit for what's currently only a single device known to 
have this issue.  Please advise. 

On 05/17/2018 12:13 PM, Alan Stern wrote:
> On Thu, 17 May 2018, Alexander Kappner wrote:
> 
>> Oliver and Alan,
>>
>> thank for investigating.
>>
>>> this is suspicious. You do not actually whether US_FL_NO_WP_DETECT
>>> by itself would make the device work. Can you please test that?
>>
>> US_FL_NO_WP_DETECT without US_FL_IGNORE_UAS does not make a difference, 
>> even with the patch you included applied:
> 
> Are you certain Oliver's new code was executed?  If you put 
> US_FL_NO_WP_DETECT only in unusual_devs.h and not in ususual_uas.h then 
> it would not affect the uas driver.
> 
>>> That's bizarre too.  Even though the only difference is a MODE SENSE 
>>> command, the command that actually faliled was WRITE(16).
>> It looks to me like the MODE SENSE simply hangs the drive, so anything 
>> issued after that will fail. Of course the drive says it's the "current 
>> command" that caused the failure, but I wouldn't give too much credence to 
>> that. FYI -- this device is a consumer grade rotational drive that you can 
>> get for less than $200, so I wouldn't be surprised if the implementations 
>> have issues. 
>>
>> Also, I noticed that copying onto the drive with dd works fine, whereas 
>> trying to mount a filesystem immediately crashes it. I suspect this is 
>> because check_disk_change is called on mount (which eventually calls down 
>> to sd_read_write_protect_flag, which is where the US_FL_NO_WP_DETECT flag 
>> comes into play).
> 
> Was this tested with uas or usb-storage?
> 
> Alan Stern
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Re: [PATCH v2 3/7] memcg: use compound_order rather than hpage_nr_pages

2018-05-18 Thread Punit Agrawal

Punit Agrawal  writes:

> Tsukada-san,
>
> I am not familiar with memcg so can't comment about whether the patchset
> is the right way to solve the problem outlined in the cover letter but
> had a couple of comments about this patch.
>
> TSUKADA Koutaro  writes:
>
>> The current memcg implementation assumes that the compound page is THP.
>> In order to be able to charge surplus hugepage, we use compound_order.
>>
>> Signed-off-by: TSUKADA Koutaro 
>
> Please move this before Patch 1/7. This is to prevent wrong accounting
> of pages to memcg for size != PMD_SIZE.

I just noticed that the default state is off so the change isn't enabled
until the sysfs node is exposed in the next patch. Please ignore this
comment.

One below still applies.

>
>> ---
>>  memcontrol.c |   10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 2bd3df3..a8f1ff8 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -4483,7 +4483,7 @@ static int mem_cgroup_move_account(struct page *page,
>> struct mem_cgroup *to)
>>  {
>>  unsigned long flags;
>> -unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
>> +unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;
>
> Instead of replacing calls to hpage_nr_pages(), is it possible to modify
> it to do the calculation?
>
> Thanks,
> Punit
>
>>  int ret;
>>  bool anon;
>>
>> @@ -5417,7 +5417,7 @@ int mem_cgroup_try_charge(struct page *page, struct 
>> mm_struct *mm,
>>bool compound)
>>  {
>>  struct mem_cgroup *memcg = NULL;
>> -unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
>> +unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;
>>  int ret = 0;
>>
>>  if (mem_cgroup_disabled())
>> @@ -5478,7 +5478,7 @@ int mem_cgroup_try_charge(struct page *page, struct 
>> mm_struct *mm,
>>  void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
>>bool lrucare, bool compound)
>>  {
>> -unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
>> +unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;
>>
>>  VM_BUG_ON_PAGE(!page->mapping, page);
>>  VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page);
>> @@ -5522,7 +5522,7 @@ void mem_cgroup_commit_charge(struct page *page, 
>> struct mem_cgroup *memcg,
>>  void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
>>  bool compound)
>>  {
>> -unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
>> +unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;
>>
>>  if (mem_cgroup_disabled())
>>  return;
>> @@ -5729,7 +5729,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct 
>> page *newpage)
>>
>>  /* Force-charge the new page. The old one will be freed soon */
>>  compound = PageTransHuge(newpage);
>> -nr_pages = compound ? hpage_nr_pages(newpage) : 1;
>> +nr_pages = compound ? (1 << compound_order(newpage)) : 1;
>>
>>  page_counter_charge(&memcg->memory, nr_pages);
>>  if (do_memsw_account())

Re: [PATCH] idr: fix invalid ptr dereference on item delete

2018-05-18 Thread Matthew Wilcox

It'd be nice if you cc'd the person who wrote the code you're patching.
You'd get a response a lot quicker than waiting until I happened to
notice the email in a different forum.

Thanks for finding the situation that leads to the bug.  Your fix is
incorrect; it's legitimate to store a NULL value at offset 0, and
your patch makes it impossible to delete.  Fortunately, the test-suite
covers that case ;-)

Andrew, can you take this through your tree for extra testing?

--- >8 ---

From: Matthew Wilcox 

If the radix tree underlying the IDR happens to be full and we attempt
to remove an id which is larger than any id in the IDR, we will call
__radix_tree_delete() with an uninitialised 'slot' pointer, at which
point anything could happen.  This was easiest to hit with a single entry
at id 0 and attempting to remove a non-0 id, but it could have happened
with 64 entries and attempting to remove an id >= 64.

Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree")
Reported-by: syzbot+35666cba7f0a337e2...@syzkaller.appspotmail.com
Debugged-by: Roman Kagan 
Signed-off-by: Matthew Wilcox 

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index da9e10c827df..4dd4fbc7279c 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -2036,10 +2036,12 @@ void *radix_tree_delete_item(struct radix_tree_root 
*root,
 unsigned long index, void *item)
 {
struct radix_tree_node *node = NULL;
-   void __rcu **slot;
+   void __rcu **slot = NULL;
void *entry;
 
entry = __radix_tree_lookup(root, index, &node, &slot);
+   if (!slot)
+   return NULL;
if (!entry && (!is_idr(root) || node_tag_get(root, node, IDR_FREE,
get_slot_offset(node, slot
return NULL;
diff --git a/tools/testing/radix-tree/idr-test.c 
b/tools/testing/radix-tree/idr-test.c
index 1c18617951dd..410ca58bbe9c 100644
--- a/tools/testing/radix-tree/idr-test.c
+++ b/tools/testing/radix-tree/idr-test.c
@@ -254,6 +254,13 @@ void idr_checks(void)
idr_remove(&idr, 0xfedcba98U);
idr_remove(&idr, 0);
 
+   assert(idr_alloc(&idr, DUMMY_PTR, 0, 0, GFP_KERNEL) == 0);
+   idr_remove(&idr, 1);
+   for (i = 1; i < RADIX_TREE_MAP_SIZE; i++)
+   assert(idr_alloc(&idr, DUMMY_PTR, 0, 0, GFP_KERNEL) == i);
+   idr_remove(&idr, 1 << 30);
+   idr_destroy(&idr);
+
for (i = INT_MAX - 3UL; i < INT_MAX + 1UL; i++) {
struct item *item = item_create(i, 0);
assert(idr_alloc(&idr, item, i, i + 10, GFP_KERNEL) == i);

--- original email ---

If an IDR contains a single entry at index==0, the underlying radix tree
has a single item in its root node, in which case
__radix_tree_lookup(index!=0) doesn't set its *@nodep argument (in
addition to returning NULL).

However, the tree itself is not empty, i.e. the tree root doesn't have
IDR_FREE tag.

As a result, on an attempt to remove an index!=0 entry from such an IDR,
radix_tree_delete_item doesn't return early and calls
__radix_tree_delete with invalid parameters which are then dereferenced.

Reported-by: syzbot+35666cba7f0a337e2...@syzkaller.appspotmail.com
Signed-off-by: Roman Kagan 
---
 lib/radix-tree.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index da9e10c827df..10ff1bfae952 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -2040,8 +2040,9 @@ void *radix_tree_delete_item(struct radix_tree_root *root,
void *entry;
 
entry = __radix_tree_lookup(root, index, &node, &slot);
-   if (!entry && (!is_idr(root) || node_tag_get(root, node, IDR_FREE,
-   get_slot_offset(node, slot
+   if (!entry && (!is_idr(root) || !node ||
+  node_tag_get(root, node, IDR_FREE,
+   get_slot_offset(node, slot
return NULL;
 
if (item && entry != item)

Re: seq_putc is preferable to seq_puts

2018-05-18 Thread Joe Perches

On Fri, 2018-05-18 at 07:09 -0700, Matthew Wilcox wrote:
> Hi Joe,
> 
> A nice checkpatch addition would be to look for seq_puts(x, "s")
> and recommend the author use seq_putc(x, 's') instead.  The primary
> offender is seq_puts(x, "\n") (a hundred or so, scattered throughout
> the kernel), but seq_puts(x, " ") has its adherents too.

Firstly, has Markus Elfring left the building?

Here's a grep to find the existing uses:

$ git grep -P '\b(?:seq_puts|seq_printf)\s*\(\s*[^,]+,\s*"(?:\\.|.)"\s*\)\s*;' 
| \
  wc -l
326

and a trivial script to convert them all treewide

$ git grep -P --name-only 
'\b(?:seq_puts|seq_printf)\s*\(\s*[^,]+,\s*"(?:\\.|.)"\s*\)\s*;' | \
  xargs perl -i -e 'local $/; while (<>) { 
s/\b(?:seq_puts|seq_printf)\s*\(\s*([^,]+),\s*"(\\.|.)"\s*\)\s*;/seq_putc(\1, 
'"'\2'"')/g; print;}'

Anyway, I still like a variant of Steven Rostedt's
trace_puts macro silliness hack that avoids all
this by allowing seq_printf without format arguments
to call the appropriate seq_puts/seq_putc invisibly.

https://groups.google.com/forum/#!topic/linux.kernel/YGZhTD-n4UA

It's relatively easy to expand this for seq_putc, but here's
an expanded checkpatch rule anyway that seems to work.
---
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index baddac9379f0..62b36d7bb3af 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5791,11 +5791,23 @@ sub process {
 "struct spinlock should be spinlock_t\n" . 
$herecurr);
}
 
-# check for seq_printf uses that could be seq_puts
-   if ($sline =~ /\bseq_printf\s*\(.*"\s*\)\s*;\s*$/) {
+# check for seq_printf/seq_puts uses that could be seq_puts or seq_putc
+   if ($sline =~ /\b(seq_(?:printf|puts))\s*\(.*"\s*\)\s*;\s*$/) {
+   my $call = $1;
my $fmt = get_quoted_string($line, $rawline);
+   my $fmt_nq = substr($fmt, 1, length($fmt) - 2); #strip 
outer quotes
$fmt =~ s/%%//g;
-   if ($fmt !~ /%/) {
+
+   if (length($fmt_nq) == 1 ||
+   (length($fmt_nq) == 2 && ($fmt_nq =~ /^\\/ || 
$fmt_nq =~ /\%\%/))) {
+   my $fmt_sub = $fmt_nq;
+   $fmt_sub =~ s/\%\%/%/;
+   if (WARN("PREFER_SEQ_PUTC",
+"Prefer seq_putc to $call\n" . 
$herecurr) &&
+   $fix) {
+   $fixed[$fixlinenr] =~ 
s/\b$call\s*\(\s*($FuncArg)\s*,\s*"\Q$fmt_nq\E"\s*\)/seq_putc($1, '$fmt_sub')/x;
+   }
+   } elsif ($fmt !~ /%/) {
if (WARN("PREFER_SEQ_PUTS",
 "Prefer seq_puts to seq_printf\n" . 
$herecurr) &&
$fix) {

Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

2018-05-18 Thread Russell King - ARM Linux

On Fri, May 18, 2018 at 10:20:02AM -0700, Vineet Gupta wrote:
> I never understood the need for this direction. And if memory serves me
> right, at that time I was seeing twice the amount of cache flushing !

It's necessary.  Take a moment to think carefully about this:

dma_map_single(, dir)

dma_sync_single_for_cpu(, dir)

dma_sync_single_for_device(, dir)

dma_unmap_single(, dir)

In the case of a DMA-incoherent architecture, the operations done at each
stage depend on the direction argument:

map for_cpu for_device  unmap
TO_DEV  writeback   nonewriteback   none
TO_CPU  invalidate  invalidate* invalidate  invalidate*
BIDIR   writeback   invalidate  writeback   invalidate

* - only necessary if the CPU speculatively prefetches.

The multiple invalidations for the TO_CPU case handles different
conditions that can result in data corruption, and for some CPUs, all
four are necessary.

This is what is implemented for 32-bit ARM, depending on the CPU
capabilities, as we have DMA incoherent devices and we have CPUs that
speculatively prefetch data, and so may load data into the caches while
DMA is in operation.

Things get more interesting if the implementation behind the DMA API has
to copy data between the buffer supplied to the mapping and some DMA
accessible buffer:

map for_cpu for_device  unmap
TO_DEV  copy to dma nonecopy to dma none
TO_CPU  nonecopy to cpu nonecopy to cpu
BIDIR   copy to dma copy to cpu copy to dma copy to cpu

So, in both cases, the value of the direction argument defines what you
need to do in each call.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

Re: [PATCH] PM / devfreq: Fix handling of min/max_freq == 0

2018-05-18 Thread Matthias Kaehlcke

On Fri, May 18, 2018 at 08:27:55AM +0900, Chanwoo Choi wrote:
> Hi,
> 
> Please resend the patch with related other patches.
> It is hard to check the dependency/sequence of your patches.

The three patches (excluding the RFC) I sent in the past days are
independent from each other and can be applied individually in no
particular order.

I encountered and fixed the issues one after another, when I sent the
first or second patch I didn't know there would be another
one. Anyway, I'll resend as a series together with other patches,
unless some of the patches get applied before.

Thanks

Matthias

Re: [PATCH] net: mvpp2: typo and cosmetic fixes

2018-05-18 Thread David Miller

From: Antoine Tenart 
Date: Fri, 18 May 2018 14:34:51 +0200

> This patch on the Marvell PPv2 driver is only cosmetic. Two typos are
> removed as well as other cosmetic fixes, such as extra new lines or tabs
> vs spaces.
> 
> Suggested-by: Stefan Chulski 
> Signed-off-by: Antoine Tenart 

Applied, thanks.

Re: [PATCH 00/10] RFC: assorted bcachefs patches

2018-05-18 Thread Kent Overstreet

On Fri, May 18, 2018 at 01:45:36PM -0400, Josef Bacik wrote:
> On Fri, May 18, 2018 at 03:48:58AM -0400, Kent Overstreet wrote:
> > These are all the remaining patches in my bcachefs tree that touch stuff 
> > outside
> > fs/bcachefs. Not all of them are suitable for inclusion as is, I wanted to 
> > get
> > some discussion first.
> > 
> >  * pagecache add lock
> > 
> > This is the only one that touches existing code in nontrivial ways.  The 
> > problem
> > it's solving is that there is no existing general mechanism for shooting 
> > down
> > pages in the page and keeping them removed, which is a real problem if 
> > you're
> > doing anything that modifies file data and isn't buffered writes.
> > 
> > Historically, the only problematic case has been direct IO, and people have 
> > been
> > willing to say "well, if you mix buffered and direct IO you get what you
> > deserve", and that's probably not unreasonable. But now we have fallocate 
> > insert
> > range and collapse range, and those are broken in ways I frankly don't want 
> > to
> > think about if they can't ensure consistency with the page cache.
> > 
> > Also, the mechanism truncate uses (i_size and sacrificing a goat) has
> > historically been rather fragile, IMO it might be a good think if we 
> > switched it
> > to a more general rigorous mechanism.
> > 
> > I need this solved for bcachefs because without this mechanism, the page 
> > cache
> > inconsistencies lead to various assertions popping (primarily when we didn't
> > think we need to get a disk reservation going by page cache state, but then 
> > do
> > the actual write and disk space accounting says oops, we did need one). And
> > having to reason about what can happen without a locking mechanism for this 
> > is
> > not something I care to spend brain cycles on.
> > 
> > That said, my patch is kind of ugly, and it requires filesystem changes for
> > other filesystems to take advantage of it. And unfortunately, since one of 
> > the
> > code paths that needs locking is readahead, I don't see any realistic way of
> > implementing the locking within just bcachefs code.
> > 
> > So I'm hoping someone has an idea for something cleaner (I think I recall
> > Matthew Wilcox saying he had an idea for how to use xarray to solve this), 
> > but
> > if not I'll polish up my pagecache add lock patch and see what I can do to 
> > make
> > it less ugly, and hopefully other people find it palatable or at least 
> > useful.
> > 
> >  * lglocks
> > 
> > They were removed by Peter Zijlstra when the last in kernel user was 
> > removed,
> > but I've found them useful. His commit message seems to imply he doesn't 
> > think
> > people should be using them, but I'm not sure why. They are a bit niche 
> > though,
> > I can move them to fs/bcachefs if people would prefer. 
> > 
> >  * Generic radix trees
> > 
> > This is a very simple radix tree implementation that can store types of
> > arbitrary size, not just pointers/unsigned long. It could probably replace
> > flex arrays.
> > 
> >  * Dynamic fault injection
> > 
> 
> I've not looked at this at all so this may not cover your usecase, but I
> implemeted a bpf_override_return() to do focused error injection a year ago.  
> I
> have this script
> 
> https://github.com/josefbacik/debug-scripts/blob/master/inject-error.py
> 
> that does it generically, all you have to do is tag the function you want to 
> be
> error injectable with ALLOW_ERROR_INJECTION() and then you get all these nice
> things like a debugfs interface to trigger them or use the above script to
> trigger specific errors and such.  Thanks,

That sounds pretty cool...

What about being able to add a random fault injection point in the middle of an
existing function? Being able to stick race_fault() in random places was a
pretty big win in terms of getting good code coverage out of realistic tests.

Re: [PATCH v2 3/7] memcg: use compound_order rather than hpage_nr_pages

2018-05-18 Thread Punit Agrawal

Tsukada-san,

I am not familiar with memcg so can't comment about whether the patchset
is the right way to solve the problem outlined in the cover letter but
had a couple of comments about this patch.

TSUKADA Koutaro  writes:

> The current memcg implementation assumes that the compound page is THP.
> In order to be able to charge surplus hugepage, we use compound_order.
>
> Signed-off-by: TSUKADA Koutaro 

Please move this before Patch 1/7. This is to prevent wrong accounting
of pages to memcg for size != PMD_SIZE.

> ---
>  memcontrol.c |   10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2bd3df3..a8f1ff8 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4483,7 +4483,7 @@ static int mem_cgroup_move_account(struct page *page,
>  struct mem_cgroup *to)
>  {
>   unsigned long flags;
> - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
> + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;

Instead of replacing calls to hpage_nr_pages(), is it possible to modify
it to do the calculation?

Thanks,
Punit

>   int ret;
>   bool anon;
>
> @@ -5417,7 +5417,7 @@ int mem_cgroup_try_charge(struct page *page, struct 
> mm_struct *mm,
> bool compound)
>  {
>   struct mem_cgroup *memcg = NULL;
> - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
> + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;
>   int ret = 0;
>
>   if (mem_cgroup_disabled())
> @@ -5478,7 +5478,7 @@ int mem_cgroup_try_charge(struct page *page, struct 
> mm_struct *mm,
>  void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
> bool lrucare, bool compound)
>  {
> - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
> + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;
>
>   VM_BUG_ON_PAGE(!page->mapping, page);
>   VM_BUG_ON_PAGE(PageLRU(page) && !lrucare, page);
> @@ -5522,7 +5522,7 @@ void mem_cgroup_commit_charge(struct page *page, struct 
> mem_cgroup *memcg,
>  void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
>   bool compound)
>  {
> - unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
> + unsigned int nr_pages = compound ? (1 << compound_order(page)) : 1;
>
>   if (mem_cgroup_disabled())
>   return;
> @@ -5729,7 +5729,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct 
> page *newpage)
>
>   /* Force-charge the new page. The old one will be freed soon */
>   compound = PageTransHuge(newpage);
> - nr_pages = compound ? hpage_nr_pages(newpage) : 1;
> + nr_pages = compound ? (1 << compound_order(newpage)) : 1;
>
>   page_counter_charge(&memcg->memory, nr_pages);
>   if (do_memsw_account())

Re: [PATCH] hippi: fix spelling mistake: "Framming" -> "Framing"

2018-05-18 Thread David Miller

From: Colin King 
Date: Fri, 18 May 2018 11:09:22 +0100

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in printk message text
> 
> Signed-off-by: Colin Ian King 

Applied.

Re: [PATCH 00/10] RFC: assorted bcachefs patches

2018-05-18 Thread Josef Bacik

On Fri, May 18, 2018 at 03:48:58AM -0400, Kent Overstreet wrote:
> These are all the remaining patches in my bcachefs tree that touch stuff 
> outside
> fs/bcachefs. Not all of them are suitable for inclusion as is, I wanted to get
> some discussion first.
> 
>  * pagecache add lock
> 
> This is the only one that touches existing code in nontrivial ways.  The 
> problem
> it's solving is that there is no existing general mechanism for shooting down
> pages in the page and keeping them removed, which is a real problem if you're
> doing anything that modifies file data and isn't buffered writes.
> 
> Historically, the only problematic case has been direct IO, and people have 
> been
> willing to say "well, if you mix buffered and direct IO you get what you
> deserve", and that's probably not unreasonable. But now we have fallocate 
> insert
> range and collapse range, and those are broken in ways I frankly don't want to
> think about if they can't ensure consistency with the page cache.
> 
> Also, the mechanism truncate uses (i_size and sacrificing a goat) has
> historically been rather fragile, IMO it might be a good think if we switched 
> it
> to a more general rigorous mechanism.
> 
> I need this solved for bcachefs because without this mechanism, the page cache
> inconsistencies lead to various assertions popping (primarily when we didn't
> think we need to get a disk reservation going by page cache state, but then do
> the actual write and disk space accounting says oops, we did need one). And
> having to reason about what can happen without a locking mechanism for this is
> not something I care to spend brain cycles on.
> 
> That said, my patch is kind of ugly, and it requires filesystem changes for
> other filesystems to take advantage of it. And unfortunately, since one of the
> code paths that needs locking is readahead, I don't see any realistic way of
> implementing the locking within just bcachefs code.
> 
> So I'm hoping someone has an idea for something cleaner (I think I recall
> Matthew Wilcox saying he had an idea for how to use xarray to solve this), but
> if not I'll polish up my pagecache add lock patch and see what I can do to 
> make
> it less ugly, and hopefully other people find it palatable or at least useful.
> 
>  * lglocks
> 
> They were removed by Peter Zijlstra when the last in kernel user was removed,
> but I've found them useful. His commit message seems to imply he doesn't think
> people should be using them, but I'm not sure why. They are a bit niche 
> though,
> I can move them to fs/bcachefs if people would prefer. 
> 
>  * Generic radix trees
> 
> This is a very simple radix tree implementation that can store types of
> arbitrary size, not just pointers/unsigned long. It could probably replace
> flex arrays.
> 
>  * Dynamic fault injection
> 

I've not looked at this at all so this may not cover your usecase, but I
implemeted a bpf_override_return() to do focused error injection a year ago.  I
have this script

https://github.com/josefbacik/debug-scripts/blob/master/inject-error.py

that does it generically, all you have to do is tag the function you want to be
error injectable with ALLOW_ERROR_INJECTION() and then you get all these nice
things like a debugfs interface to trigger them or use the above script to
trigger specific errors and such.  Thanks,

Josef

Re: [PATCH 01/10] mm: pagecache add lock

2018-05-18 Thread Kent Overstreet

On Fri, May 18, 2018 at 08:53:30AM -0700, Christoph Hellwig wrote:
> On Fri, May 18, 2018 at 06:13:06AM -0700, Matthew Wilcox wrote:
> > > Historically, the only problematic case has been direct IO, and people
> > > have been willing to say "well, if you mix buffered and direct IO you
> > > get what you deserve", and that's probably not unreasonable. But now we
> > > have fallocate insert range and collapse range, and those are broken in
> > > ways I frankly don't want to think about if they can't ensure consistency
> > > with the page cache.
> > 
> > ext4 manages collapse-vs-pagefault with the ext4-specific i_mmap_sem.
> > You may get pushback on the grounds that this ought to be a
> > filesystem-specific lock rather than one embedded in the generic inode.
> 
> Honestly I think this probably should be in the core.  But IFF we move
> it to the core the existing users of per-fs locks need to be moved
> over first.  E.g. XFS as the very first one, and at least ext4 and f2fs
> that copied the approach, and probably more if you audit deep enough.

I didn't know about i_mmap_sem, thanks

But, using a rw semaphore wouldn't work for dio writes, and I do need dio writes
to block pagecache adds in bcachefs since the dio write could overwrite
uncompressed data or a reservation with compressed data, which means new writes
need a disk reservation.

Also I'm guessing ext4 takes the lock at the top of the read path? That sucks
for reads that are already cached, the generic buffered read path can do cached
reads without taking any per inode locks - that's why I pushed the lock down to
only be taken when we have to add a page to the pagecache.

Definitely less ugly doing it that way though...

Re: [PATCH] arm64: kvm: use -fno-jump-tables with clang

2018-05-18 Thread Marc Zyngier

On 18/05/18 18:40, Nick Desaulniers wrote:
> On Fri, May 18, 2018 at 10:30 AM Marc Zyngier  wrote:
>> I'm going to ask the question I've asked before when this patch cropped
>> up (must be the 4th time now):
> 
>> Is it guaranteed that this is the only case where LLVM/clang is going to
>> generate absolute addresses instead of using relative addressing?
> 
> It seems like if there's requirements that only relative addressing be
> used, then the compiler should be told explicitly about this restriction,
> no?

Certainly. What's the rune?

M.
-- 
Jazz is not dead. It just smells funny...

< 1 2 3 4 5 6 7 8 9 10 >

201 - 300 of 1187 matches

Mail list logo