Re: [PATCH 00/13] Modernize Loongson64 Machine
Hi, Jiaxun,

1. To describe the CPU I prefer "loongson" to "ls", because "ls" is confusing, and in the future we will use ls2h/ls7a to describe Loongson's bridges.

2. I think it is better to use loongson64c/loongson64g than loongson2ef/loongson64. As we discussed, we will use PRID_IMP_LOONGSON_64C/PRID_IMP_LOONGSON_64G to describe 0x6300/0xc000.

Huacai

-- Original --
From: "Jiaxun Yang"
Date: Tue, Aug 27, 2019 04:52 PM
To: "linux-mips"
Cc: "chenhc"; "paul.burton"; "tglx"; "jason"; "maz"; "linux-kernel"; "robh+dt"; "mark.rutland"; "devicetree"; "Jiaxun Yang"
Subject: [PATCH 00/13] Modernize Loongson64 Machine

Loongson has a long history of contributing code to the mainline kernel. In recent years, however, they have focused on maintaining a kernel by themselves rather than contributing their code to the community. The kernel has been progressing rapidly too, so their code sat in mainline for a long period without proper maintenance and became outdated. This patchset brings modern DeviceTree and irqchip support to Loongson64 machines, and leaves Loongson 2E/2F alone since they are too legacy to touch.
Jiaxun Yang (13):
  MIPS: Loongson64: Rename CPU TYPES
  MIPS: Loongson64: Separate loongson2ef/loongson64 code
  MAINTAINERS: Fix entries for new loongson64 path
  irqchip: Add driver for Loongson-3 I/O interrupt controller
  dt-bindings: interrupt-controller: Add Loongson-3 IOINTC
  irqchip: Add driver for Loongson-3 HyperTransport interrupt controller
  dt-bindings: interrupt-controller: Add Loongson-3 HTINTC
  irqchip: i8259: Add plat-poll support
  irqchip: mips-cpu: Convert to simple domain
  MIPS: Loongson64: Drop legacy IRQ code
  dt-bindings: mips: Add loongson cpus & boards
  MIPS: Loongson64: Add generic dts
  MIPS: Loongson64: Load built-in dtbs

 .../loongson,ls3-htintc.yaml                  |  53 +
 .../loongson,ls3-iointc.yaml                  |  61 +
 .../bindings/mips/loongson/cpus.yaml          |  38 +++
 .../bindings/mips/loongson/devices.yaml       |  64 ++
 MAINTAINERS                                   |   9 +-
 arch/mips/Kbuild.platforms                    |   1 +
 arch/mips/Kconfig                             |  83 +--
 arch/mips/boot/dts/Makefile                   |   1 +
 arch/mips/boot/dts/loongson/Makefile          |   8 +
 arch/mips/boot/dts/loongson/ls3-2nodes.dtsi   |   8 +
 arch/mips/boot/dts/loongson/ls3-4nodes.dtsi   |  15 ++
 arch/mips/boot/dts/loongson/ls3-cpus.dtsi     | 150
 arch/mips/boot/dts/loongson/ls3-gs464.dtsi    |  18 ++
 arch/mips/boot/dts/loongson/ls3-gs464e.dtsi   |  18 ++
 .../boot/dts/loongson/ls3-rs780e-pch.dtsi     |  35 +++
 arch/mips/boot/dts/loongson/ls3a-package.dtsi |  59 +
 .../boot/dts/loongson/ls3a1000_780e_1way.dts  |  12 +
 .../boot/dts/loongson/ls3a1000_780e_2way.dts  |  13 ++
 .../boot/dts/loongson/ls3a1000_780e_4way.dts  |  13 ++
 .../boot/dts/loongson/ls3a2000_780e_1way.dts  |  12 +
 .../boot/dts/loongson/ls3a2000_780e_2way.dts  |  13 ++
 .../boot/dts/loongson/ls3a2000_780e_4way.dts  |  13 ++
 .../boot/dts/loongson/ls3a3000_780e_1way.dts  |  12 +
 .../boot/dts/loongson/ls3a3000_780e_2way.dts  |  13 ++
 .../boot/dts/loongson/ls3a3000_780e_4way.dts  |  13 ++
 arch/mips/boot/dts/loongson/ls3b-package.dtsi |  59 +
 .../mips/boot/dts/loongson/ls3b_780e_1way.dts |  13 ++
 .../mips/boot/dts/loongson/ls3b_780e_2way.dts |  13 ++
 arch/mips/include/asm/bootinfo.h              |   1 -
 arch/mips/include/asm/cop2.h                  |   2 +-
 arch/mips/include/asm/cpu-type.h              |   6 +-
 arch/mips/include/asm/cpu.h                   |   4 +-
 arch/mips/include/asm/hazards.h               |   2 +-
 arch/mips/include/asm/io.h                    |   2 +-
 arch/mips/include/asm/irqflags.h              |   2 +-
 .../mach-loongson2ef/cpu-feature-overrides.h  |  45
 .../cs5536/cs5536.h                           |   0
 .../cs5536/cs5536_mfgpt.h                     |   0
 .../cs5536/cs5536_pci.h                       |   0
 .../cs5536/cs5536_vsm.h                       |   0
 .../loongson2ef.h}                            |  31 +--
 .../machine.h                                 |   6 -
 .../mc146818rtc.h                             |   5 +-
 .../mem.h                                     |   6 +-
 arch/mips/include/asm/mach-loongson2ef/pci.h  |  43
 .../include/asm/mach-loongson2ef/spaces.h     |  10 +
 .../asm/mach-loongson64/builtin_dtbs.h        |  26 +++
 .../mach-loongson64/cpu-feature-overrides.h   |   3 -
 arch/mips/include/asm/mach-loongson64/irq.h   |   6 +-
 .../asm/mach-loongson64/kernel-entry-init.h   |  74 --
 .../include/asm/mach-loongson64/loongson64.h  |  50
 .../mips/include/asm/mach-loongson64/mmzone.h |  16 --
 arch/mips/include/asm/mach-loongson64/pci.h   |  41 +---
 .../include/asm/mach-loongson64/workarounds.h |   4 +-
 arch/mips/include/asm/module.h                |   8 +-
 arch/mips/include/asm/pgtable-bits.h          |   2 +-
Re: Re: Re: Re: [PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()
Hi, Greg,

The patch for 4.9 (and below 4.9) is here: https://patchwork.linux-mips.org/patch/21375/

Huacai

-- Original --
From: "陈华才"
Date: Thu, Mar 14, 2019 06:55 AM
To: "gregkh"
Cc: "linux-kernel"; "stable"; "huangpei"; "Paul Burton"; "Ralf Baechle"; "ambrosehua"; "Steven J . Hill"; "linux-mips"; "Fuxin Zhang"; "wuzhangjin"; "Li Xuefeng"; "Xu Chenghua"; "Sasha Levin"
Subject: Re: Re: Re: [PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()

Hi, Greg,

Only 4.9 (and below) needs to modify spinlock.h; 4.14 and 4.19 do not need this because they have been converted to qspinlock. And sorry for my poor reply, because I can only use a mobile phone now.

--- Original Mail ---
From: "Greg Kroah-Hartman"
Sent: Thursday, March 14, 2019, 4:58 AM
To: "陈华才"
Subject: Re: [PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()

On Wed, Mar 13, 2019 at 09:17:15PM +0800, 陈华才 wrote:
> Hi, Greg,
>
> 4.9 needs to modify spinlock.h, please wait for my patch.
>
> --- Original Mail ---
> From: "Greg Kroah-Hartman"
> Sent: Wednesday, March 13, 2019, 1:10 AM
> To: "linux-kernel"
> Subject: [PATCH 4.9 81/96] MIPS: Loongson: Introduce and use loongson_llsc_mb()
>
> 4.9-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> [ Upstream commit e02e07e3127d8aec1f4bcdfb2fc52a2d99b4859e ]
>
> On the Loongson-2G/2H/3A/3B there is a hardware flaw that makes ll/sc and
> lld/scd very weakly ordered. We should add sync instructions "before
> each ll/lld" and "at the branch-target between ll/sc" to work around it.
> Otherwise, this flaw will occasionally cause deadlock (e.g. when doing
> a heavy load test with LTP).
>
> Below is the explanation from the CPU designer:
>
> "For the Loongson 3 family, when a memory access instruction (load, store,
> or prefetch) is executed between the execution of LL and SC, the
> success or failure of SC is not predictable.
> Although a programmer would
> not insert memory access instructions between LL and SC, the memory
> instructions before LL in program order may be dynamically executed
> between the execution of LL/SC, so a memory fence (SYNC) is needed
> before LL/LLD to avoid this situation.
>
> Since Loongson-3A R2 (3A2000), we have improved our hardware design to
> handle this case. But we later deduced a rare circumstance in which some
> speculatively executed memory instructions, due to branch misprediction
> between LL/SC, still fall into the above case, so a memory fence (SYNC)
> at the branch-target (if its target is not between LL/SC) is needed for
> Loongson 3A1000, 3B1500, 3A2000 and 3A3000.
>
> Our processor is continually evolving and we aim to remove all these
> workaround-SYNCs around LL/SC for new-come processors."
>
> Here is an example:
>
> Both cpu1 and cpu2 simultaneously run atomic_add by 1 on the same atomic
> var. This bug causes the 'sc' run by both cpus (in atomic_add) to succeed
> at the same time ('sc' returns 1), and the variable is sometimes only
> *added by 1*, which is wrong and unacceptable (it should be added by 2).
>
> Why disable fix-loongson3-llsc in the compiler?
> Because the compiler fix will cause problems in the kernel's __ex_table
> section.
>
> This patch fixes all the cases in the kernel, but:
>
> +. the fix at the end of futex_atomic_cmpxchg_inatomic() is for the
> branch-target of 'bne'; there are other cases where smp_mb__before_llsc()
> and smp_llsc_mb() fix the ll and the branch-target coincidently, such as
> atomic_sub_if_positive/cmpxchg/xchg, just like this one.
>
> +. Loongson 3 does not support CONFIG_EDAC_ATOMIC_SCRUB, so no need to
> touch edac.h
>
> +. local_ops and cmpxchg_local should not be affected by this bug since
> only the owner can write.
>
> +. mips_atomic_set for syscall.c is deprecated and rarely used, just let
> it go
>
> Signed-off-by: Huacai Chen
> Signed-off-by: Huang Pei
> [paul.bur...@mips.com:
>   - Simplify the addition of -mno-fix-loongson3-llsc to cflags, and add
>     a comment describing why it's there.
>   - Make loongson_llsc_mb() a no-op when
>     CONFIG_CPU_LOONGSON3_WORKAROUNDS=n, rather than a compiler memory
>     barrier.
>   - Add a comment describing the bug & how loongson_llsc_mb() helps
>     in asm/barrier.h.]
> Signed-off-by: Paul Burton
> Cc: Ralf Baechle
> Cc: ambrose...@gmail.com
> Cc: Steven J . Hill
> Cc: linux-m...@linux-mips.org
> Cc: Fuxin Zhang
> Cc: Zhangjin Wu
> Cc: Li Xuefeng
> Cc: Xu Chenghua
> Signed-off-by: Sasha Levin
> ---
>  arch/mips/Kconfig               | 15 ++
>  arch/mips/include/asm/atomic.h  |  6 ++
>  arch/mips/include/asm/barrier.h | 36 +
>  arch/mips/include/asm/bitops.h  |  5 +
>  arch/mips/include/asm/futex.h   |  3 +++
>  arch/mips/include/asm/pgtable.h |  2 ++
>  arch/mips/loongson64/Platform   | 23 +
>  arch/mips/mm/tlbex.c            | 10 +
>  8 files changed, 100 insertions(+)

Ok, I will go drop this from all stable queues now, thanks!

greg k-h
Re: [PATCH V3 1/5] genirq/affinity: don't mark 'affd' as const
My lemote.com email is too weak, so I'll stop top-posting; this time I've cc'ed my gmail.

I've tested: this patch can fix the nvme problem, but it can't be applied to 4.19 because of different context. And I still think my original solution (genirq/affinity: Assign default affinity to pre/post vectors) is correct; there may be similar problems besides nvme.

Huacai

-- Original --
From: "Thomas Gleixner"
Date: Thu, Feb 14, 2019 09:31 PM
To: "陈华才"
Cc: "Keith Busch"; "Bjorn Helgaas"; "Jens Axboe"; "Sagi Grimberg"; "linux-pci"; "LKML"; "linux-nvme"; "Ming Lei"; "linux-block"; "Christoph Hellwig"
Subject: Re: [PATCH V3 1/5] genirq/affinity: don't mark 'affd' as const

On Thu, 14 Feb 2019, 陈华才 wrote:

Please do not top post

> I'll test next week, but 4.19 has the same problem, how to fix that for 4.19?

By applying the very same patch perhaps?

Thanks,

	tglx
Re: [PATCH V3 1/5] genirq/affinity: don't mark 'affd' as const
I'll test next week, but 4.19 has the same problem; how do we fix that for 4.19?

Huacai

-- Original --
From: "Thomas Gleixner"
Date: Thu, Feb 14, 2019 04:50 PM
To: "Keith Busch"
Cc: "Bjorn Helgaas"; "Jens Axboe"; "Sagi Grimberg"; "linux-pci"; "LKML"; "linux-nvme"; "Ming Lei"; "linux-block"; "Christoph Hellwig"; "Huacai Chen"
Subject: Re: [PATCH V3 1/5] genirq/affinity: don't mark 'affd' as const

On Wed, 13 Feb 2019, Keith Busch wrote:

Cc+ Huacai Chen

> On Wed, Feb 13, 2019 at 10:41:55PM +0100, Thomas Gleixner wrote:
> > Btw, while I have your attention: an issue popped up recently related
> > to that affinity logic.
> >
> > The current implementation fails when:
> >
> > 	/*
> > 	 * If there aren't any vectors left after applying the pre/post
> > 	 * vectors don't bother with assigning affinity.
> > 	 */
> > 	if (nvecs == affd->pre_vectors + affd->post_vectors)
> > 		return NULL;
> >
> > Now the discussion arose that in that case the affinity sets are not
> > allocated and filled in for the pre/post vectors, but somehow the
> > underlying device still works and later on triggers the warning in the
> > blk-mq code because the MSI entries do not have affinity information
> > attached.
> >
> > Sure, we could make that work, but there are several issues:
> >
> >  1) irq_create_affinity_masks() has another reason to return NULL:
> >     memory allocation fails.
> >
> >  2) Does it make sense at all?
> >
> > Right now the PCI allocator ignores the NULL return and proceeds without
> > setting any affinities. As a consequence nothing is managed and everything
> > happens to work.
> >
> > But that it happens to work is more by chance than by design, and the
> > warning is bogus if this is an expected mode of operation.
> >
> > We should address these points in some way.
>
> Ah, yes, that's a mistake in the nvme driver. It is assuming IO queues are
> always on managed interrupts, but that's not true when only 1 vector
> could be allocated.
This should be an appropriate fix to the warning:

Looks correct. Chen, can you please test that?

> ---
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 022ea1ee63f8..f2ccebe1c926 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -506,7 +506,7 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set)
>  		 * affinity), so use the regular blk-mq cpu mapping
>  		 */
>  		map->queue_offset = qoff;
> -		if (i != HCTX_TYPE_POLL)
> +		if (i != HCTX_TYPE_POLL && dev->num_vecs > 1)
>  			blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
>  		else
>  			blk_mq_map_queues(map);
> --
Re: [PATCH] genirq/affinity: Assign default affinity to pre/post vectors
Huacai Chen
Jiangsu Zhongke Menglan Electronic Technology Co., Ltd. / Independent Security Business Unit / Software Department
Menglan Village, Yushan Town, Changshu, Jiangsu

-- Original --
From: "Thomas Gleixner"
Date: Wed, Jan 16, 2019 05:26 PM
To: "陈华才"
Cc: "linux-kernel"; "Fuxin Zhang"; "wuzhangjin"; "stable"; "Christoph Hellwig"; "Michael Hernandez"
Subject: Re: [PATCH] genirq/affinity: Assign default affinity to pre/post vectors

Chen,

please do not top-post and use line breaks around 78 chars.

On Wed, 16 Jan 2019, 陈华才 wrote:
> I'm not removing all return NULL of irq_create_affinity_masks(), so the ..

Moved content to the place where it belongs so the context is preserved.

> -- Original --
> From: "Thomas Gleixner"
> Date: Wed, Jan 16, 2019 03:10 AM
> To: "Huacai Chen"
> Cc: "linux-kernel"; "Fuxin Zhang"; "wuzhangjin"; "stable"; "Christoph Hellwig"; "Michael Hernandez"
> Subject: Re: [PATCH] genirq/affinity: Assign default affinity to pre/post vectors

And please configure your e-mail client to NOT copy the full headers into the reply.

> > On Mon, 31 Dec 2018, Huacai Chen wrote:
> > > > Generally, irq_create_affinity_masks() assigns default affinity to pre/
> > > > post vectors correctly. However, it ignores the case that there are only
> > > > pre/post vectors (when nvecs == affd->pre_vectors + affd->post_vectors)
> > > > and returns NULL. This case usually happens when nvecs = 1 (e.g. in the
> > > > nvme driver when MSI-X is unavailable and it falls back to MSI) and will
> > > > trigger the warning in pci_irq_get_affinity(). This patch fixes the
> > > > corner case.
> > >
> > > Errm. This is just wrong. When this function returns NULL, then it has
> > > failed and the caller or any subsequent code is not supposed to use the
> > > result.
> > >
> > > The function can return NULL for other reasons, e.g. when the memory
> > > allocation failed. How are you going to duct tape that one?
> >
> > I'm not removing all return NULL of irq_create_affinity_masks(), so the
> > memory allocation failure still returns NULL. I just handle the case that
> > there are not enough irq vectors. E.g.
in the nvme driver, the caller may call
> > irq_create_affinity_masks() with nvecs=1, pre_vectors=1, post_vectors=0. In
> > this case, the only vector's default affinity assignment is skipped.

> I did not say that you removed all NULL returns. I said that this function
> can return NULL for other reasons, and then the same situation will happen.
>
> If the masks pointer returned is NULL, then the calling code or any
> subsequent usage needs to handle it properly. Yes, I understand that this
> change makes the warning go away for that particular case, but that's not
> making it any more correct.

Hi, Thomas,

I don't think "nvecs == affd->pre_vectors + affd->post_vectors" is an ERROR, so to the caller it should be distinguishable from "return NULL for other reasons". If the caller falls back from MSI-X to MSI, it is probably "nvecs=1, pre_vectors=1, post_vectors=0". The caller can work perfectly if the pre/post vectors are filled with the default affinity.

> Thanks,
> 	tglx
Re: [PATCH] genirq/affinity: Assign default affinity to pre/post vectors
Hi, Thomas,

I'm not removing all return NULL of irq_create_affinity_masks(), so memory allocation failure still returns NULL. I just handle the case that there are not enough irq vectors. E.g. in the nvme driver, the caller may call irq_create_affinity_masks() with nvecs=1, pre_vectors=1, post_vectors=0. In this case, the only vector's default affinity assignment is skipped.

Huacai

-- Original --
From: "Thomas Gleixner"
Date: Wed, Jan 16, 2019 03:10 AM
To: "Huacai Chen"
Cc: "linux-kernel"; "Fuxin Zhang"; "wuzhangjin"; "stable"; "Christoph Hellwig"; "Michael Hernandez"
Subject: Re: [PATCH] genirq/affinity: Assign default affinity to pre/post vectors

On Mon, 31 Dec 2018, Huacai Chen wrote:

> Generally, irq_create_affinity_masks() assigns default affinity to pre/
> post vectors correctly. However, it ignores the case that there are only
> pre/post vectors (when nvecs == affd->pre_vectors + affd->post_vectors)
> and returns NULL. This case usually happens when nvecs = 1 (e.g. in the
> nvme driver when MSI-X is unavailable and it falls back to MSI) and will
> trigger the warning in pci_irq_get_affinity(). This patch fixes the
> corner case.

Errm. This is just wrong. When this function returns NULL, then it has failed and the caller or any subsequent code is not supposed to use the result.

The function can return NULL for other reasons, e.g. when the memory allocation failed. How are you going to duct tape that one?

Thanks,

	tglx
Re: [[PATCH]] mips: Fix switch to NO_BOOTMEM for SGI-IP27/loongons3 NUMA
Hi,

It seems the patch below can solve many problems after the switch to NO_BOOTMEM, because the memory allocation behavior is more similar to before.

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 070234b..7a449d9 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -839,6 +839,7 @@ static void __init arch_mem_init(char **cmdline_p)
 	/* call board setup routine */
 	plat_mem_setup();
+	memblock_set_bottom_up(true);
 
 	/*
 	 * Make sure all kernel memory is in the maps. The "UP" and
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index 0f852e1..15e103c 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -2260,10 +2260,8 @@ void __init trap_init(void)
 		unsigned long size = 0x200 + VECTORSPACING*64;
 		phys_addr_t ebase_pa;
 
-		memblock_set_bottom_up(true);
 		ebase = (unsigned long)
 			memblock_alloc_from(size, 1 << fls(size), 0);
-		memblock_set_bottom_up(false);
 
 		/*
 		 * Try to ensure ebase resides in KSeg0 if possible.
@@ -2307,6 +2305,7 @@ void __init trap_init(void)
 	if (board_ebase_setup)
 		board_ebase_setup();
 	per_cpu_trap_init(true);
+	memblock_set_bottom_up(false);
 
 	/*
 	 * Copy the generic exception handlers to their final destination.

-- Original --
From: "Mike Rapoport"
Date: Fri, Nov 9, 2018 02:01 AM
To: "Thomas Bogendoerfer"
Cc: "Ralf Baechle"; "Paul Burton"; "James Hogan"; "Huacai Chen"; "linux-mips"; "linux-kernel"; "rppt"
Subject: Re: [[PATCH]] mips: Fix switch to NO_BOOTMEM for SGI-IP27/loongons3 NUMA

On November 8, 2018 6:52:17 PM GMT+02:00, Thomas Bogendoerfer wrote:
> On Thu, 8 Nov 2018 18:18:23 +0200 Mike Rapoport wrote:
>
> > On Thu, Nov 08, 2018 at 03:44:28PM +0100, Thomas Bogendoerfer wrote:
> > > Commit bcec54bf3118 ("mips: switch to NO_BOOTMEM") broke SGI-IP27
> > > and NUMA enabled loongson3 by doing memblock_set_current_limit()
> > > before max_low_pfn has been evaluated. Both platforms need to do the
> > > memblock_set_current_limit() in platform specific code. For
> > > consistency the call to memblock_set_current_limit() is moved
> > > to the common bootmem_init(), where max_low_pfn is calculated
> > > for non NUMA enabled platforms.
> > [..]
> >
> > As for SGI-IP27, the initialization of max_low_pfn as late as in
> > paging_init() seems to be broken, because its value is used in
> > arch_mem_init() and in finalize_initrd() anyway.
>
> well, the patch is tested on real hardware and the first caller of
> a memblock_alloc* function is in a function called by
> free_area_init_nodes().

Then, apparently, I've missed something else. The Onyx2 I worked on has been dead for a couple of years now ;-)

> > AFAIU, both platforms set max_low_pfn to the last available pfn, so it
> > seems we can simply do
> >
> > 	max_low_pfn = PFN_PHYS(memblock_end_of_DRAM())

Should have been PHYS_PFN, sorry.

> > in the prom_meminit() function for both platforms and drop the loop
> > evaluating max_low_pfn in paging_init().
>
> sounds like a better plan. I'll prepare a new patch.
>
> Thomas.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH] cacheinfo: Keep the old value if of_property_read_u32 fails
Hi, Sudeep,

I use MIPS, and there is no "size" file in /sys/devices/system/cpu/cpuX/cache/indexX/ after your patch, because the DT node only has "next-level-cache = <>;" and has no "size" information.

Huacai

-- Original --
From: "Sudeep Holla"
Date: Thu, Oct 18, 2018 05:15 PM
To: "Huacai Chen"
Cc: "Greg Kroah-Hartman"; "Rafael J . Wysocki"; "LKML"; "zhangfx"; "wuzhangjin"; "Sudeep Holla"
Subject: Re: [PATCH] cacheinfo: Keep the old value if of_property_read_u32 fails

Hi Huacai,

On Thu, Oct 18, 2018 at 09:28:11AM +0800, Huacai Chen wrote:
> Hi, Sudeep,
>
> Please see this call-graph:
>
> static int detect_cache_attributes(unsigned int cpu)
>     --> ret = populate_cache_leaves(cpu);
>     --> ret = cache_shared_cpu_map_setup(cpu);
>         --> cache_setup_of_node() --> cache_of_set_props() -->
>             cache_size()/cache_nr_sets()
>
> populate_cache_leaves() is arch-specific. All archs except arm64
> provide initial values for cacheinfo::size and cacheinfo::number_of_sets.
>
> Related files:
> arch/nds32/kernel/cacheinfo.c
> arch/mips/kernel/cacheinfo.c

Only the above 2 could be affected, but I want to be sure. I wasn't aware of the MIPS arch setting values elsewhere; I assumed DT, my bad.

You have still not answered my question: which arch are you facing the issue on? Or are you proposing this by code inspection? The changes look fine, but I want to be sure the issue you are seeing is in the above architectures.

-- 
Regards,
Sudeep
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi, Paul,

SFB can improve memory bandwidth by as much as 30%, and we are planning to enable SFB by default. So we want to control cpu_relax() under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT.

Huacai

-- Original --
From: "Paul Burton"
Date: Fri, Jul 20, 2018 05:15 AM
To: "Huacai Chen"
Cc: "Ralf Baechle"; "James Hogan"; "linux-mips"; "Fuxin Zhang"; "wuzhangjin"; "stable"; "Alan Stern"; "Andrea Parri"; "Will Deacon"; "Peter Zijlstra"; "Boqun Feng"; "Nicholas Piggin"; "David Howells"; "Jade Alglave"; "Luc Maranget"; "Paul E. McKenney"; "Akira Yokosawa"; "LKML"
Subject: Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

Hi Huacai,

On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote:
> >> diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> >> index af34afb..a8c4a3a 100644
> >> --- a/arch/mips/include/asm/processor.h
> >> +++ b/arch/mips/include/asm/processor.h
> >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >>
> >> +#ifdef CONFIG_CPU_LOONGSON3
> >> +/*
> >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, force
> >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
> >> + * become available as expected.
> >> + */
> >
> > I think "may starve writes" or "may queue writes indefinitely" would be
> > clearer than "may get starved".
>
> Do I need to change the comment and resend? Or will you change the comment
> and merge it?

I'm happy to fix up the comment - but have a couple more questions.
Looking into the history, would it be fair to say that this is only a problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT"), when CONFIG_LOONGSON3_ENHANCEMENT=y, which adds code to enable the SFB? If so, would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select the use of smp_mb()?

How much performance gain does enabling the SFB give you? Would it be reasonable to just disable it, rather than using this workaround?

Thanks,
    Paul
Re: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Hi, Greg,

I have resent the patches and cc'ed the stable list. This patch is needed for 4.4.y but needs restructuring, as Paul said:

    Strictly speaking the faulty MIPS implementation can be traced further
    back to commit 856839b76836 ("MIPS: Add arch_trigger_all_cpu_backtrace()
    function") in v3.19

Huacai

-- Original --
From: "Greg Kroah-Hartman"
Date: Tue, Jul 17, 2018 03:20 PM
To: "陈华才"
Cc: "linux-kernel"; "stable"; "Paul Burton"; "James Hogan"
Subject: Re: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()

On Tue, Jul 17, 2018 at 02:53:38PM +0800, 陈华才 wrote:
> Hi, Greg,
>
> I have backported and tested it to 4.9 and 4.4:
> https://patchwork.linux-mips.org/patch/19862/
> https://patchwork.linux-mips.org/patch/19863/

I can't do anything with random web links. Please properly cc: the stable list with the backports and I can take them from there.

And are you sure they are needed for 4.4.y?

thanks,

greg k-h
Re: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Hi, Greg, I have backported and tested it to 4.9 and 4.4: https://patchwork.linux-mips.org/patch/19862/ https://patchwork.linux-mips.org/patch/19863/ Huacai -- Original -- From: "Greg Kroah-Hartman"; Date: Tue, Jul 17, 2018 02:34 AM To: "陈华才"; Cc: "linux-kernel"; "stable"; "Paul Burton"; "James Hogan"; Subject: Re: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace() On Mon, Jul 16, 2018 at 12:46:21PM +0200, Greg Kroah-Hartman wrote: > On Mon, Jul 16, 2018 at 05:46:12PM +0800, 陈华才 wrote: > > Just change "call_single_data_t" to "struct call_single_data" and > > everything is OK. > > Ok, I've done that now, thanks. And I messed it up. So I've just dropped it entirely. Can someone send me a properly backported patch please? thanks, greg k-h
Re: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Just change "call_single_data_t" to "struct call_single_data" and everything is OK. Huacai -- Original -- From: "Greg Kroah-Hartman"; Date: Mon, Jul 16, 2018 05:40 PM To: "陈华才"; Cc: "linux-kernel"; "stable"; "Paul Burton"; "James Hogan"; "Ralf Baechle"; "linux-mips"; Subject: Re: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace() On Mon, Jul 16, 2018 at 05:29:05PM +0800, 陈华才 wrote: > Hi, Greg, > > kernel-4.9 doesn't have call_single_data_t, we should use struct > call_single_data instead. Can you send me a patch to merge with this one with that change so that I know I get it right? thanks, greg k-h
Re: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Hi, Greg, kernel-4.9 doesn't have call_single_data_t, we should use struct call_single_data instead. Huacai -- Original -- From: "Greg Kroah-Hartman"; Date: Mon, Jul 16, 2018 03:36 PM To: "linux-kernel"; Cc: "Greg Kroah-Hartman"; "stable"; "Paul Burton"; "James Hogan"; "Ralf Baechle"; "Huacai Chen"; "linux-mips"; Subject: [PATCH 4.9 03/32] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace() 4.9-stable review patch. If anyone has any objections, please let me know. -- From: Paul Burton commit b63e132b6433a41cf311e8bc382d33fd2b73b505 upstream. The current MIPS implementation of arch_trigger_cpumask_backtrace() is broken because it attempts to use synchronous IPIs despite the fact that it may be run with interrupts disabled. This means that when arch_trigger_cpumask_backtrace() is invoked, for example by the RCU CPU stall watchdog, we may: - Deadlock due to use of synchronous IPIs with interrupts disabled, causing the CPU that's attempting to generate the backtrace output to hang itself. - Not succeed in generating the desired output from remote CPUs. 
- Produce warnings about this from smp_call_function_many(), for example: [42760.526910] INFO: rcu_sched detected stalls on CPUs/tasks: [42760.535755] 0-...!: (1 GPs behind) idle=ade/140/0 softirq=526944/526945 fqs=0 [42760.547874] 1-...!: (0 ticks this GP) idle=e4a/140/0 softirq=547885/547885 fqs=0 [42760.559869] (detected by 2, t=2162 jiffies, g=266689, c=266688, q=33) [42760.568927] [ cut here ] [42760.576146] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:416 smp_call_function_many+0x88/0x20c [42760.587839] Modules linked in: [42760.593152] CPU: 2 PID: 1216 Comm: sh Not tainted 4.15.4-00373-gee058bb4d0c2 #2 [42760.603767] Stack : 8e09bd20 8e09bd20 8e09bd20 fff0 0007 0006 8e09bca8 [42760.616937] 95b2b379 95b2b379 807a0080 0007 81944518 018a 0032 [42760.630095] 0030 8000 806eca74 0009 8017e2b8 01a0 [42760.643169] 0002 8e09baa4 0008 808b8008 86d69080 8e09bca0 [42760.656282] 8e09ad50 805e20aa 8017e2b8 0009 801070ca [42760.669424] ... [42760.673919] Call Trace: [42760.678672] [<27fde568>] show_stack+0x70/0xf0 [42760.685417] [<84751641>] dump_stack+0xaa/0xd0 [42760.692188] [<699d671c>] __warn+0x80/0x92 [42760.698549] [<68915d41>] warn_slowpath_null+0x28/0x36 [42760.705912] [] smp_call_function_many+0x88/0x20c [42760.713696] [<6bbdfc2a>] arch_trigger_cpumask_backtrace+0x30/0x4a [42760.722216] [] rcu_dump_cpu_stacks+0x6a/0x98 [42760.729580] [<796e7629>] rcu_check_callbacks+0x672/0x6ac [42760.737476] [<059b3b43>] update_process_times+0x18/0x34 [42760.744981] [<6eb94941>] tick_sched_handle.isra.5+0x26/0x38 [42760.752793] [<478d3d70>] tick_sched_timer+0x1c/0x50 [42760.759882] [] __hrtimer_run_queues+0xc6/0x226 [42760.767418] [] hrtimer_interrupt+0x88/0x19a [42760.775031] [<6765a19e>] gic_compare_interrupt+0x2e/0x3a [42760.782761] [<0558bf5f>] handle_percpu_devid_irq+0x78/0x168 [42760.790795] [<90c11ba2>] generic_handle_irq+0x1e/0x2c [42760.798117] [<1b6d462c>] gic_handle_local_int+0x38/0x86 [42760.805545] [] gic_irq_dispatch+0xa/0x14 [42760.812534] [<90c11ba2>] 
generic_handle_irq+0x1e/0x2c [42760.820086] [] do_IRQ+0x16/0x20 [42760.826274] [<9aef3ce6>] plat_irq_dispatch+0x62/0x94 [42760.833458] [<6a94b53c>] except_vec_vi_end+0x70/0x78 [42760.840655] [<22284043>] smp_call_function_many+0x1ba/0x20c [42760.848501] [<54022b58>] smp_call_function+0x1e/0x2c [42760.855693] [] flush_tlb_mm+0x2a/0x98 [42760.862730] [<0844cdd0>] tlb_flush_mmu+0x1c/0x44 [42760.869628] [] arch_tlb_finish_mmu+0x26/0x3e [42760.877021] [<1aeaaf74>] tlb_finish_mmu+0x18/0x66 [42760.883907] [] exit_mmap+0x76/0xea [42760.890428] [] mmput+0x80/0x11a [42760.896632] [] do_exit+0x1f4/0x80c [42760.903158] [] do_group_exit+0x20/0x7e [42760.909990] [<13fa8d54>] __wake_up_parent+0x0/0x1e [42760.917045] [<46cf89d0>] smp_call_function_many+0x1a2/0x20c [42760.924893] [<8c21a93b>] syscall_common+0x14/0x1c [42760.931765] ---[ end trace 02aa09da9dc52a60 ]--- [42760.938342] [ cut here ] [42760.945311] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:291 smp_call_function_single+0xee/0xf8 ... This patch switches MIPS' arch_trigger_cpumask_backtrace() to use async IPIs & smp_call_function_single_async() in order to resolve this problem. We ensure use of the pre-allocated call_single_data_t structures is serialized by maintaining a cpumask indicating that they're busy, and refusing to attempt to send an IPI when a CPU's bit is set in this mask. This should only happen if
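The serialization described in the last paragraph — one pre-allocated per-CPU slot guarded by a "busy" cpumask, with the IPI skipped while the bit is still set — might be modeled in portable C11 roughly like this (hypothetical names and a simplified 64-CPU bitmask, not the kernel code):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One "busy" bit per CPU; a set bit means that CPU's pre-allocated
 * call-single-data slot is still in flight from a previous backtrace
 * IPI, so we must not reuse it. (Supports up to 64 CPUs here.) */
static atomic_ulong backtrace_busy;

bool try_claim_csd(int cpu)
{
	unsigned long bit = 1UL << cpu;

	/* Atomically set the bit; the old value tells us whether some
	 * earlier IPI to this CPU has not finished yet. */
	return !(atomic_fetch_or(&backtrace_busy, bit) & bit);
}

void release_csd(int cpu)
{
	/* Called once the target CPU has dumped its backtrace. */
	atomic_fetch_and(&backtrace_busy, ~(1UL << cpu));
}
```

A caller would attempt `try_claim_csd()` before sending the async IPI and simply skip the CPU when it returns false, which is exactly the "refusing to attempt to send an IPI" behavior the commit message describes.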
Re: [PATCH V2] MIPS: implement smp_cond_load_acquire() for Loongson-3
Hi, Peter, I'm afraid you have missed something. Firstly, our previous conclusion (READ_ONCE() needs a barrier to avoid 'reads prioritised over writes') is totally wrong. So defining cpu_relax() as smp_mb() like ARM11MPCore is incorrect, even if it can 'solve' Loongson's problem. Secondly, I think the real problem is like this: 1, CPU0 sets the lock to 0, then does something; 2, While CPU0 is doing something, CPU1 sets the flag to 1 with WRITE_ONCE(), and then waits for the lock to become 1 with a READ_ONCE() loop; 3, After CPU0 completes its work, it waits for the flag to become 1, and then sets the lock to 1; 4, When the lock becomes 1, CPU1 leaves the READ_ONCE() loop. Without SFB, everything is OK. But with SFB, in step 2 the READ_ONCE() loop comes right after the WRITE_ONCE(), which leaves the flag cached in the SFB (and thus invisible to other CPUs) forever, so both CPU0 and CPU1 wait forever. I don't think this is a hardware bug; by design, the SFB is flushed to the L1 cache in three cases: 1, the data in the SFB is full (forms a complete cache line); 2, there is a subsequent read access in the same cache line; 3, a 'sync' instruction is executed. In this case, there is no other memory access (read or write) between the WRITE_ONCE() and the READ_ONCE() loop, so Case 1 and Case 2 will not happen, and the only way to make the flag visible is wbflush (wbflush is sync in Loongson's case). I think this problem not only happens on Loongson, but will happen on other CPUs which have a write buffer (unless the write buffer has a 4th case to be flushed). Huacai -- Original -- From: "Peter Zijlstra"; Date: Tue, Jul 10, 2018 06:54 PM To: "Huacai Chen"; Cc: "Paul Burton"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "Fuxin Zhang"; "wuzhangjin"; "stable"; "Alan Stern"; "Andrea Parri"; "Will Deacon"; "Boqun Feng"; "Nicholas Piggin"; "David Howells"; "Jade Alglave"; "Luc Maranget"; "Paul E.
McKenney"; "Akira Yokosawa"; "LKML"; Subject: Re: [PATCH V2] MIPS: implement smp_cond_load_acquire() for Loongson-3 On Tue, Jul 10, 2018 at 11:36:37AM +0200, Peter Zijlstra wrote: > So now explain why the cpu_relax() hack that arm did doesn't work for > you? So below is the patch I think you want; if not explain in detail how this is wrong. diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h index af34afbc32d9..e59773de6528 100644 --- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) +#ifdef CONFIG_CPU_LOONGSON3 +/* + * Loongson-3 has a CPU bug where the store buffer gets starved when stuck in a + * read loop. Since spin loops of any kind should have a cpu_relax() in them, + * force a store-buffer flush from cpu_relax() such that any pending writes + * will become available as expected. + */ +#define cpu_relax() smp_mb() +#else #define cpu_relax() barrier() +#endif /* * Return_address is a replacement for __builtin_return_address(count)
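Huacai's four-step flag/lock handshake from the reply above can be written out as a small C11 program (a hypothetical sketch with invented names; with seq_cst atomics the handshake always completes, and the SFB failure mode described in the mail corresponds to the flag store staying buffered so step 3 never observes it):

```c
#include <stdatomic.h>
#include <pthread.h>

static atomic_int flag, lock;

/* CPU1's side: publish the flag, then spin until CPU0 raises the lock. */
static void *cpu1_side(void *arg)
{
	(void)arg;
	atomic_store(&flag, 1);            /* step 2: WRITE_ONCE(flag, 1) */
	while (atomic_load(&lock) == 0)    /* step 2: READ_ONCE() loop    */
		;
	return NULL;                       /* step 4: saw lock == 1       */
}

/* CPU0's side, driving the whole handshake. */
int run_handshake(void)
{
	pthread_t t;

	atomic_store(&lock, 0);            /* step 1: clear the lock      */
	if (pthread_create(&t, NULL, cpu1_side, NULL) != 0)
		return -1;
	while (atomic_load(&flag) == 0)    /* step 3: wait for the flag   */
		;
	atomic_store(&lock, 1);            /* step 3: then raise the lock */
	pthread_join(t, NULL);
	return 1;
}
```

Note that CPU1 performs no read in the flag's cache line and no sync between its store and its read loop, which is exactly why none of the three SFB flush conditions listed in the mail would fire.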
Re: [PATCH] MIPS: implement smp_cond_load_acquire() for Loongson-3
Hi, Peter and Paul, Loongson-3's Store Fill Buffer is nearly the same as your "Store Buffer", and it increases the memory ordering weakness. So, smp_cond_load_acquire() only needs a __smp_mb() before the loop, not after every READ_ONCE(). In other words, the following code is just OK: #define smp_cond_load_acquire(ptr, cond_expr) \ ({ \ typeof(ptr) __PTR = (ptr); \ typeof(*ptr) VAL; \ __smp_mb(); \ for (;;) { \ VAL = READ_ONCE(*__PTR); \ if (cond_expr) \ break; \ cpu_relax(); \ } \ __smp_mb(); \ VAL; \ }) The __smp_mb() before the loop is used to avoid "reads prioritised over writes", which is caused by the SFB's weak ordering and is similar to ARM11MPCore (mentioned by Will Deacon). Huacai -- Original -- From: "Peter Zijlstra"; Date: Tue, Jun 19, 2018 03:22 PM To: "陈华才"; Cc: "Paul Burton"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "Fuxin Zhang"; "wuzhangjin"; "Huacai Chen"; "stable"; "Alan Stern"; "Andrea Parri"; "Will Deacon"; "Boqun Feng"; "Nicholas Piggin"; "David Howells"; "Jade Alglave"; "Luc Maranget"; "Paul E. McKenney"; "Akira Yokosawa"; "linux-kernel"; Subject: Re: [PATCH] MIPS: implement smp_cond_load_acquire() for Loongson-3 On Tue, Jun 19, 2018 at 02:40:14PM +0800, 陈华才 wrote: > Hi, Paul, > > First of all, could you please check why linux-mips reject e-mails > from lemote.com? Of course I can send e-mails by gmail, but my gmail > can't receive e-mails from linux-mips since March, 2018. Could you please learn to use email? No top posting and wrap lines at 78 chars. > I have already read Documentation/memory-barriers.txt, but I don't > think we should define a smp_read_barrier_depends() for Loongson-3. > Because Loongson-3's behavior isn't like Alpha, and in syntax, this is > not a data-dependent issue. Agreed, this is not a data-dependency issue. > There is no document about Loongson-3's SFB.
In my opinion, SFB looks > like the L0 cache but sometimes it is out of cache-coherent mechanism > (L1 cache's cross-core coherency is maintained by hardware, but not > always true for SFB). smp_mb() is needed for smp_cond_load_acquire(), > but not every READ_ONCE(). Linux does _NOT_ support non-coherent SMP. If your system is not fully coherent, you're out of luck. But please, explain in excruciating detail what exactly you need that smp_mb for. If, like I posited in my previous email, it is to ensure remote store buffer flushes, then your machine is terminally broken.
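For comparison, the barrier-before-the-loop variant Huacai proposes could be modeled with C11 atomics and GCC/Clang statement expressions like this (a hypothetical sketch for illustration, not the kernel macro):

```c
#include <stdatomic.h>

/* Barrier-before-the-loop variant: a full fence first, so any store
 * buffered by this CPU is flushed before the read loop starts, and an
 * acquire fence afterwards for the usual acquire semantics. Uses the
 * GNU statement-expression extension, like the kernel macro it mimics. */
#define smp_cond_load_acquire_sketch(ptr, cond_expr)              \
({                                                                \
	__auto_type __PTR = (ptr);                                \
	__typeof__(atomic_load(__PTR)) VAL;                       \
	atomic_thread_fence(memory_order_seq_cst); /* pre-loop */ \
	for (;;) {                                                \
		VAL = atomic_load_explicit(__PTR,                 \
					   memory_order_relaxed); \
		if (cond_expr)                                    \
			break;                                    \
	}                                                         \
	atomic_thread_fence(memory_order_acquire); /* post-loop */ \
	VAL;                                                      \
})
```

As in the kernel version, `cond_expr` is evaluated with the freshly loaded value bound to `VAL`, so a caller writes e.g. `smp_cond_load_acquire_sketch(&x, VAL != 0)`.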
Re: [PATCH] MIPS: implement smp_cond_load_acquire() for Loongson-3
Hi, Paul, First of all, could you please check why linux-mips rejects e-mails from lemote.com? Of course I can send e-mails by gmail, but my gmail can't receive e-mails from linux-mips since March, 2018. I have already read Documentation/memory-barriers.txt, but I don't think we should define a smp_read_barrier_depends() for Loongson-3, because Loongson-3's behavior isn't like Alpha's, and syntactically this is not a data-dependency issue. There is no document about Loongson-3's SFB. In my opinion, the SFB looks like an L0 cache, but sometimes it is outside the cache-coherence mechanism (the L1 cache's cross-core coherency is maintained by hardware, but that is not always true for the SFB). smp_mb() is needed for smp_cond_load_acquire(), but not after every READ_ONCE(). Huacai -- Original -- From: "Paul Burton"; Date: Tue, Jun 19, 2018 02:51 AM To: "Huacai Chen"; Cc: "Ralf Baechle"; "James Hogan"; "linux-mips"; "Fuxin Zhang"; "wuzhangjin"; "Huacai Chen"; "stable"; "Alan Stern"; "Andrea Parri"; "Will Deacon"; "Peter Zijlstra"; "Boqun Feng"; "Nicholas Piggin"; "David Howells"; "Jade Alglave"; "Luc Maranget"; "Paul E. McKenney"; "Akira Yokosawa"; "linux-kernel"; Subject: Re: [PATCH] MIPS: implement smp_cond_load_acquire() for Loongson-3 Hi Huacai, On Fri, Jun 15, 2018 at 02:07:38PM +0800, Huacai Chen wrote: > After commit 7f56b58a92aaf2c ("locking/mcs: Use smp_cond_load_acquire() > in MCS spin loop") Loongson-3 fails to boot. This is because Loongson-3 > has SFB (Store Fill Buffer) and READ_ONCE() may get an old value in a > tight loop. So in smp_cond_load_acquire() we need a __smp_mb() after > every READ_ONCE(). Thanks - modifying smp_cond_load_acquire() is a step better than modifying arch_mcs_spin_lock_contended() to avoid it, but I'm still not sure we've reached the root of the problem. If tight loops using READ_ONCE() are at fault then what's special about smp_cond_load_acquire()? Could other such loops not hit the same problem?
Is the scenario you encounter the same as that outlined in the "DATA DEPENDENCY BARRIERS (HISTORICAL)" section of Documentation/memory-barriers.txt by any chance? If so then perhaps it would be better to implement smp_read_barrier_depends(), or just raw read_barrier_depends() depending upon how your SFB functions. Is there any public documentation describing the behaviour of the store fill buffer in Loongson-3? Part of the problem is that I'm still not sure what's actually happening in your system - it would be helpful to have further information in the commit message about what actually happens. For example if you could walk us through an example of the problem step by step in the style of the diagrams you'll see in Documentation/memory-barriers.txt then I think that would help us to see what the best solution here is. I've copied the LKMM maintainers in case they have further input. Thanks, Paul > This patch introduce a Loongson-specific smp_cond_load_acquire(). And > it should be backported to as early as linux-4.5, in which release the > smp_cond_acquire() is introduced. 
> > Cc: sta...@vger.kernel.org > Signed-off-by: Huacai Chen > --- > arch/mips/include/asm/barrier.h | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h > index a5eb1bb..4ea384d 100644 > --- a/arch/mips/include/asm/barrier.h > +++ b/arch/mips/include/asm/barrier.h > @@ -222,6 +222,23 @@ > #define __smp_mb__before_atomic() __smp_mb__before_llsc() > #define __smp_mb__after_atomic() smp_llsc_mb() > > +#ifdef CONFIG_CPU_LOONGSON3 > +/* Loongson-3 need a __smp_mb() after READ_ONCE() here */ > +#define smp_cond_load_acquire(ptr, cond_expr)\ > +({ \ > + typeof(ptr) __PTR = (ptr); \ > + typeof(*ptr) VAL; \ > + for (;;) { \ > + VAL = READ_ONCE(*__PTR); \ > + __smp_mb(); \ > + if (cond_expr) \ > + break; \ > + cpu_relax(); \ > + } \ > + VAL; \ > +}) > +#endif /* CONFIG_CPU_LOONGSON3 */ > + > #include > > #endif /* __ASM_BARRIER_H */ > -- > 2.7.0 >
Re: Regression with arm in next with stack protector
Allwinner devices are OK, Exynos devices (except 4210) are OK. -- Original -- From: "Tony Lindgren"<t...@atomide.com>; Date: Mon, Mar 26, 2018 11:37 PM To: "陈华才"<che...@lemote.com>; Cc: "Andrew Morton"<a...@linux-foundation.org>; "Fabio Estevam"<feste...@gmail.com>; "Stephen Rothwell"<s...@canb.auug.org.au>; "Rich Felker"<dal...@libc.org>; "Russell King"<li...@arm.linux.org.uk>; "Yoshinori Sato"<ys...@users.sourceforge.jp>; "linux-kernel"<linux-kernel@vger.kernel.org>; "Ralf Baechle"<r...@linux-mips.org>; "linux-omap"<linux-o...@vger.kernel.org>; "James Hogan"<james.ho...@mips.com>; "moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE"<linux-arm-ker...@lists.infradead.org>; Subject: Re: Regression with arm in next with stack protector * 陈华才 <che...@lemote.com> [180326 06:59]: > Hi, Tony and Fabio, > > Could you please upload your kernel binary to somewhere for me? I don't > understand why some ARM boards is OK while others are broken. Well the kernel I'm testing is just current Linux next cross compiled omap2plus_defconfig kernel. I do have few more config options enabled like LOCKDEP and DEBUG_ATOMIC_SLEEP, but I doubt they matter here :) Then I'm using gcc-7.3.0 and binutils-2.30 built with the buildall.git scripts: git://git.infradead.org/users/segher/buildall.git If you still need binaries, let me know. Do you know which arm devices are working with your patch? Regards, Tony
Re: Regression with arm in next with stack protector
Hi, Tony and Fabio,

Could you please upload your kernel binary somewhere for me? I don't understand why some ARM boards are OK while others are broken.

Huacai

-- Original --
From: "Andrew Morton"; Date: Sat, Mar 24, 2018 03:15 AM
To: "Fabio Estevam" ; Cc: "Tony Lindgren" ; "Huacai Chen" ; "Stephen Rothwell" ; "Rich Felker" ; "Russell King" ; "Yoshinori Sato" ; "linux-kernel" ; "Ralf Baechle" ; "linux-omap" ; "James Hogan" ; "moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE" ;
Subject: Re: Regression with arm in next with stack protector

On Fri, 23 Mar 2018 15:45:27 -0300 Fabio Estevam wrote:
> On Fri, Mar 23, 2018 at 3:14 PM, Tony Lindgren wrote:
> > Hi,
> >
> > Looks like commit 5638790dadae ("zboot: fix stack protector in
> > compressed boot phase") breaks booting on arm.
>
> Yes, almost all imx are broken in linux-next due to this commit:
> https://kernelci.org/soc/imx/job/next/kernel/next-20180323/

Thanks, I dropped it and I expect Stephen will also do that.
Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()
But in b44_init(), there are no device instances.

-- Original --
From: "Christoph Hellwig"; Date: Fri, Nov 10, 2017 08:30 PM
To: "Huacai Chen" ; Cc: "Christoph Hellwig" ; "Marek Szyprowski" ; "Robin Murphy" ; "Andrew Morton" ; "Fuxin Zhang" ; "linux-kernel" ; "Ralf Baechle" ; "James Hogan" ; "linux-mips" ; "James E . J . Bottomley" ; "Martin K . Petersen" ; "linux-scsi" ; "stable" ; "Michael S . Tsirkin" ; "Pawel Osciak" ; "Kyungmin Park" ; "Michael Chan" ; "Benjamin Herrenschmidt" ; "Ivan Mikhaylov" ; "Tariq Toukan" ; "Andy Gross" ; "Mark A . Greer" ; "Robert Baldyga" ;
Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()

> diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
> index a1125d1..2f6ffe5 100644
> --- a/drivers/net/ethernet/broadcom/b44.c
> +++ b/drivers/net/ethernet/broadcom/b44.c
> @@ -2344,6 +2344,10 @@ static int b44_init_one(struct ssb_device *sdev,
> 	struct net_device *dev;
> 	struct b44 *bp;
> 	int err;
> +	unsigned int dma_desc_align_size = dma_get_cache_alignment(sdev->dma_dev);
> +
> +	/* Setup paramaters for syncing RX/TX DMA descriptors */
> +	dma_desc_sync_size = max_t(unsigned int, dma_desc_align_size, sizeof(struct dma_desc));
>
> 	instance++;
>
> @@ -2587,12 +2591,8 @@ static inline void b44_pci_exit(void)
>
>  static int __init b44_init(void)
>  {
> -	unsigned int dma_desc_align_size = dma_get_cache_alignment();
> 	int err;
>
> -	/* Setup paramaters for syncing RX/TX DMA descriptors */
> -	dma_desc_sync_size = max_t(unsigned int, dma_desc_align_size, sizeof(struct dma_desc));
> -

This looks wrong - you override a global variable for each probed device.
Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()
Only patch 4 can be merged to stable, please ignore cc-stable in the rest. -- Original -- From: "Christoph Hellwig"<h...@lst.de>; Date: Fri, Nov 3, 2017 01:14 PM To: "陈华才"<che...@lemote.com>; Cc: "Marek Szyprowski"<m.szyprow...@samsung.com>; "Christoph Hellwig"<h...@lst.de>; "Robin Murphy"<robin.mur...@arm.com>; "Andrew Morton"<a...@linux-foundation.org>; "Fuxin Zhang"<zhan...@lemote.com>; "linux-kernel"<linux-kernel@vger.kernel.org>; "Ralf Baechle"<r...@linux-mips.org>; "JamesHogan"<james.ho...@imgtec.com>; "linux-mips"<linux-m...@linux-mips.org>; "James E . J .Bottomley"<j...@linux.vnet.ibm.com>; "Martin K . Petersen"<martin.peter...@oracle.com>; "linux-scsi"<linux-s...@vger.kernel.org>; "stable"<sta...@vger.kernel.org>; "Michael S . Tsirkin"<m...@redhat.com>; "Pawel Osciak"<pa...@osciak.com>; "Kyungmin Park"<kyungmin.p...@samsung.com>; "Michael Chan"<michael.c...@broadcom.com>; "Benjamin Herrenschmidt"<b...@kernel.crashing.org>; "Ivan Mikhaylov"<i...@ru.ibm.com>; "Tariq Toukan"<tar...@mellanox.com>; "Andy Gross"<agr...@codeaurora.org>; "Mark A . Greer"<mgr...@animalcreek.com>; "RobertBaldyga"<r.bald...@hackerion.com>; Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment() I can queue 1 up in the dma-mapping tree, and if I get reviews for the mips and scsi bits I'd be happy to queue those up as well. But I think you'd be better off moving patches 3 and 4 to the front without the dma_get_cache_alignment prototype change so that they can be merged to stable.
Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()
Why this is still un-merged? Should I remove the cc-stable and resend this series? Huacai -- Original -- From: "陈华才"<che...@lemote.com>; Date: Thu, Oct 26, 2017 02:33 PM To: "Marek Szyprowski"<m.szyprow...@samsung.com>; "Christoph Hellwig"<h...@lst.de>; Cc: "Robin Murphy"<robin.mur...@arm.com>; "Andrew Morton"<a...@linux-foundation.org>; "Fuxin Zhang"<zhan...@lemote.com>; "linux-kernel"<linux-kernel@vger.kernel.org>; "Ralf Baechle"<r...@linux-mips.org>; "JamesHogan"<james.ho...@imgtec.com>; "linux-mips"<linux-m...@linux-mips.org>; "James E . J .Bottomley"<j...@linux.vnet.ibm.com>; "Martin K . Petersen"<martin.peter...@oracle.com>; "linux-scsi"<linux-s...@vger.kernel.org>; "stable"<sta...@vger.kernel.org>; "Michael S . Tsirkin"<m...@redhat.com>; "Pawel Osciak"<pa...@osciak.com>; "Kyungmin Park"<kyungmin.p...@samsung.com>; "Michael Chan"<michael.c...@broadcom.com>; "Benjamin Herrenschmidt"<b...@kernel.crashing.org>; "Ivan Mikhaylov"<i...@ru.ibm.com>; "Tariq Toukan"<tar...@mellanox.com>; "Andy Gross"<agr...@codeaurora.org>; "Mark A . Greer"<mgr...@animalcreek.com>; "RobertBaldyga"<r.bald...@hackerion.com>; Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment() Maybe my first version is suitable for stable. Huacai -- Original -- From: "Marek Szyprowski"<m.szyprow...@samsung.com>; Date: Wed, Oct 25, 2017 03:21 PM To: "陈华才"<che...@lemote.com>; "Christoph Hellwig"<h...@lst.de>; Cc: "Robin Murphy"<robin.mur...@arm.com>; "Andrew Morton"<a...@linux-foundation.org>; "Fuxin Zhang"<zhan...@lemote.com>; "linux-kernel"<linux-kernel@vger.kernel.org>; "Ralf Baechle"<r...@linux-mips.org>; "JamesHogan"<james.ho...@imgtec.com>; "linux-mips"<linux-m...@linux-mips.org>; "James E . J .Bottomley"<j...@linux.vnet.ibm.com>; "Martin K . Petersen"<martin.peter...@oracle.com>; "linux-scsi"<linux-s...@vger.kernel.org>; "stable"<sta...@vger.kernel.org>; "Michael S . 
Tsirkin"<m...@redhat.com>; "Pawel Osciak"<pa...@osciak.com>; "Kyungmin Park"<kyungmin.p...@samsung.com>; "Michael Chan"<michael.c...@broadcom.com>; "Benjamin Herrenschmidt"<b...@kernel.crashing.org>; "Ivan Mikhaylov"<i...@ru.ibm.com>; "Tariq Toukan"<tar...@mellanox.com>; "Andy Gross"<agr...@codeaurora.org>; "Mark A . Greer"<mgr...@animalcreek.com>; "RobertBaldyga"<r.bald...@hackerion.com>; Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment() Hi Huacai, On 2017-10-25 03:22, 陈华才 wrote: > Hi, Marek > > Patch3 is needed for stable, but Patch3 depend on Patch1 and Patch2. Then patch #3 has to be reworked. First change scsi to align the block queue to dma_get_cache_alignment(). This will be safe in all cases and it will not hurt memory usage that much. Such version can be applied first and sent to stable without any dependencies. Please also describe deeply why such change is needed and what issues can be observed without it, on which systems. Then as an optimization add support for per-device cache_alignment (patches #1 and #2). I'm still not convinced that it makes sense to align DMA structures to values less than L1 cache line size. It might hurt performance, because cache coherency has its cost and it is also relevant to multi-core/smp access to any objects that are in the same l1 cache line. Memory savings that might be the results of such lower alignment are probably negligible. > > Huacai > > > -- Original -- > From: "Marek Szyprowski"<m.szyprow...@samsung.com>; > Date: Tue, Oct 24, 2017 09:30 PM > To: "Huacai Chen"<che...@lemote.com>; "Christoph Hellwig"<h...@lst.de>; > Cc: "Robin Murphy"<robin.mur...@arm.com>; "Andrew > Morton"<a...@linux-foundation.org>; "Fuxin Zhang"<zhan...@lemote.com>; > "linux-kernel"<linux-kernel@vger.kernel.org>; "Ralf > Baechle"<r...@linux-mips.org>; "JamesHogan"<james.ho...@imgtec.com>; > "linux-mips"<linux-m...@linux-mips.org>; "James E . J > .Bottomley"
Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()
Maybe my first version is suitable for stable. Huacai -- Original -- From: "Marek Szyprowski"<m.szyprow...@samsung.com>; Date: Wed, Oct 25, 2017 03:21 PM To: "陈华才"<che...@lemote.com>; "Christoph Hellwig"<h...@lst.de>; Cc: "Robin Murphy"<robin.mur...@arm.com>; "Andrew Morton"<a...@linux-foundation.org>; "Fuxin Zhang"<zhan...@lemote.com>; "linux-kernel"<linux-kernel@vger.kernel.org>; "Ralf Baechle"<r...@linux-mips.org>; "JamesHogan"<james.ho...@imgtec.com>; "linux-mips"<linux-m...@linux-mips.org>; "James E . J .Bottomley"<j...@linux.vnet.ibm.com>; "Martin K . Petersen"<martin.peter...@oracle.com>; "linux-scsi"<linux-s...@vger.kernel.org>; "stable"<sta...@vger.kernel.org>; "Michael S . Tsirkin"<m...@redhat.com>; "Pawel Osciak"<pa...@osciak.com>; "Kyungmin Park"<kyungmin.p...@samsung.com>; "Michael Chan"<michael.c...@broadcom.com>; "Benjamin Herrenschmidt"<b...@kernel.crashing.org>; "Ivan Mikhaylov"<i...@ru.ibm.com>; "Tariq Toukan"<tar...@mellanox.com>; "Andy Gross"<agr...@codeaurora.org>; "Mark A . Greer"<mgr...@animalcreek.com>; "RobertBaldyga"<r.bald...@hackerion.com>; Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment() Hi Huacai, On 2017-10-25 03:22, 陈华才 wrote: > Hi, Marek > > Patch3 is needed for stable, but Patch3 depend on Patch1 and Patch2. Then patch #3 has to be reworked. First change scsi to align the block queue to dma_get_cache_alignment(). This will be safe in all cases and it will not hurt memory usage that much. Such version can be applied first and sent to stable without any dependencies. Please also describe deeply why such change is needed and what issues can be observed without it, on which systems. Then as an optimization add support for per-device cache_alignment (patches #1 and #2). I'm still not convinced that it makes sense to align DMA structures to values less than L1 cache line size. 
It might hurt performance, because cache coherency has its cost and it is also relevant to multi-core/smp access to any objects that are in the same l1 cache line. Memory savings that might be the results of such lower alignment are probably negligible. > > Huacai > > > -- Original -- > From: "Marek Szyprowski"<m.szyprow...@samsung.com>; > Date: Tue, Oct 24, 2017 09:30 PM > To: "Huacai Chen"<che...@lemote.com>; "Christoph Hellwig"<h...@lst.de>; > Cc: "Robin Murphy"<robin.mur...@arm.com>; "Andrew > Morton"<a...@linux-foundation.org>; "Fuxin Zhang"<zhan...@lemote.com>; > "linux-kernel"<linux-kernel@vger.kernel.org>; "Ralf > Baechle"<r...@linux-mips.org>; "JamesHogan"<james.ho...@imgtec.com>; > "linux-mips"<linux-m...@linux-mips.org>; "James E . J > .Bottomley"<j...@linux.vnet.ibm.com>; "Martin K . > Petersen"<martin.peter...@oracle.com>; > "linux-scsi"<linux-s...@vger.kernel.org>; "stable"<sta...@vger.kernel.org>; > "Michael S . Tsirkin"<m...@redhat.com>; "Pawel Osciak"<pa...@osciak.com>; > "Kyungmin Park"<kyungmin.p...@samsung.com>; "Michael > Chan"<michael.c...@broadcom.com>; "Benjamin > Herrenschmidt"<b...@kernel.crashing.org>; "Ivan Mikhaylov"<i...@ru.ibm.com>; > "Tariq Toukan"<tar...@mellanox.com>; "Andy Gross"<agr...@codeaurora.org>; > "Mark A . Greer"<mgr...@animalcreek.com>; "Robert > Baldyga"<r.bald...@hackerion.com>; > Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment() > > > Hi Huacai, > > On 2017-10-23 09:12, Huacai Chen wrote: >> Make dma_get_cache_alignment() to accept a 'dev' argument. As a result, >> it can return different alignments due to different devices' I/O cache >> coherency. >> >> Currently, ARM/ARM64 and MIPS support coherent & noncoherent devices >> co-exist. This may be extended in the future, so add a new function >> pointer (i.e, get_cache_alignment) in 'struct dma_map_ops' as a generic >> solution. >> >> Cc: sta...@vger.kernel.org > I don't think this change should go to stable. > >> Cc: Michael S. 
Tsirkin <m...@redhat.com> >> Cc: Pawel Osciak
Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()
Hi, Marek Patch3 is needed for stable, but Patch3 depend on Patch1 and Patch2. Huacai -- Original -- From: "Marek Szyprowski"; Date: Tue, Oct 24, 2017 09:30 PM To: "Huacai Chen" ; "Christoph Hellwig" ; Cc: "Robin Murphy" ; "Andrew Morton" ; "Fuxin Zhang" ; "linux-kernel" ; "Ralf Baechle" ; "JamesHogan" ; "linux-mips" ; "James E . J .Bottomley" ; "Martin K . Petersen" ; "linux-scsi" ; "stable" ; "Michael S . Tsirkin" ; "Pawel Osciak" ; "Kyungmin Park" ; "Michael Chan" ; "Benjamin Herrenschmidt" ; "Ivan Mikhaylov" ; "Tariq Toukan" ; "Andy Gross" ; "Mark A . Greer" ; "Robert Baldyga" ; Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment() Hi Huacai, On 2017-10-23 09:12, Huacai Chen wrote: > Make dma_get_cache_alignment() to accept a 'dev' argument. As a result, > it can return different alignments due to different devices' I/O cache > coherency. > > Currently, ARM/ARM64 and MIPS support coherent & noncoherent devices > co-exist. This may be extended in the future, so add a new function > pointer (i.e, get_cache_alignment) in 'struct dma_map_ops' as a generic > solution. > > Cc: sta...@vger.kernel.org I don't think this change should go to stable. > Cc: Michael S. Tsirkin > Cc: Pawel Osciak > Cc: Marek Szyprowski > Cc: Kyungmin Park > Cc: Michael Chan > Cc: Benjamin Herrenschmidt > Cc: Ivan Mikhaylov > Cc: Tariq Toukan > Cc: Andy Gross > Cc: Mark A. 
Greer > Cc: Robert Baldyga > Cc: Marek Szyprowski > Signed-off-by: Huacai Chen > --- > drivers/infiniband/hw/mthca/mthca_main.c | 2 +- > drivers/media/v4l2-core/videobuf2-dma-contig.c | 2 +- > drivers/net/ethernet/broadcom/b44.c| 8 +- > drivers/net/ethernet/ibm/emac/core.c | 32 +++-- > drivers/net/ethernet/ibm/emac/core.h | 14 +- > drivers/net/ethernet/mellanox/mlx4/main.c | 2 +- > drivers/spi/spi-qup.c | 4 +- > drivers/tty/serial/mpsc.c | 179 > + > drivers/tty/serial/samsung.c | 14 +- > include/linux/dma-mapping.h| 17 ++- For videobuf2-dma-contig, serial/samsung and dma-mapping.h: Acked-by: Marek Szyprowski > 10 files changed, 150 insertions(+), 124 deletions(-) > > diff --git a/drivers/infiniband/hw/mthca/mthca_main.c > b/drivers/infiniband/hw/mthca/mthca_main.c > index e36a9bc..078fe8d 100644 > --- a/drivers/infiniband/hw/mthca/mthca_main.c > +++ b/drivers/infiniband/hw/mthca/mthca_main.c > @@ -416,7 +416,7 @@ static int mthca_init_icm(struct mthca_dev *mdev, > > /* CPU writes to non-reserved MTTs, while HCA might DMA to reserved > mtts */ > mdev->limits.reserved_mtts = ALIGN(mdev->limits.reserved_mtts * > mdev->limits.mtt_seg_size, > -dma_get_cache_alignment()) / > mdev->limits.mtt_seg_size; > + > dma_get_cache_alignment(>pdev->dev)) / mdev->limits.mtt_seg_size; > > mdev->mr_table.mtt_table = mthca_alloc_icm_table(mdev, > init_hca->mtt_base, > > mdev->limits.mtt_seg_size, > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c > b/drivers/media/v4l2-core/videobuf2-dma-contig.c > index 9f389f3..1f6a9b7 100644 > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c > @@ -484,7 +484,7 @@ static void *vb2_dc_get_userptr(struct device *dev, > unsigned long vaddr, > int ret = 0; > struct sg_table *sgt; > unsigned long contig_size; > - unsigned long dma_align = dma_get_cache_alignment(); > + unsigned long dma_align = dma_get_cache_alignment(dev); > > /* Only cache aligned DMA transfers are reliable */ > if 
(!IS_ALIGNED(vaddr | size, dma_align)) { > diff --git a/drivers/net/ethernet/broadcom/b44.c > b/drivers/net/ethernet/broadcom/b44.c > index a1125d1..2f6ffe5 100644 > --- a/drivers/net/ethernet/broadcom/b44.c > +++ b/drivers/net/ethernet/broadcom/b44.c > @@ -2344,6 +2344,10 @@
Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()
Hi, Marek,

Patch3 is needed for stable, but it depends on Patch1 and Patch2.

Huacai

-- Original --
From: "Marek Szyprowski";
Date: Tue, Oct 24, 2017 09:30 PM
To: "Huacai Chen"; "Christoph Hellwig";
Cc: "Robin Murphy"; "Andrew Morton"; "Fuxin Zhang"; "linux-kernel"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "James E. J. Bottomley"; "Martin K. Petersen"; "linux-scsi"; "stable"; "Michael S. Tsirkin"; "Pawel Osciak"; "Kyungmin Park"; "Michael Chan"; "Benjamin Herrenschmidt"; "Ivan Mikhaylov"; "Tariq Toukan"; "Andy Gross"; "Mark A. Greer"; "Robert Baldyga";
Subject: Re: [PATCH V9 1/4] dma-mapping: Rework dma_get_cache_alignment()

Hi Huacai,

On 2017-10-23 09:12, Huacai Chen wrote:
> Make dma_get_cache_alignment() accept a 'dev' argument. As a result, it can return different alignments due to different devices' I/O cache coherency.
>
> Currently, ARM/ARM64 and MIPS support coherent & noncoherent devices co-existing. This may be extended in the future, so add a new function pointer (i.e., get_cache_alignment) in 'struct dma_map_ops' as a generic solution.
>
> Cc: sta...@vger.kernel.org

I don't think this change should go to stable.

> Cc: Michael S. Tsirkin > Cc: Pawel Osciak > Cc: Marek Szyprowski > Cc: Kyungmin Park > Cc: Michael Chan > Cc: Benjamin Herrenschmidt > Cc: Ivan Mikhaylov > Cc: Tariq Toukan > Cc: Andy Gross > Cc: Mark A. Greer > Cc: Robert Baldyga > Cc: Marek Szyprowski
> Signed-off-by: Huacai Chen
> ---
> drivers/infiniband/hw/mthca/mthca_main.c | 2 +-
> drivers/media/v4l2-core/videobuf2-dma-contig.c | 2 +-
> drivers/net/ethernet/broadcom/b44.c | 8 +-
> drivers/net/ethernet/ibm/emac/core.c | 32 +++--
> drivers/net/ethernet/ibm/emac/core.h | 14 +-
> drivers/net/ethernet/mellanox/mlx4/main.c | 2 +-
> drivers/spi/spi-qup.c | 4 +-
> drivers/tty/serial/mpsc.c | 179 +
> drivers/tty/serial/samsung.c | 14 +-
> include/linux/dma-mapping.h | 17 ++-

For videobuf2-dma-contig, serial/samsung and dma-mapping.h:

Acked-by: Marek Szyprowski

> 10 files changed, 150 insertions(+), 124 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
> index e36a9bc..078fe8d 100644
> --- a/drivers/infiniband/hw/mthca/mthca_main.c
> +++ b/drivers/infiniband/hw/mthca/mthca_main.c
> @@ -416,7 +416,7 @@ static int mthca_init_icm(struct mthca_dev *mdev,
>
> /* CPU writes to non-reserved MTTs, while HCA might DMA to reserved mtts */
> mdev->limits.reserved_mtts = ALIGN(mdev->limits.reserved_mtts * mdev->limits.mtt_seg_size,
> - dma_get_cache_alignment()) / mdev->limits.mtt_seg_size;
> + dma_get_cache_alignment(&mdev->pdev->dev)) / mdev->limits.mtt_seg_size;
>
> mdev->mr_table.mtt_table = mthca_alloc_icm_table(mdev, init_hca->mtt_base, mdev->limits.mtt_seg_size,
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> index 9f389f3..1f6a9b7 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> @@ -484,7 +484,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> int ret = 0;
> struct sg_table *sgt;
> unsigned long contig_size;
> - unsigned long dma_align = dma_get_cache_alignment();
> + unsigned long dma_align = dma_get_cache_alignment(dev);
>
> /* Only cache aligned DMA transfers are reliable */
> if (!IS_ALIGNED(vaddr | size, dma_align)) {
> diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
> index a1125d1..2f6ffe5 100644
> --- a/drivers/net/ethernet/broadcom/b44.c
> +++ b/drivers/net/ethernet/broadcom/b44.c
> @@ -2344,6 +2344,10 @@ static int b44_init_one(struct ssb_device *sdev,
> struct net_device *dev;
> struct b44 *bp;
> int err;
> + unsigned int dma_desc_align_size = dma_get_cache_alignment(sdev->dma_dev);
> +
> + /* Setup paramaters for syncing RX/TX DMA descriptors */
> + dma_desc_sync_size = max_t(unsigned int, dma_desc_align_size, sizeof(struct dma_desc));
>
> instance++;
>
> @@ -2587,12 +2591,8 @@ static inline void b44_pci_exit(void)
>
> static int __init b44_init(void)
> {
> - unsigned int dma_desc_align_size = dma_get_cache_alignment();
> int err;
>
> - /* Setup paramaters for syncing RX/TX DMA descriptors */
> - dma_desc_sync_size = max_t(unsigned int, dma_desc_align_size, sizeof(struct dma_desc));
> -
> err = b44_pci_init();
> if (err)
> return err;
> diff --git
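The core idea of the series quoted above — that the safe DMA alignment depends on whether a particular device is cache-coherent — can be modeled in a few lines of plain C. This is a userspace sketch, not kernel code: `struct device`, the `coherent` flag, and the value 128 are stand-ins for the kernel's `struct device`, a `plat_device_is_coherent()`-style check, and the arch's real `ARCH_DMA_MINALIGN`.

```c
#include <assert.h>
#include <stddef.h>

#define ARCH_DMA_MINALIGN 128   /* stand-in; the real value is arch-specific */

struct device {
    int coherent;   /* models a plat_device_is_coherent(dev) check */
};

/* Sketch of the reworked dma_get_cache_alignment(dev): a coherent
 * device needs no cacheline padding, while a non-coherent (or unknown,
 * i.e. NULL) device gets the conservative arch minimum. */
static int dev_dma_cache_alignment(const struct device *dev)
{
    if (dev && dev->coherent)
        return 1;
    return ARCH_DMA_MINALIGN;
}
```

Note the NULL case deliberately matches the series' compatibility rule: existing callers that pass no device keep getting the conservative `ARCH_DMA_MINALIGN` answer.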
Re: [PATCH V8 5/5] libata: Align DMA buffer to dma_get_cache_alignment()
Hi, Matt,

I found that 4ee34ea3a12396f35b26d90a094c75db ("libata: Align ata_device's id on a cacheline") resolves everything: the size of id[ATA_ID_WORDS] is already a multiple of the cacheline size, and devslp_timing doesn't need to be aligned. So in V9 of this series I will drop this patch. The reason I had problems before is that I was using linux-4.4.

Huacai

-- Original --
From: "Matt Redfearn";
Date: Thu, Oct 19, 2017 03:52 PM
To: "Tejun Heo"; "Huacai Chen";
Cc: "Christoph Hellwig"; "Marek Szyprowski"; "Robin Murphy"; "Andrew Morton"; "Fuxin Zhang"; "linux-kernel"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "James E. J. Bottomley"; "Martin K. Petersen"; "linux-scsi"; "linux-ide"; "stable";
Subject: Re: [PATCH V8 5/5] libata: Align DMA buffer to dma_get_cache_alignment()

On 18/10/17 14:03, Tejun Heo wrote:
> On Tue, Oct 17, 2017 at 04:05:42PM +0800, Huacai Chen wrote:
>> In non-coherent DMA mode, the kernel uses cache flushing operations to maintain I/O coherency, so in ata_do_dev_read_id() the DMA buffer should be aligned to ARCH_DMA_MINALIGN. Otherwise, if a DMA buffer and a kernel structure share the same cache line, and the kernel structure has dirty data, cache_invalidate (no writeback) will cause data corruption.
>>
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Huacai Chen
>> ---
>> drivers/ata/libata-core.c | 15 +--
>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>> index ee4c1ec..e134955 100644
>> --- a/drivers/ata/libata-core.c
>> +++ b/drivers/ata/libata-core.c
>> @@ -1833,8 +1833,19 @@ static u32 ata_pio_mask_no_iordy(const struct ata_device *adev)
>> unsigned int ata_do_dev_read_id(struct ata_device *dev,
>> struct ata_taskfile *tf, u16 *id)
>> {
>> - return ata_exec_internal(dev, tf, NULL, DMA_FROM_DEVICE,
>> - id, sizeof(id[0]) * ATA_ID_WORDS, 0);
>> + u16 *devid;
>> + int res, size = sizeof(u16) * ATA_ID_WORDS;
>> +
>> + if (IS_ALIGNED((unsigned long)id, dma_get_cache_alignment(&dev->tdev)))
>> + res = ata_exec_internal(dev, tf, NULL, DMA_FROM_DEVICE, id, size, 0);
>> + else {
>> + devid = kmalloc(size, GFP_KERNEL);
>> + res = ata_exec_internal(dev, tf, NULL, DMA_FROM_DEVICE, devid, size, 0);
>> + memcpy(id, devid, size);
>> + kfree(devid);
>> + }
>> +
>> + return res;

> Hmm... I think it'd be a lot better to ensure that the buffers are aligned properly to begin with. There are only two buffers which are used for id reading - ata_port->sector_buf and ata_device->id. Both are embedded arrays but making them separately allocated aligned buffers shouldn't be difficult.
>
> Thanks.

FWIW, I agree that the buffers used for DMA should be split out from the structure. We ran into this problem on MIPS last year; 4ee34ea3a12396f35b26d90a094c75db95080baa ("libata: Align ata_device's id on a cacheline") partially fixed it, but likely should have also cacheline-aligned the following devslp_timing in the struct, such that we guarantee that members of the struct not used for DMA do not share the same cacheline as the DMA buffer. Not having this means that architectures such as MIPS, which in some cases have to perform manual invalidation of the DMA buffer, can clobber valid adjacent data if it is in the same cacheline.

Thanks,
Matt
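The layout issue Matt describes can be made concrete with `offsetof`. This is a rough userspace model, not the real `struct ata_device`: `flags` is a hypothetical non-DMA member, and 64 is an assumed cacheline size (the real value is arch-specific).

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64          /* assumed line size; arch-specific in reality */
#define ATA_ID_WORDS 256      /* as in libata: 256 u16s = 512 bytes */

/* Model of the problem: if the DMA target ('id') starts mid cache
 * line, the member before it shares the first line, so invalidating
 * the buffer's lines throws that member's dirty data away. */
struct dev_shared {
    char flags;                       /* non-DMA state... */
    unsigned short id[ATA_ID_WORDS];  /* ...shares its first cache line */
};

/* The fix discussed in the thread (commit 4ee34ea3...): align the
 * buffer itself. Because 512 bytes is a whole number of lines, members
 * placed after 'id' then start on a fresh line too - which is why
 * Huacai can drop his extra patch. */
struct dev_aligned {
    char flags;
    unsigned short id[ATA_ID_WORDS] __attribute__((aligned(CACHELINE)));
};
```

The third assertion below is exactly Huacai's observation that `sizeof(id)` is "already aligned": once the start of the buffer is on a line boundary, its end is too.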
Re: [PATCH V8 4/5] libsas: Align SMP req/resp to dma_get_cache_alignment()
Hi, Marek,

Yes, I know that in include/linux/slab.h there is

#define KMALLOC_MIN_SIZE ARCH_DMA_MINALIGN

But I observed that the req/resp isn't aligned to ARCH_DMA_MINALIGN, and I don't know why. The problem I experienced is: in non-coherent mode, mvsas with an expander cannot detect any disks.

Huacai

-- Original --
From: "Marek Szyprowski";
Date: Tue, Oct 17, 2017 07:55 PM
To: "Huacai Chen"; "Christoph Hellwig";
Cc: "Robin Murphy"; "Andrew Morton"; "Fuxin Zhang"; "linux-kernel"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "James E. J. Bottomley"; "Martin K. Petersen"; "linux-scsi"; "Tejun Heo"; "linux-ide"; "stable";
Subject: Re: [PATCH V8 4/5] libsas: Align SMP req/resp to dma_get_cache_alignment()

Hi Huacai,

On 2017-10-17 10:05, Huacai Chen wrote:
> In non-coherent DMA mode, the kernel uses cache flushing operations to maintain I/O coherency, so libsas's SMP request/response should be aligned to ARCH_DMA_MINALIGN. Otherwise, if a DMA buffer and a kernel structure share the same cache line, and the kernel structure has dirty data, cache_invalidate (no writeback) will cause data corruption.
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Huacai Chen
> ---
> drivers/scsi/libsas/sas_expander.c | 93 +++---
> 1 file changed, 57 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
> index 6b4fd23..124a44b 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -164,17 +164,17 @@ static int smp_execute_task(struct domain_device *dev, void *req, int req_size,
>
> /* -- Allocations -- */
>
> -static inline void *alloc_smp_req(int size)
> +static inline void *alloc_smp_req(int size, int align)
> {
> - u8 *p = kzalloc(size, GFP_KERNEL);
> + u8 *p = kzalloc(ALIGN(size, align), GFP_KERNEL);

If I remember correctly, the kernel guarantees that each kmalloced buffer is always at least aligned to the CPU cache line, so the CPU cache can be invalidated on the allocated buffer without corrupting anything else. Taking this into account, I wonder if the above change makes sense. Have you experienced any problems without this change?

> if (p)
> p[0] = SMP_REQUEST;
> return p;
> }
>
> -static inline void *alloc_smp_resp(int size)
> +static inline void *alloc_smp_resp(int size, int align)
> {
> - return kzalloc(size, GFP_KERNEL);
> + return kzalloc(ALIGN(size, align), GFP_KERNEL);

Same as above.

> }
>
> static char sas_route_char(struct domain_device *dev, struct ex_phy *phy)
> @@ -403,15 +403,17 @@ static int sas_ex_phy_discover_helper(struct domain_device *dev, u8 *disc_req,
> int sas_ex_phy_discover(struct domain_device *dev, int single)
> {
> struct expander_device *ex = &dev->ex_dev;
> - int res = 0;
> + int res = 0, align;
> u8 *disc_req;
> u8 *disc_resp;
>
> - disc_req = alloc_smp_req(DISCOVER_REQ_SIZE);
> + align = dma_get_cache_alignment(&dev->phy->dev);
> +
> + disc_req = alloc_smp_req(DISCOVER_REQ_SIZE, align);
> if (!disc_req)
> return -ENOMEM;
>
> - disc_resp = alloc_smp_resp(DISCOVER_RESP_SIZE);
> + disc_resp = alloc_smp_resp(DISCOVER_RESP_SIZE, align);
> if (!disc_resp) {
> kfree(disc_req);
> return -ENOMEM;
> @@ -480,14 +482,15 @@ static int sas_ex_general(struct domain_device *dev)
> {
> u8 *rg_req;
> struct smp_resp *rg_resp;
> - int res;
> - int i;
> + int i, res, align;
>
> - rg_req = alloc_smp_req(RG_REQ_SIZE);
> + align = dma_get_cache_alignment(&dev->phy->dev);
> +
> + rg_req = alloc_smp_req(RG_REQ_SIZE, align);
> if (!rg_req)
> return -ENOMEM;
>
> - rg_resp = alloc_smp_resp(RG_RESP_SIZE);
> + rg_resp = alloc_smp_resp(RG_RESP_SIZE, align);
> if (!rg_resp) {
> kfree(rg_req);
> return -ENOMEM;
> @@ -552,13 +555,15 @@ static int sas_ex_manuf_info(struct domain_device *dev)
> {
> u8 *mi_req;
> u8 *mi_resp;
> - int res;
> + int res, align;
>
> - mi_req = alloc_smp_req(MI_REQ_SIZE);
> + align = dma_get_cache_alignment(&dev->phy->dev);
> +
> + mi_req = alloc_smp_req(MI_REQ_SIZE, align);
> if (!mi_req)
> return -ENOMEM;
>
> - mi_resp = alloc_smp_resp(MI_RESP_SIZE);
> + mi_resp = alloc_smp_resp(MI_RESP_SIZE, align);
> if (!mi_resp) {
> kfree(mi_req);
> return -ENOMEM;
> @@ -593,13 +598,15 @@ int sas_smp_phy_control(struct domain_device *dev, int phy_id,
> {
> u8 *pc_req;
> u8 *pc_resp;
> - int res;
> + int res, align;
> +
> + align = dma_get_cache_alignment(&dev->phy->dev);
>
> - pc_req = alloc_smp_req(PC_REQ_SIZE);
> + pc_req = alloc_smp_req(PC_REQ_SIZE, align);
> if (!pc_req)
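The patch above pads each kzalloc size with the kernel's ALIGN() macro so a short SMP frame cannot end mid cache line and share its tail line with the next allocation. A userspace restatement of that allocator (the macro arithmetic matches the kernel's ALIGN(); `calloc` stands in for `kzalloc`, and the SMP_REQUEST value 0x40 is taken from the SAS frame type):

```c
#include <assert.h>
#include <stdlib.h>

#define SMP_REQUEST 0x40  /* SMP request frame type, as set by libsas */

/* Round x up to a multiple of a (a must be a power of two) - the same
 * arithmetic as the kernel's ALIGN() macro used in the quoted patch. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Userspace sketch of the patched alloc_smp_req(): the allocation is
 * padded to 'align' so a short frame cannot share its last cache line
 * with an unrelated object. */
static unsigned char *alloc_smp_req_sketch(size_t size, size_t align)
{
    unsigned char *p = calloc(1, ALIGN_UP(size, align));
    if (p)
        p[0] = SMP_REQUEST;
    return p;
}
```

Note that with `align == 1` (the coherent-device case) the size is unchanged, so the patch costs nothing where padding is unnecessary — though, as Marek points out, kmalloc's minimum alignment may already make the padding redundant.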
Re: [PATCH V8 5/5] libata: Align DMA buffer to dma_get_cache_alignment()
Hi, Sergei,

Without this patch, in non-coherent mode, ata_do_set_mode() returns with "no PIO support", and disk detection fails.

Huacai

-- Original --
From: "Sergei Shtylyov";
Date: Tue, Oct 17, 2017 05:43 PM
To: "Huacai Chen"; "Christoph Hellwig";
Cc: "Marek Szyprowski"; "Robin Murphy"; "Andrew Morton"; "Fuxin Zhang"; "linux-kernel"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "James E. J. Bottomley"; "Martin K. Petersen"; "linux-scsi"; "Tejun Heo"; "linux-ide"; "stable";
Subject: Re: [PATCH V8 5/5] libata: Align DMA buffer to dma_get_cache_alignment()

On 10/17/2017 11:05 AM, Huacai Chen wrote:
> In non-coherent DMA mode, the kernel uses cache flushing operations to maintain I/O coherency, so in ata_do_dev_read_id() the DMA buffer should be aligned to ARCH_DMA_MINALIGN. Otherwise, if a DMA buffer and a kernel structure share the same cache line, and the kernel structure has dirty data, cache_invalidate (no writeback) will cause data corruption.
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Huacai Chen
> ---
> drivers/ata/libata-core.c | 15 +--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index ee4c1ec..e134955 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -1833,8 +1833,19 @@ static u32 ata_pio_mask_no_iordy(const struct ata_device *adev)
> unsigned int ata_do_dev_read_id(struct ata_device *dev,
> struct ata_taskfile *tf, u16 *id)
> {
> - return ata_exec_internal(dev, tf, NULL, DMA_FROM_DEVICE,
> - id, sizeof(id[0]) * ATA_ID_WORDS, 0);
> + u16 *devid;
> + int res, size = sizeof(u16) * ATA_ID_WORDS;
> +
> + if (IS_ALIGNED((unsigned long)id, dma_get_cache_alignment(&dev->tdev)))
> + res = ata_exec_internal(dev, tf, NULL, DMA_FROM_DEVICE, id, size, 0);
> + else {
> + devid = kmalloc(size, GFP_KERNEL);
> + res = ata_exec_internal(dev, tf, NULL, DMA_FROM_DEVICE, devid, size, 0);
> + memcpy(id, devid, size);
> + kfree(devid);
> + }
> +
> + return res;

This function is called only for the PIO mode commands, so I doubt this is necessary...

MBR, Sergei
Re: [PATCH V7 1/2] dma-mapping: Rework dma_get_cache_alignment()
Hi, Christoph,

Can I put the declaration in asm/dma-coherence.h? And last time you said it is OK to pass a NULL to dma_get_cache_alignment() and cc all driver maintainers; I have done so.

Huacai

-- Original --
From: "Christoph Hellwig";
Date: Mon, Sep 25, 2017 08:51 PM
To: "Huacai Chen";
Cc: "Christoph Hellwig"; "Marek Szyprowski"; "Robin Murphy"; "Andrew Morton"; "Fuxin Zhang"; "linux-kernel"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "James E. J. Bottomley"; "Martin K. Petersen"; "linux-scsi"; "Roland Dreier"; "Pawel Osciak"; "Kyungmin Park"; "Michael Chan"; "Benjamin Herrenschmidt"; "Ivan Mikhaylov"; "Tariq Toukan"; "Andy Gross"; "Mark A. Greer"; "Robert Baldyga"; "stable";
Subject: Re: [PATCH V7 1/2] dma-mapping: Rework dma_get_cache_alignment()

> index aba7138..e2c5d9e 100644
> --- a/arch/mips/include/asm/dma-mapping.h
> +++ b/arch/mips/include/asm/dma-mapping.h
> @@ -39,4 +39,6 @@ static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
> #endif
> }
>
> +int mips_get_cache_alignment(struct device *dev);

All the other MIPS generic DMA helpers are prefixed mips_dma_*, so it might make sense to follow that. Also please don't add arch-local helpers to asm/dma-mapping.h - this is a header used by linux/dma-mapping.h and should not contain implementation details if avoidable.

> + dma_get_cache_alignment(NULL)) / mdev->limits.mtt_seg_size;

As said before - please don't pass NULL to this function but the proper device, which would be &mdev->pdev->dev in this case, for example.
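The piece Christoph is reviewing is the dispatch between the generic helper and the per-arch hook wired into `struct dma_map_ops`. Its shape can be sketched in userspace C (field and struct names are modeled loosely on the patch, not the real kernel headers):

```c
#include <assert.h>
#include <stddef.h>

#define ARCH_DMA_MINALIGN 128  /* stand-in for the arch constant */

struct device;

/* Minimal model of the new dma_map_ops hook from the series. */
struct dma_map_ops {
    int (*get_cache_alignment)(struct device *dev);
};

struct device {
    const struct dma_map_ops *ops;
};

static int coherent_alignment(struct device *dev)
{
    (void)dev;
    return 1;   /* e.g. a platform that knows this device is coherent */
}

/* Dispatch as the reworked dma_get_cache_alignment() would: use the
 * per-arch hook when present, fall back to the conservative constant. */
static int get_cache_alignment_sketch(struct device *dev)
{
    if (dev && dev->ops && dev->ops->get_cache_alignment)
        return dev->ops->get_cache_alignment(dev);
    return ARCH_DMA_MINALIGN;
}
```

Falling back to `ARCH_DMA_MINALIGN` rather than 1 when no hook is installed is the safe default: over-aligning on a coherent device only wastes a little memory, while under-aligning on a non-coherent one corrupts data.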
Re: [PATCH V7 1/2] dma-mapping: Rework dma_get_cache_alignment()
Hi, Robin,

Can ARM/ARM64 use the same implementation as MIPS, or is it OK if I only do the MIPS part?

Huacai

-- Original --
From: "Robin Murphy";
Date: Mon, Sep 25, 2017 08:57 PM
To: "Huacai Chen"; "Christoph Hellwig";
Cc: "Marek Szyprowski"; "Andrew Morton"; "Fuxin Zhang"; "linux-kernel"; "Ralf Baechle"; "James Hogan"; "linux-mips"; "James E. J. Bottomley"; "Martin K. Petersen"; "linux-scsi"; "Roland Dreier"; "Pawel Osciak"; "Kyungmin Park"; "Michael Chan"; "Benjamin Herrenschmidt"; "Ivan Mikhaylov"; "Tariq Toukan"; "Andy Gross"; "Mark A. Greer"; "Robert Baldyga"; "stable";
Subject: Re: [PATCH V7 1/2] dma-mapping: Rework dma_get_cache_alignment()

On 25/09/17 10:46, Huacai Chen wrote:
> Make dma_get_cache_alignment() accept a 'dev' argument. As a result, it can return different alignments due to different devices' I/O cache coherency.
>
> Currently, MIPS is the only architecture which supports coherent & non-coherent devices co-existing. This may be changed in the future, so add a new get_cache_alignment() function pointer in 'struct dma_map_ops' as a generic solution.

FWIW, ARM and arm64 have also supported per-device coherency for quite some time.

> For compatibility (always return ARCH_DMA_MINALIGN), make all existing callers pass a NULL dev argument to dma_get_cache_alignment().
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Huacai Chen
> ---
> arch/mips/cavium-octeon/dma-octeon.c | 3 ++-
> arch/mips/include/asm/dma-mapping.h | 2 ++
> arch/mips/loongson64/common/dma-swiotlb.c | 1 +
> arch/mips/mm/dma-default.c | 11 ++-
> arch/mips/netlogic/common/nlm-dma.c | 3 ++-
> drivers/infiniband/hw/mthca/mthca_main.c | 2 +-
> drivers/media/v4l2-core/videobuf2-dma-contig.c | 2 +-
> drivers/net/ethernet/broadcom/b44.c | 2 +-
> drivers/net/ethernet/ibm/emac/core.h | 2 +-
> drivers/net/ethernet/mellanox/mlx4/main.c | 2 +-
> drivers/spi/spi-qup.c | 4 ++--
> drivers/tty/serial/mpsc.c | 16
> drivers/tty/serial/samsung.c | 14 +++---
> include/linux/dma-mapping.h | 17 -
> 14 files changed, 51 insertions(+), 30 deletions(-)

I think it might be neater to split this into two patches - one making the treewide prototype change, then introducing the .get_cache_alignment callback separately - but that's only my personal preference. Otherwise (and modulo Christoph's comments), I'd say we're nearly there.

Thanks,
Robin.

> diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
> index c64bd87..7978237 100644
> --- a/arch/mips/cavium-octeon/dma-octeon.c
> +++ b/arch/mips/cavium-octeon/dma-octeon.c
> @@ -324,7 +324,8 @@ static struct octeon_dma_map_ops _octeon_pci_dma_map_ops = {
> .sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
> .sync_sg_for_device = octeon_dma_sync_sg_for_device,
> .mapping_error = swiotlb_dma_mapping_error,
> - .dma_supported = swiotlb_dma_supported
> + .dma_supported = swiotlb_dma_supported,
> + .get_cache_alignment = mips_get_cache_alignment
> },
> };
>
> diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h
> index aba7138..e2c5d9e 100644
> --- a/arch/mips/include/asm/dma-mapping.h
> +++ b/arch/mips/include/asm/dma-mapping.h
> @@ -39,4 +39,6 @@ static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
> #endif
> }
>
> +int mips_get_cache_alignment(struct device *dev);
> +
> #endif /* _ASM_DMA_MAPPING_H */
> diff --git a/arch/mips/loongson64/common/dma-swiotlb.c b/arch/mips/loongson64/common/dma-swiotlb.c
> index 34486c1..09cb8a4 100644
> --- a/arch/mips/loongson64/common/dma-swiotlb.c
> +++ b/arch/mips/loongson64/common/dma-swiotlb.c
> @@ -119,6 +119,7 @@ static const struct dma_map_ops loongson_dma_map_ops = {
> .sync_sg_for_device = loongson_dma_sync_sg_for_device,
> .mapping_error = swiotlb_dma_mapping_error,
> .dma_supported = loongson_dma_supported,
> + .get_cache_alignment = mips_get_cache_alignment
> };
>
> void __init plat_swiotlb_setup(void)
> diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c
> index c01bd20..c49987e 100644
> --- a/arch/mips/mm/dma-default.c
> +++ b/arch/mips/mm/dma-default.c
> @@ -394,6 +394,14 @@ void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
>
> EXPORT_SYMBOL(dma_cache_sync);
>
> +int mips_get_cache_alignment(struct device *dev)
> +{
> + if (plat_device_is_coherent(dev))
> + return 1;
> + else
> + return ARCH_DMA_MINALIGN;
> +}
> +
> static const struct dma_map_ops mips_default_dma_map_ops = {
> .alloc = mips_dma_alloc_coherent,
> .free = mips_dma_free_coherent,
> @@
Re: [PATCH V6 1/3] dma-mapping: Introduce device_is_coherent() as a helper
Hi, Robin, Before 2.6.36 dma_get_cache_alignment was arch-dependent, and it was unified in commit 4565f0170dfc849b3629c27d7 ("dma-mapping: unify dma_get_cache_alignment implementations"). Should we revert to the old implementation? Huacai -- Original -- From: "Robin Murphy"; Date: Thu, Sep 21, 2017 06:47 PM To: "Huacai Chen" ; "Christoph Hellwig" ; Cc: "Marek Szyprowski" ; "Andrew Morton" ; "Fuxin Zhang" ; "linux-kernel" ; "James E . J . Bottomley" ; "Martin K . Petersen" ; "linux-scsi" ; "stable" ; Subject: Re: [PATCH V6 1/3] dma-mapping: Introduce device_is_coherent() as a helper On 19/09/17 09:52, Huacai Chen wrote: > We will use device_is_coherent() as a helper function, which will be > used in the next patch. > > There is a MIPS-specific plat_device_is_coherent(), but we need a more > generic solution, so add and use a new function pointer in dma_map_ops. I think we're heading in the right direction with the series, but I still don't like this patch. I can pretty much guarantee that driver authors *will* abuse a generic device_is_coherent() API to mean "I can skip other DMA API calls and just use virt_to_phys()". I think it would be far better to allow architectures to provide their own override of dma_get_cache_alignment(), and let the coherency detail remain internal to the relevant arch implementations. [...] > @@ -697,6 +698,15 @@ static inline void *dma_zalloc_coherent(struct device > *dev, size_t size, > } > > #ifdef CONFIG_HAS_DMA > +static inline int device_is_coherent(struct device *dev) > +{ > + const struct dma_map_ops *ops = get_dma_ops(dev); > + if (ops && ops->device_is_coherent) > + return ops->device_is_coherent(dev); > + else > + return 1; /* compatible behavior */ That is also quite scary - if someone now adds a new dma_get_cache_alignment() call and dutifully passes a non-NULL device, they will now get back an alignment of 1 on all non-coherent platforms except MIPS: hello data corruption. Robin. 
> +} > + > static inline int dma_get_cache_alignment(void) > { > #ifdef ARCH_DMA_MINALIGN >
Re: [PATCH V6 2/3] dma-mapping: Rework dma_get_cache_alignment() function
Hi, Christoph, I have changed dma_get_cache_alignment's return value, and I don't know whether those drivers want to return ARCH_DMA_MINALIGN unconditionally. So I pass a NULL for those drivers, in order to keep their old behavior. Huacai -- Original -- From: "Christoph Hellwig"; Date: Tue, Sep 19, 2017 11:02 PM To: "Huacai Chen" ; Cc: "Christoph Hellwig" ; "Marek Szyprowski" ; "Robin Murphy" ; "Andrew Morton" ; "Fuxin Zhang" ; "linux-kernel" ; "James E . J . Bottomley" ; "Martin K . Petersen" ; "linux-scsi" ; "stable" ; Subject: Re: [PATCH V6 2/3] dma-mapping: Rework dma_get_cache_alignment() function > mdev->limits.reserved_mtts = ALIGN(mdev->limits.reserved_mtts * > mdev->limits.mtt_seg_size, > -dma_get_cache_alignment()) / > mdev->limits.mtt_seg_size; > +dma_get_cache_alignment(NULL)) / > mdev->limits.mtt_seg_size; > > mdev->mr_table.mtt_table = mthca_alloc_icm_table(mdev, > init_hca->mtt_base, Please pass the actually relevant struct device for each call.
Re: [V5, 2/3] mm: dmapool: Align to ARCH_DMA_MINALIGN in non-coherent DMA mode
Oh, I know, I've made a mistake, dmapool doesn't need to change. Huacai -- Original -- From: "Christoph Hellwig"; Date: Mon, Sep 18, 2017 11:51 PM To: "Robin Murphy" ; Cc: "Huacai Chen" ; "Andrew Morton" ; "Fuxin Zhang" ; "linux-mm" ; "linux-kernel" ; "stable" ; Subject: Re: [V5, 2/3] mm: dmapool: Align to ARCH_DMA_MINALIGN in non-coherent DMA mode On Mon, Sep 18, 2017 at 10:44:54AM +0100, Robin Murphy wrote: > On 18/09/17 05:22, Huacai Chen wrote: > > In non-coherent DMA mode, kernel uses cache flushing operations to > > maintain I/O coherency, so the dmapool objects should be aligned to > > ARCH_DMA_MINALIGN. Otherwise, it will cause data corruption, at least > > on MIPS: > > > > Step 1, dma_map_single > > Step 2, cache_invalidate (no writeback) > > Step 3, dma_from_device > > Step 4, dma_unmap_single > > This is a massive red warning flag for the whole series, because DMA > pools don't work like that. At best, this will do nothing, and at worst > it is papering over egregious bugs elsewhere. Streaming mappings of > coherent allocations means completely broken code. Oh, I hadn't even seen that part. Yes, dma coherent (and pool) allocations must never be used for streaming mappings. I wish we'd have some debug infrastructure to warn on such uses.
Re: [PATCH V5 3/3] scsi: Align queue to ARCH_DMA_MINALIGN in non-coherent DMA mode
Hi, Christoph, I don't think dma_get_cache_alignment is the "absolute minimum alignment" in all cases. At least on MIPS/Loongson, if we use I/O coherent mode (Cached DMA mode), aligning the block queue to 4 bytes is enough. If we align the block queue to dma_get_cache_alignment in I/O coherent mode, there is a performance loss because we cannot use zero-copy in most cases (user buffers are usually not aligned). Huacai -- Original -- From: "Christoph Hellwig"; Date: Mon, Sep 18, 2017 01:20 PM To: "Huacai Chen" ; Cc: "James E . J . Bottomley" ; "Martin K . Petersen" ; "Andrew Morton" ; "Fuxin Zhang" ; "linux-scsi" ; "linux-kernel" ; "stable" ; Subject: Re: [PATCH V5 3/3] scsi: Align queue to ARCH_DMA_MINALIGN in non-coherent DMA mode Please send all patches in the series to the same to and cc lists. On Mon, Sep 18, 2017 at 12:22:54PM +0800, Huacai Chen wrote: > In non-coherent DMA mode, kernel uses cache flushing operations to > maintain I/O coherency, so scsi's block queue should be aligned to > ARCH_DMA_MINALIGN. Otherwise, it will cause data corruption, at least > on MIPS: > > Step 1, dma_map_single > Step 2, cache_invalidate (no writeback) > Step 3, dma_from_device > Step 4, dma_unmap_single > > If a DMA buffer and a kernel structure share the same cache line, and if > the kernel structure has dirty data, cache_invalidate (no writeback) > will cause data loss. And as said before we must _always_ align to dma_get_cache_alignment. This is even documented in Documentation/DMA-API.txt: -- snip -- int dma_get_cache_alignment(void) Returns the processor cache alignment. This is the absolute minimum alignment *and* width that you must observe when either mapping memory or doing partial flushes. -- snip -- > + if (device_is_coherent(dev)) > + blk_queue_dma_alignment(q, 0x04 - 1); > + else > + blk_queue_dma_alignment(q, dma_get_cache_alignment() - 1); So as said before this should become something like: blk_queue_dma_alignment(q, max(0x04, dma_get_cache_alignment()) - 1);
Re: [PATCH V5 2/3] mm: dmapool: Align to ARCH_DMA_MINALIGN in non-coherent DMA mode
Hi, Christoph, Maybe you missed something. 1, pool_alloc_page() uses dma_alloc_coherent() to allocate pool pages, and of course these pages are aligned to ARCH_DMA_MINALIGN. 2, dma_pool_alloc() is the element allocator, but it doesn't use dma_alloc_coherent(). Elements only align to pool->size, but pool->size is usually less than ARCH_DMA_MINALIGN. 3, ARCH_DMA_MINALIGN is now only used in several drivers, no dma_ops use it. Huacai -- Original -- From: "Christoph Hellwig"; Date: Mon, Sep 18, 2017 01:22 PM To: "Huacai Chen" ; Cc: "Andrew Morton" ; "Fuxin Zhang" ; "linux-mm" ; "linux-kernel" ; "stable" ; Subject: Re: [PATCH V5 2/3] mm: dmapool: Align to ARCH_DMA_MINALIGN in non-coherent DMA mode The dmapool code uses dma_alloc_coherent to allocate each element, and dma_alloc_coherent must align to ARCH_DMA_MINALIGN already. If your implementation doesn't do that it needs to be fixed.
Re: [PATCH V3 2/3] mm: dmapool: Align to ARCH_DMA_MINALIGN in non-coherent DMA mode
Hi, Andrew, It will cause data corruption, at least on MIPS: step 1, dma_map_single step 2, cache_invalidate (no writeback) step 3, dma_from_device step 4, dma_unmap_single If a DMA buffer and a kernel structure share the same cache line, and if the kernel structure has dirty data, cache_invalidate (no writeback) may cause data loss. Huacai -- Original -- From: "Andrew Morton"; Date: Thu, Sep 14, 2017 05:52 AM To: "Huacai Chen" ; Cc: "Fuxin Zhang" ; "linux-mm" ; "linux-kernel" ; "stable" ; Subject: Re: [PATCH V3 2/3] mm: dmapool: Align to ARCH_DMA_MINALIGN in non-coherent DMA mode On Wed, 13 Sep 2017 17:20:51 +0800 Huacai Chen wrote: > In non-coherent DMA mode, kernel uses cache flushing operations to > maintain I/O coherency, so the dmapool objects should be aligned to > ARCH_DMA_MINALIGN. What are the user-visible effects of this bug?
Re: [PATCH 2/2] scsi: Align queue to ARCH_DMA_MINALIGN in non-coherent DMA mode
Hi, Christoph, I think we cannot modify dma_get_cache_alignment(), because existing callers may want to unconditionally return ARCH_DMA_MINALIGN. Huacai -- Original -- From: "Christoph Hellwig"; Date: Mon, Sep 11, 2017 03:39 PM To: "Huacai Chen" ; Cc: "James E . J . Bottomley" ; "Martin K . Petersen" ; "Fuxin Zhang" ; "linux-scsi" ; "linux-kernel" ; "stable" ; Subject: Re: [PATCH 2/2] scsi: Align queue to ARCH_DMA_MINALIGN in non-coherent DMA mode > + if (plat_device_is_coherent(dev)) We can't just call platform device code. We'll need a proper DMA API call for this. > + blk_queue_dma_alignment(q, 0x04 - 1); > + else > + blk_queue_dma_alignment(q, dma_get_cache_alignment() - 1); Which we already have with dma_get_cache_alignment, except that it doesn't take a struct device pointer and doesn't call into dma_map ops. So please add a struct device argument to dma_get_cache_alignment, and let it call into dma_map_ops where needed. With that you can replace the above with: blk_queue_dma_alignment(q, max(0x04U, dma_get_cache_alignment(dev)) - 1);
Re: mips builds failing in v3.18.38 and v4.1.29
Hi, I have already told Sasha that "MIPS: Reserve nosave data for hibernation" should not be backported to 4.1/3.18. But he has no response. Huacai -- Original -- From: "Guenter Roeck"; Date: Mon, Aug 1, 2016 07:57 AM To: "Sasha Levin" ; Cc: "stable" ; "linux-kernel@vger.kernel.org" ; "Huacai Chen" ; Subject: mips builds failing in v3.18.38 and v4.1.29 Sorry I didn't notice earlier - my 3.18/4.1 build scripts were broken. arch/mips/kernel/setup.c: In function 'arch_mem_init': arch/mips/kernel/setup.c:689:2: error: implicit declaration of function 'reserve_bootmem_region' Offending commit is 1dd0964204277108e ("MIPS: Reserve nosave data for hibernation"). It fixes a bug which does not exist in 3.18 / 4.1, in turn causing all mips builds in 3.18 and 4.1 to fail. Guenter
Re: next: fuloong2e qemu boot failure due to 'MIPS: Loongson: Add Loongson-3A R2 basic support'
Hi, Could you please remove the line "#define cpu_hwrena_impl_bits 0xc000" in arch/mips/include/asm/mach-loongson64/cpu-feature-overrides.h and try again? Thanks. Huacai -- Original -- From: "Guenter Roeck"; Date: Wed, Apr 20, 2016 10:54 AM To: "Huacai Chen" ; Cc: "Ralf Baechle" ; "linux-mips" ; "linux-next" ; "linux-kernel" ; Subject: next: fuloong2e qemu boot failure due to 'MIPS: Loongson: Add Loongson-3A R2 basic support' Hi, qemu fails to boot in -next for machine fulong2e with configuration fuloong2e_defconfig. Bisect points to commit 'MIPS: Loongson: Add Loongson-3A R2 basic support'. qemu hangs in boot, after displaying "Inode-cache hash table entries: 16384 (order: 3, 131072 bytes)". Bisect log is attached. Guenter --- # bad: [1bd7a2081d2c7b096f75aa934658e404ccaba5fd] Add linux-next specific files for 20160418 # good: [bf16200689118d19de1b8d2a3c314fc21f5dc7bb] Linux 4.6-rc3 git bisect start 'HEAD' 'v4.6-rc3' # bad: [493ac92ff65ec4c4cd4c43870e778760a012951d] Merge remote-tracking branch 'ipvs-next/master' git bisect bad 493ac92ff65ec4c4cd4c43870e778760a012951d # bad: [20ca3ae9c517eee9b2f1bd0fb2a06e2d14153792] Merge remote-tracking branch 'btrfs-kdave/for-next' git bisect bad 20ca3ae9c517eee9b2f1bd0fb2a06e2d14153792 # good: [c454e65fb9ade11d0f84718d06a6888e2c92804d] Merge remote-tracking branch 'omap/for-next' git bisect good c454e65fb9ade11d0f84718d06a6888e2c92804d # good: [6f5c70fb9b4fc0534157bfa40cea9b402e6f2506] Merge remote-tracking branch 'microblaze/next' git bisect good 6f5c70fb9b4fc0534157bfa40cea9b402e6f2506 # bad: [7f053cd68fd14243c8f202b4086d7dd75c409e6f] MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT git bisect bad 7f053cd68fd14243c8f202b4086d7dd75c409e6f # good: [e9aacdd7f0b66c4ace17e5950c48e7cc61a253c8] MIPS: Allow RIXI to be used on non-R2 or R6 cores git bisect good e9aacdd7f0b66c4ace17e5950c48e7cc61a253c8 # good: [d1e8b9a8dc6c7fa9add5dfa7083e035ce037e56d] MAINTAINERS: add Loongson1 architecture entry git bisect good 
d1e8b9a8dc6c7fa9add5dfa7083e035ce037e56d # good: [13ff6275bb389c3669082d3ef8483592a31eb0ea] MIPS: Fix siginfo.h to use strict posix types git bisect good 13ff6275bb389c3669082d3ef8483592a31eb0ea # good: [66e74bdd51e617023fa2e79a807b704fb3eed8aa] MIPS: Enable ptrace hw watchpoints on MIPS R6 git bisect good 66e74bdd51e617023fa2e79a807b704fb3eed8aa # good: [f7cabc2dac8adf5986dbc700584bc3b8fe493d4d] MIPS: Loongson-3: Adjust irq dispatch to speedup processing git bisect good f7cabc2dac8adf5986dbc700584bc3b8fe493d4d # bad: [4978c8477e96fb9e9d870d8f42328dcabf1a65e9] MIPS: Loongson-3: Set cache flush handlers to cache_noop git bisect bad 4978c8477e96fb9e9d870d8f42328dcabf1a65e9 # bad: [04a35922c1dac1b4864b8b366a37474e9e51d8c0] MIPS: Loongson: Add Loongson-3A R2 basic support git bisect bad 04a35922c1dac1b4864b8b366a37474e9e51d8c0 # first bad commit: [04a35922c1dac1b4864b8b366a37474e9e51d8c0] MIPS: Loongson: Add Loongson-3A R2 basic support
Re: [PATCH] mips: Fix console output for Fulong2e system
Acked-by: Huacai Chen -- Original -- From: "Guenter Roeck"; Date: Fri, Aug 7, 2015 01:57 PM To: "Ralf Baechle"; Cc: "Huacai Chen"; "linux-mips"; "linux-kernel"; "Guenter Roeck"; Subject: [PATCH] mips: Fix console output for Fulong2e system Commit 3adeb2566b9b ("MIPS: Loongson: Improve LEFI firmware interface") made the number of UARTs dynamic if LEFI_FIRMWARE_INTERFACE is configured. Unfortunately, it did not initialize the number of UARTs if LEFI_FIRMWARE_INTERFACE is not configured. As a result, the Fulong2e system has no console. Fixes: 3adeb2566b9b ("MIPS: Loongson: Improve LEFI firmware interface") Cc: Huacai Chen Signed-off-by: Guenter Roeck --- Never mind my earlier e-mail, I figured it out. Should be a candidate for stable (v3.19+, ie v4.1 in practice). arch/mips/loongson64/common/env.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/mips/loongson64/common/env.c b/arch/mips/loongson64/common/env.c index f6c44dd332e2..d6d07ad56180 100644 --- a/arch/mips/loongson64/common/env.c +++ b/arch/mips/loongson64/common/env.c @@ -64,6 +64,9 @@ void __init prom_init_env(void) } if (memsize == 0) memsize = 256; + + loongson_sysconf.nr_uarts = 1; + pr_info("memsize=%u, highmemsize=%u\n", memsize, highmemsize); #else struct boot_params *boot_p; -- 2.1.4
Re: [PATCH -next] MIPS: Remove duplicated include from numa.c
Reviewed-by: Huacai Chen -- Original -- From: "weiyj_lk"; Date: Tue, Aug 12, 2014 08:26 PM To: "Ralf Baechle"; "Huacai Chen"; Cc: "Wei Yongjun"; "linux-mips"; "linux-kernel"; Subject: [PATCH -next] MIPS: Remove duplicated include from numa.c From: Wei Yongjun Remove duplicated include. Signed-off-by: Wei Yongjun --- arch/mips/loongson/loongson-3/numa.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/mips/loongson/loongson-3/numa.c b/arch/mips/loongson/loongson-3/numa.c index ca025a6..37ed184 100644 --- a/arch/mips/loongson/loongson-3/numa.c +++ b/arch/mips/loongson/loongson-3/numa.c @@ -24,8 +24,6 @@ #include <asm/page.h> #include <asm/pgalloc.h> #include <asm/sections.h> -#include <linux/bootmem.h> -#include <linux/init.h> #include <linux/irq.h> #include <asm/bootinfo.h> #include <asm/mc146818-time.h>
Re: [PATCH] genirq: Avoid NULL OOPS in irq handling
> On Wed, 30 Oct 2013, "陈华才" wrote: > >> I use a Loongson-3 (MIPS-series CPU) machine, there is a serial port >> integrated in the CPU (but it is buggy), and it uses handle_percpu_irq() >> as the irq handler. Maybe I should move the checking into >> handle_percpu_irq()? > > Why is a device interrupt using handle_percpu_irq()? It seems that IRQs delivered directly to the MIPS CPU (those without an interrupt controller) are all handled by handle_percpu_irq(). I will try to override the handler in arch-specific code and keep the code in kernel/irq as is. Thanks. > > Thanks, > > tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] genirq: Avoid NULL OOPS in irq handling
I use a Loongson-3 (MIPS-series CPU) machine. There is a serial port
integrated in the CPU (but it is buggy), and it uses handle_percpu_irq()
as the irq handler. Maybe I should move the check into
handle_percpu_irq()?

Huacai

> On Sat, 28 Sep 2013, Huacai Chen wrote:
>
>> Some devices (e.g. serial port) set up an irq handler at dev open and
>> free it at dev close. So, sometimes there is no irqaction for a specific
>> irq. But some buggy devices may send irqs at any time. This patch avoids
>> the NULL OOPS when no irqaction is registered.
>
> All callers except the real per cpu interrupts are checking whether
> there is a valid action before calling. And serial ports are not
> routed to real per cpu interrupts. Can you provide more detailed
> information about the problem you are trying to solve please?
>
> Thanks,
>
> tglx
>
>> Signed-off-by: Huacai Chen
>> ---
>>  kernel/irq/handle.c | 4 ++++
>>  1 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
>> index 131ca17..1c78e69 100644
>> --- a/kernel/irq/handle.c
>> +++ b/kernel/irq/handle.c
>> @@ -135,6 +135,9 @@ handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
>>  	irqreturn_t retval = IRQ_NONE;
>>  	unsigned int flags = 0, irq = desc->irq_data.irq;
>>
>> +	if (!action)
>> +		goto out;
>> +
>>  	do {
>>  		irqreturn_t res;
>>
>> @@ -174,6 +177,7 @@ handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
>>
>>  	add_interrupt_randomness(irq, flags);
>>
>> +out:
>>  	if (!noirqdebug)
>>  		note_interrupt(irq, desc, retval);
>>  	return retval;
>> --
>> 1.7.7.3
Re: [PATCH V2 01/14] MIPS: Build uasm-generated code only once to avoid CPU Hotplug problem
I'm sorry, this is the only patch, please ignore [01/14].

> Currently, clear_page()/copy_page() are generated by the micro-assembler
> dynamically. But they are unavailable until uasm_resolve_relocs() has
> finished, because jump labels are illegal before that. Since these
> functions are shared by every CPU, we call build_clear_page()/
> build_copy_page() only once, at boot time. Without this patch, programs
> will get random memory corruption (segmentation fault, bus error, etc.)
> during CPU hotplug (e.g. one CPU is using clear_page() while another is
> generating it in cpu_cache_init()).
>
> For similar reasons we modify build_tlb_refill_handler()'s invocation.
>
> V2:
> 1, Rework the code so that CPU#0 can be brought online/offline.
> 2, Introduce the cpu_has_local_ebase feature, since some types of MIPS
> CPU need a per-CPU tlb_refill_handler().
>
> Signed-off-by: Huacai Chen
> Signed-off-by: Hongbing Hu
> ---
>  arch/mips/include/asm/cpu-features.h          |  3 +++
>  .../asm/mach-loongson/cpu-feature-overrides.h |  1 +
>  arch/mips/mm/page.c                           | 10 ++++++++++
>  arch/mips/mm/tlbex.c                          | 10 ++++++++--
>  4 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/mips/include/asm/cpu-features.h b/arch/mips/include/asm/cpu-features.h
> index c507b93..1204408 100644
> --- a/arch/mips/include/asm/cpu-features.h
> +++ b/arch/mips/include/asm/cpu-features.h
> @@ -110,6 +110,9 @@
>  #ifndef cpu_has_pindexed_dcache
>  #define cpu_has_pindexed_dcache (cpu_data[0].dcache.flags & MIPS_CACHE_PINDEX)
>  #endif
> +#ifndef cpu_has_local_ebase
> +#define cpu_has_local_ebase 1
> +#endif
>
>  /*
>   * I-Cache snoops remote store. This only matters on SMP. Some multiprocessors
> diff --git a/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h b/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h
> index 1a05d85..8eec8e2 100644
> --- a/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h
> +++ b/arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h
> @@ -57,5 +57,6 @@
>  #define cpu_has_vint 0
>  #define cpu_has_vtag_icache 0
>  #define cpu_has_watch 1
> +#define cpu_has_local_ebase 0
>
>  #endif /* __ASM_MACH_LOONGSON_CPU_FEATURE_OVERRIDES_H */
> diff --git a/arch/mips/mm/page.c b/arch/mips/mm/page.c
> index 8e666c5..6a39c01 100644
> --- a/arch/mips/mm/page.c
> +++ b/arch/mips/mm/page.c
> @@ -247,6 +247,11 @@ void __cpuinit build_clear_page(void)
>  	struct uasm_label *l = labels;
>  	struct uasm_reloc *r = relocs;
>  	int i;
> +	static atomic_t run_once = ATOMIC_INIT(0);
> +
> +	if (atomic_xchg(&run_once, 1)) {
> +		return;
> +	}
>
>  	memset(labels, 0, sizeof(labels));
>  	memset(relocs, 0, sizeof(relocs));
> @@ -389,6 +394,11 @@ void __cpuinit build_copy_page(void)
>  	struct uasm_label *l = labels;
>  	struct uasm_reloc *r = relocs;
>  	int i;
> +	static atomic_t run_once = ATOMIC_INIT(0);
> +
> +	if (atomic_xchg(&run_once, 1)) {
> +		return;
> +	}
>
>  	memset(labels, 0, sizeof(labels));
>  	memset(relocs, 0, sizeof(relocs));
> diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
> index 1c8ac49..4a8b294 100644
> --- a/arch/mips/mm/tlbex.c
> +++ b/arch/mips/mm/tlbex.c
> @@ -2161,8 +2161,11 @@ void __cpuinit build_tlb_refill_handler(void)
>  	case CPU_TX3922:
>  	case CPU_TX3927:
>  #ifndef CONFIG_MIPS_PGD_C0_CONTEXT
> -	build_r3000_tlb_refill_handler();
> +	if (cpu_has_local_ebase)
> +		build_r3000_tlb_refill_handler();
>  	if (!run_once) {
> +		if (!cpu_has_local_ebase)
> +			build_r3000_tlb_refill_handler();
>  		build_r3000_tlb_load_handler();
>  		build_r3000_tlb_store_handler();
>  		build_r3000_tlb_modify_handler();
> @@ -2191,9 +2194,12 @@ void __cpuinit build_tlb_refill_handler(void)
>  		build_r4000_tlb_load_handler();
>  		build_r4000_tlb_store_handler();
>  		build_r4000_tlb_modify_handler();
> +		if (!cpu_has_local_ebase)
> +			build_r4000_tlb_refill_handler();
>  		run_once++;
>  	}
> -	build_r4000_tlb_refill_handler();
> +	if (cpu_has_local_ebase)
> +		build_r4000_tlb_refill_handler();
>  	}
>  }
> --
> 1.7.7.3

--
Jiangsu Zhongke Menglan Electronic Technology Co., Ltd., Software Dept.
Huacai Chen
E-mail: che...@lemote.com
Web: http://www.lemote.com/
Add: Menglan Industrial Park, Yushan Town, Changshu, Jiangsu
Re: [PATCH V8 00/13] MIPS: Add Loongson-3 based machines support
John, for the patch 3 rework, please tell me whether my understanding is
correct (make cpu_has_coherent_cache a runtime option rather than a config
option)? Thank you very much.

> Hi,
>
> ok, i dropped the series for now from my tree
>
> John
Re: [PATCH V8 00/13] MIPS: Add Loongson-3 based machines support
Hi, John,

Compiling fails because the __dev* prefixes should be removed due to
upstream changes. You said that patch 3 needs to be reworked, but I don't
know how to improve it... Could you please tell me which part is unsound?
Maybe you mean I should make cpu_has_coherent_cache a runtime value rather
than a config option, as follows?

1, remove CONFIG_CPU_SUPPORTS_COHERENT_CACHE
2, in arch/mips/include/asm/mach-loongson/cpu-feature-overrides.h
   #define cpu_has_coherent_cache 1
3, in arch/mips/include/asm/cpu.h
   #define MIPS_CPU_COHERENT_CACHE 0x0008
4, in arch/mips/include/asm/cpu-features.h
   #ifndef cpu_has_coherent_cache
   #define cpu_has_coherent_cache (cpu_data[0].options & MIPS_CPU_INCLUSIVE_CACHES)
   #endif

Besides, the SMP code has a bug to fix (IPI sending), and patch 3 and
patch 6 need to be updated. So I think a V9 is needed :(

> On 25/01/13 01:15, 陈华才 wrote:
>> ok, I'll prepare v9 of this series in these days.
>
> Please don't send v9
>
> read my mail and compile / runtime test the tree please
>
> only patch 3 needs to be reworked and an update for "MIPS: Loongson
> 3: Add HT-linked PCI support." needs to be made
>
> John
>
>>>> Huacai Chen(13):
>>>>    MIPS: Loongson: Add basic Loongson-3 definition.
>>>>    MIPS: Loongson: Add basic Loongson-3 CPU support.
>>>>    MIPS: Loongson: Introduce and use cpu_has_coherent_cache feature.
>>>>    MIPS: Loongson 3: Add Lemote-3A machtypes definition.
>>>>    MIPS: Loongson: Add UEFI-like firmware interface support.
>>>>    MIPS: Loongson 3: Add HT-linked PCI support.
>>>>    MIPS: Loongson 3: Add IRQ init and dispatch support.
>>>>    MIPS: Loongson 3: Add serial port support.
>>>>    MIPS: Loongson: Add swiotlb to support big memory (>4GB).
>>>>    MIPS: Loongson: Add Loongson-3 Kconfig options.
>>>>    MIPS: Loongson 3: Add Loongson-3 SMP support.
>>>>    MIPS: Loongson 3: Add CPU hotplug support.
>>>>    MIPS: Loongson: Add a Loongson-3 default config file.
>>>>
>>>> Signed-off-by: Huacai Chen
>>>> Signed-off-by: Hongliang Tao
>>>> Signed-off-by: Hua Yan
>>>> ---
>>>
>>> Hi,
>>>
>>> I have added all patches apart from 3/13 to my queue.
>>>
>>> I believe "MIPS: Loongson: Introduce and use cpu_has_coherent_cache
>>> feature." should be rewritten in a saner way.
>>>
>>> Please compile and runtime test the tree before I send it to Ralf
>>> --> http://git.linux-mips.org/?p=john/linux-john.git;a=shortlog;h=refs/heads/mips-next-3.9
>>>
>>> I cleaned up a few minor whitespace errors while merging.
>>>
>>> http://patchwork.linux-mips.org/patch/4547/ has a few comments. Please
>>> prepare a patch asap to address those so i can fold it into the series.
>>>
>>> John
Re: [PATCH V8 00/13] MIPS: Add Loongson-3 based machines support
ok, I'll prepare v9 of this series in these days.

>> Huacai Chen(13):
>>    MIPS: Loongson: Add basic Loongson-3 definition.
>>    MIPS: Loongson: Add basic Loongson-3 CPU support.
>>    MIPS: Loongson: Introduce and use cpu_has_coherent_cache feature.
>>    MIPS: Loongson 3: Add Lemote-3A machtypes definition.
>>    MIPS: Loongson: Add UEFI-like firmware interface support.
>>    MIPS: Loongson 3: Add HT-linked PCI support.
>>    MIPS: Loongson 3: Add IRQ init and dispatch support.
>>    MIPS: Loongson 3: Add serial port support.
>>    MIPS: Loongson: Add swiotlb to support big memory (>4GB).
>>    MIPS: Loongson: Add Loongson-3 Kconfig options.
>>    MIPS: Loongson 3: Add Loongson-3 SMP support.
>>    MIPS: Loongson 3: Add CPU hotplug support.
>>    MIPS: Loongson: Add a Loongson-3 default config file.
>>
>> Signed-off-by: Huacai Chen
>> Signed-off-by: Hongliang Tao
>> Signed-off-by: Hua Yan
>> ---
>
> Hi,
>
> I have added all patches apart from 3/13 to my queue.
>
> I believe "MIPS: Loongson: Introduce and use cpu_has_coherent_cache
> feature." should be rewritten in a saner way.
>
> Please compile and runtime test the tree before I send it to Ralf
> --> http://git.linux-mips.org/?p=john/linux-john.git;a=shortlog;h=refs/heads/mips-next-3.9
>
> I cleaned up a few minor whitespace errors while merging.
>
> http://patchwork.linux-mips.org/patch/4547/ has a few comments. Please
> prepare a patch asap to address those so i can fold it into the series.
>
> John
Re: [RFC][PATCH] sched: Fix a deadlock of cpu-hotplug
> On Wed, 2012-10-24 at 20:34 +0800, 陈华才 wrote:
>> I see, this is an arch-specific bug, sorry for my carelessness and thank
>> you for your tips.
>
> What arch are you using? And what exactly did the arch do wrong? Most of
> the code involved seems to be common code.
>
> Going by c0_compare_interrupt, this is some MIPS device.

Yes, I'm using MIPS. In a place where local_irq_save()/local_irq_restore()
should be used, I used local_irq_disable()/local_irq_enable() by mistake.
Re: [RFC][PATCH] sched: Fix a deadlock of cpu-hotplug
> On Wed, 2012-10-24 at 17:25 +0800, Huacai Chen wrote:
>> We found poweroff sometimes fails on our computers, so we have the
>> lock debug options configured. Then, when we do poweroff or take a
>> cpu down via cpu-hotplug, the kernel complains as below. To resolve
>> this, we modify sched_ttwu_pending() to disable local irqs when
>> acquiring rq->lock.
>>
>> [   83.066406] =================================
>> [   83.066406] [ INFO: inconsistent lock state ]
>> [   83.066406] 3.5.0-3.lemote #428 Not tainted
>> [   83.066406] ---------------------------------
>> [   83.066406] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
>> [   83.066406] migration/1/7 [HC0[0]:SC0[0]:HE1:SE1] takes:
>> [   83.066406]  (&rq->lock){?.-.-.}, at: [<802585ac>] sched_ttwu_pending+0x64/0x98
>> [   83.066406] {IN-HARDIRQ-W} state was registered at:
>> [   83.066406]   [<8027c9ac>] __lock_acquire+0x80c/0x1cc0
>> [   83.066406]   [<8027e3d0>] lock_acquire+0x60/0x9c
>> [   83.066406]   [<8074ba04>] _raw_spin_lock+0x3c/0x50
>> [   83.066406]   [<8025a2fc>] scheduler_tick+0x48/0x178
>> [   83.066406]   [<8023b334>] update_process_times+0x54/0x70
>> [   83.066406]   [<80277568>] tick_handle_periodic+0x2c/0x9c
>> [   83.066406]   [<8020a818>] c0_compare_interrupt+0x8c/0x94
>> [   83.066406]   [<8029ec8c>] handle_irq_event_percpu+0x7c/0x248
>> [   83.066406]   [<802a2774>] handle_percpu_irq+0x8c/0xc0
>> [   83.066406]   [<8029e2c8>] generic_handle_irq+0x48/0x58
>> [   83.066406]   [<80205c04>] do_IRQ+0x18/0x24
>> [   83.066406]   [<802016e4>] mach_irq_dispatch+0xe4/0x124
>> [   83.066406]   [<80203ca0>] ret_from_irq+0x0/0x4
>> [   83.066406]   [<8022d114>] console_unlock+0x3e8/0x4c0
>> [   83.066406]   [<811ff0d0>] con_init+0x370/0x398
>> [   83.066406]   [<811fe3e0>] console_init+0x34/0x50
>> [   83.066406]   [<811e4844>] start_kernel+0x2f8/0x4e0
>> [   83.066406] irq event stamp: 971
>> [   83.066406] hardirqs last enabled at (971): [<8021c384>] local_flush_tlb_all+0x134/0x17c
>> [   83.066406] hardirqs last disabled at (970): [<8021c298>] local_flush_tlb_all+0x48/0x17c
>> [   83.066406] softirqs last enabled at (0): [<802298a4>] copy_process+0x510/0x117c
>> [   83.066406] softirqs last disabled at (0): [<  (null)>] (null)
>> [   83.066406]
>> [   83.066406] other info that might help us debug this:
>> [   83.066406]  Possible unsafe locking scenario:
>> [   83.066406]
>> [   83.066406]        CPU0
>> [   83.066406]        ----
>> [   83.066406]   lock(&rq->lock);
>> [   83.066406]   <Interrupt>
>> [   83.066406]     lock(&rq->lock);
>> [   83.066406]
>> [   83.066406]  *** DEADLOCK ***
>> [   83.066406]
>> [   83.066406] no locks held by migration/1/7.
>> [   83.066406]
>> [   83.066406] stack backtrace:
>> [   83.066406] Call Trace:
>> [   83.066406] [<80747544>] dump_stack+0x8/0x34
>> [   83.066406] [<8027ba04>] print_usage_bug+0x2ec/0x314
>> [   83.066406] [<8027be28>] mark_lock+0x3fc/0x774
>> [   83.066406] [<8027ca48>] __lock_acquire+0x8a8/0x1cc0
>> [   83.066406] [<8027e3d0>] lock_acquire+0x60/0x9c
>> [   83.066406] [<8074ba04>] _raw_spin_lock+0x3c/0x50
>> [   83.066406] [<802585ac>] sched_ttwu_pending+0x64/0x98
>> [   83.066406] [<80745ff4>] migration_call+0x10c/0x2e0
>> [   83.066406] [<80253110>] notifier_call_chain+0x44/0x94
>> [   83.066406] [<8022eae0>] __cpu_notify+0x30/0x5c
>> [   83.066406] [<8072b598>] take_cpu_down+0x5c/0x70
>> [   83.066406] [<80299ba4>] stop_machine_cpu_stop+0x104/0x1e8
>> [   83.066406] [<802997cc>] cpu_stopper_thread+0x110/0x1ac
>> [   83.066406] [<8024c940>] kthread+0x88/0x90
>> [   83.066406] [<80205ee4>] kernel_thread_helper+0x10/0x18
>
> Weird, that's from a CPU_DYING call, I thought those were with IRQs
> disabled.
>
> Look at how __stop_machine() calls the function with IRQs disabled for !
> stop_machine_initialized or !SMP. Also stop_machine_cpu_stop() seems to
> disable interrupts, so how do we end up calling take_cpu_down() with
> IRQs enabled?
>
> That simply doesn't make any sense.

I see, this is an arch-specific bug, sorry for my carelessness and thank
you for your tips.

>> Signed-off-by: Huacai Chen
>> ---
>>  kernel/sched/core.c | 5 +++--
>>  1 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 36e2666..703754a 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1468,9 +1468,10 @@ static void sched_ttwu_pending(void)
>>  {
>>  	struct rq *rq = this_rq();
>>  	struct llist_node *llist = llist_del_all(&rq->wake_list);
>> +	unsigned long flags;
>>  	struct task_struct *p;
>>
>> -	raw_spin_lock(&rq->lock);
>> +	raw_spin_lock_irqsave(&rq->lock, flags);
>>
>>  	while (llist) {
>>  		p = llist_entry(llist, struct task_struct, wake_entry);
>> @@ -1478,7 +1479,7 @@ static void sched_ttwu_pending(void)
>>  		ttwu_do_activate(rq, p, 0);
>>  	}
>>
>> -	raw_spin_unlock(&rq->lock);
>> +	raw_spin_unlock_irqrestore(&rq->lock, flags);
>>  }
>>
>>  void scheduler_ipi(void)
>
> That's wrong though, you add the cost to the common case instead of the
> hardly ever ran hotplug case.
Re: [RFC][PATCH] sched: Fix a deadlock of cpu-hotplug
On Wed, 2012-10-24 at 17:25 +0800, Huacai Chen wrote: We found poweroff sometimes fails on our computers, so we have the lock debug options configured. Then, when we do poweroff or take a cpu down via cpu-hotplug, kernel complain as below. To resove this, we modify sched_ttwu_pending(), disable the local irq when acquire rq-lock. [ 83.066406] = [ 83.066406] [ INFO: inconsistent lock state ] [ 83.066406] 3.5.0-3.lemote #428 Not tainted [ 83.066406] - [ 83.066406] inconsistent {IN-HARDIRQ-W} - {HARDIRQ-ON-W} usage. [ 83.066406] migration/1/7 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 83.066406] (rq-lock){?.-.-.}, at: [802585ac] sched_ttwu_pending+0x64/0x98 [ 83.066406] {IN-HARDIRQ-W} state was registered at: [ 83.066406] [8027c9ac] __lock_acquire+0x80c/0x1cc0 [ 83.066406] [8027e3d0] lock_acquire+0x60/0x9c [ 83.066406] [8074ba04] _raw_spin_lock+0x3c/0x50 [ 83.066406] [8025a2fc] scheduler_tick+0x48/0x178 [ 83.066406] [8023b334] update_process_times+0x54/0x70 [ 83.066406] [80277568] tick_handle_periodic+0x2c/0x9c [ 83.066406] [8020a818] c0_compare_interrupt+0x8c/0x94 [ 83.066406] [8029ec8c] handle_irq_event_percpu+0x7c/0x248 [ 83.066406] [802a2774] handle_percpu_irq+0x8c/0xc0 [ 83.066406] [8029e2c8] generic_handle_irq+0x48/0x58 [ 83.066406] [80205c04] do_IRQ+0x18/0x24 [ 83.066406] [802016e4] mach_irq_dispatch+0xe4/0x124 [ 83.066406] [80203ca0] ret_from_irq+0x0/0x4 [ 83.066406] [8022d114] console_unlock+0x3e8/0x4c0 [ 83.066406] [811ff0d0] con_init+0x370/0x398 [ 83.066406] [811fe3e0] console_init+0x34/0x50 [ 83.066406] [811e4844] start_kernel+0x2f8/0x4e0 [ 83.066406] irq event stamp: 971 [ 83.066406] hardirqs last enabled at (971): [8021c384] local_flush_tlb_all+0x134/0x17c [ 83.066406] hardirqs last disabled at (970): [8021c298] local_flush_tlb_all+0x48/0x17c [ 83.066406] softirqs last enabled at (0): [802298a4] copy_process+0x510/0x117c [ 83.066406] softirqs last disabled at (0): [ (null)] (null) [ 83.066406] [ 83.066406] other info that might help us debug this: [ 83.066406] 
Possible unsafe locking scenario: [ 83.066406] [ 83.066406]CPU0 [ 83.066406] [ 83.066406] lock(rq-lock); [ 83.066406] Interrupt [ 83.066406] lock(rq-lock); [ 83.066406] [ 83.066406] *** DEADLOCK *** [ 83.066406] [ 83.066406] no locks held by migration/1/7. [ 83.066406] [ 83.066406] stack backtrace: [ 83.066406] Call Trace: [ 83.066406] [80747544] dump_stack+0x8/0x34 [ 83.066406] [8027ba04] print_usage_bug+0x2ec/0x314 [ 83.066406] [8027be28] mark_lock+0x3fc/0x774 [ 83.066406] [8027ca48] __lock_acquire+0x8a8/0x1cc0 [ 83.066406] [8027e3d0] lock_acquire+0x60/0x9c [ 83.066406] [8074ba04] _raw_spin_lock+0x3c/0x50 [ 83.066406] [802585ac] sched_ttwu_pending+0x64/0x98 [ 83.066406] [80745ff4] migration_call+0x10c/0x2e0 [ 83.066406] [80253110] notifier_call_chain+0x44/0x94 [ 83.066406] [8022eae0] __cpu_notify+0x30/0x5c [ 83.066406] [8072b598] take_cpu_down+0x5c/0x70 [ 83.066406] [80299ba4] stop_machine_cpu_stop+0x104/0x1e8 [ 83.066406] [802997cc] cpu_stopper_thread+0x110/0x1ac [ 83.066406] [8024c940] kthread+0x88/0x90 [ 83.066406] [80205ee4] kernel_thread_helper+0x10/0x18 Weird, that's from a CPU_DYING call, I thought those were with IRQs disabled. Look at how __stop_machine() calls the function with IRQs disabled for ! stop_machine_initialized or !SMP. Also stop_machine_cpu_stop() seems to disabled interrupts, so how do we end up calling take_cpu_down() with IRQs enabled? That simply doesn't make any sense. I see, this is an arch-specific bug, sorry for my carelessness and thank you for your tips. 
Signed-off-by: Huacai Chen <che...@lemote.com>
---
 kernel/sched/core.c | 5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 36e2666..703754a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1468,9 +1468,10 @@ static void sched_ttwu_pending(void)
 {
 	struct rq *rq = this_rq();
 	struct llist_node *llist = llist_del_all(&rq->wake_list);
+	unsigned long flags;
 	struct task_struct *p;
 
-	raw_spin_lock(&rq->lock);
+	raw_spin_lock_irqsave(&rq->lock, flags);
 
 	while (llist) {
 		p = llist_entry(llist, struct task_struct, wake_entry);
@@ -1478,7 +1479,7 @@ static void sched_ttwu_pending(void)
 		ttwu_do_activate(rq, p, 0);
 	}
 
-	raw_spin_unlock(&rq->lock);
+	raw_spin_unlock_irqrestore(&rq->lock, flags);
Re: [RFC][PATCH] sched: Fix a deadlock of cpu-hotplug
On Wed, 2012-10-24 at 20:34 +0800, 陈华才 wrote:
>> I see, this is an arch-specific bug, sorry for my carelessness and
>> thank you for your tips.
>
> What arch are you using? And what exactly did the arch do wrong? Most of
> the code involved seems to be common code.
>
> Going by c0_compare_interrupt, this is some MIPS device.

Yes, I'm using MIPS. In a place where local_irq_save()/local_irq_restore()
should be used, I used local_irq_disable()/local_irq_enable() by mistake.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Seems like "sched: Add missing call to calc_load_exit_idle()" should be reverted in 3.5 branch
So I still think that "sched: Add missing call to calc_load_exit_idle()"
should be reverted in the 3.5 branch...

> On 10/06/2012 01:23 AM, Peter Zijlstra wrote:
>> On Fri, 2012-10-05 at 10:10 -0700, Jonathan Nieder wrote:
>>> Peter Zijlstra wrote:
>>>> On Thu, 2012-10-04 at 15:27 -0700, Greg Kroah-Hartman wrote:
>>>>> I'm puzzled as well. Any ideas if I should do anything here or not?
>>>>
>>>> So I think the current v3.5.5 code is fine.
>>>
>>> Now I'm puzzled. You wrote:
>>>
>>> | However, since we don't restart the tick, we won't be sampling load muck
>>> | and calling calc_load_exit_idle() from there is bound to confuse state.
>>>
>>> Doesn't that mean 900404e5d201 "sched: Add missing call to
>>> calc_load_exit_idle()" which is part of 3.5.5 was problematic? Or
>>> did I just miscount the number of "not"s?
>>
>> Argh, yeah, so now I've managed to confuse everyone I'm afraid.
>>
>> You are right, v3.5.5 has one calc_load_exit_idle() too many, the one in
>> tick_nohz_update_jiffies() needs to go.
>>
>> Sorry.. I got entirely confused figuring out wth happened with 3.6.
>
> High loadavg was reported with v3.6, and I just checked the upstream
> code, which puzzled many people. Sorry for that~
Re: Seems like "sched: Add missing call to calc_load_exit_idle()" should be reverted in 3.5 branch
My opinion: the original patch "sched/nohz: Rewrite and fix load-avg
computation -- again" was designed for the 3.5 branch, where
calc_load_exit_idle() is called directly in tick_nohz_idle_exit(). So the
patch can be fully applied to 3.5 and needs no fix (no call is missing
there), but it cannot be fully applied to 3.6 (because the code was split)
and does need the fix.

> On Thu, Oct 04, 2012 at 08:31:59PM +0200, Peter Zijlstra wrote:
>> On Thu, 2012-10-04 at 10:46 -0700, Greg Kroah-Hartman wrote:
>>> On Thu, Oct 04, 2012 at 12:11:01PM +0800, Huacai Chen wrote:
>>>> Hi, Greg
>>>>
>>>> I found that Linux-3.5.5 accepted the commit "sched: Add missing call
>>>> to calc_load_exit_idle()", but I think this isn't needed. Because
>>>> "5167e8d5417b sched/nohz: Rewrite and fix load-avg computation --
>>>> again not fully applied" is true for the 3.6 branch, but not for the
>>>> 3.5 branch.
>>>
>>> But 5167e8d5417b is in 3.5, so shouldn't this commit still be
>>> necessary?
>>>
>>>> In the 3.5 branch, calc_load_exit_idle() is already called in
>>>> tick_nohz_idle_exit(), so it doesn't need to be called in
>>>> tick_nohz_update_jiffies() again. In the 3.6 branch, some code of
>>>> tick_nohz_idle_exit() was split into tick_nohz_restart_sched_tick()
>>>> and calc_load_exit_idle() went missing by accident, so the commit
>>>> "sched: Add missing call to calc_load_exit_idle()" is needed.
>>>
>>> So this really should be dropped from 3.5? Charles, Peter, Ingo, any
>>> thoughts here?
>>
>> Bah, lots of code movement there recently.. let me try and untangle all
>> that afresh.. /me checks out v3.5.5.
>>
>> OK, assuming ->tick_stopped means what the label says, then we only want
>> to call calc_load_enter_idle() when it flips to 1 and
>> calc_load_exit_idle() when it flips back to 0, such that when an actual
>> tick happens it's got correct state.
>> Now the actual patch "5167e8d5417b sched/nohz: Rewrite and fix load-avg
>> computation -- again not fully applied" modifies
>> tick_nohz_restart_sched_tick(), which doesn't appear to exist in v3.5.5,
>> and the patch fobbed it into tick_nohz_update_jiffies(), which is called
>> from interrupt entry when nohz-idle so that the interrupt (and possible
>> trailing softirq) see a valid jiffies count.
>>
>> However, since we don't restart the tick, we won't be sampling load
>> muck, and calling calc_load_exit_idle() from there is bound to confuse
>> state.
>>
>> I hope.. damn this code ;-)
>>
>> I can't find wtf went wrong either; the initial patch 5167e8d5417bf5c
>> contains both hunks, but in that case the fixup 749c8814f0 doesn't make
>> sense, nor can I find anything in merge commits using:
>>
>> git log -S calc_load_exit_idle kernel/time/tick-sched.c
>>
>> /me puzzled
>
> I'm puzzled as well. Any ideas if I should do anything here or not?
>
> greg k-h

--
Jiangsu Zhongke Menglan Electronic Technology Co., Ltd. (Lemote), Software Dept.
Chen Huacai
E-mail: che...@lemote.com
Web: http://www.lemote.com/
Add: Menglan Industrial Park, Yushan Town, Changshu, Jiangsu Province