[PATCH] netfilter: fix dependency issues between IPv6 defragmentation and ip6tables

2015-05-03 Thread Liu Hua
Commit f6318e558806c925029dc101f14874be9f9fa78f fixed a related issue
for the case where ip6tables is enabled. However, when IP6_NF_IPTABLES
is disabled and NETFILTER_XT_TARGET_TPROXY is enabled, the build fails
with:

net/built-in.o: In function `tproxy_tg_init':
net/netfilter/xt_TPROXY.c:588: undefined reference to `nf_defrag_ipv6_enable'

So this patch makes the Kconfig select NF_DEFRAG_IPV6 unconditionally,
as is already done for IPv4.

Signed-off-by: Liu Hua 
---
 net/netfilter/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index f70e34a..34f54a8 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -865,7 +865,7 @@ config NETFILTER_XT_TARGET_TPROXY
depends on (IPV6 || IPV6=n)
depends on IP_NF_MANGLE
select NF_DEFRAG_IPV4
-   select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES
+   select NF_DEFRAG_IPV6
help
  This option adds a `TPROXY' target, which is somewhat similar to
  REDIRECT.  It can only be used in the mangle table and is useful
-- 
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] irqchip: mark the irq status in handle_percpu_devid_irq

2015-01-28 Thread Liu Hua
handle_percpu_devid_irq() does not update the irq's state while
the handler runs. The handler may fail, so we should mark the
irq as in progress. This is very useful when kdump is deployed
and the handler panics: kdump can then deal with the situation
in machine_kexec_mask_interrupts(), which performs the required
operations on the interrupt controller (such as writing EOI).

Without this patch, kdump cannot proceed if the first kernel
panics in a per-cpu interrupt handler (such as hrtimers).
I can reproduce this bug on an ARM SMP platform with per-cpu
timers.

I have sent a related patchset, but I think that is not the
best way to solve this problem.

  link:lkml.org/lkml/2014/8/4/18

Signed-off-by: Liu Hua 
---
 kernel/irq/chip.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6f1c7a5..3328a45 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -710,10 +710,18 @@ void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
if (chip->irq_ack)
chip->irq_ack(&desc->irq_data);
 
+   raw_spin_lock(&desc->lock);
+   irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
+   raw_spin_unlock(&desc->lock);
+
trace_irq_handler_entry(irq, action);
res = action->handler(irq, dev_id);
trace_irq_handler_exit(irq, action, res);
 
+   raw_spin_lock(&desc->lock);
+   irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
+   raw_spin_unlock(&desc->lock);
+
if (chip->irq_eoi)
chip->irq_eoi(&desc->irq_data);
 }
-- 
1.9.0
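
The effect of marking the irq as in progress can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `toy_irq`, `toy_handle_percpu_irq` and `toy_kexec_mask` are hypothetical names, a `longjmp` stands in for the panic, and the masking routine is a simplified reading of what the commit message says machine_kexec_mask_interrupts does (write EOI only for interrupts marked in progress). It is not kernel code.

```c
#include <setjmp.h>
#include <stdbool.h>

/* Toy stand-in for the parts of irq_desc that matter here (hypothetical). */
struct toy_irq {
	bool inprogress;   /* models IRQD_IRQ_INPROGRESS */
	bool hw_active;    /* models the active state held in the interrupt controller */
};

static jmp_buf toy_panic_env;

/* A handler that "panics": control never returns to the flow handler. */
static void toy_panicking_handler(void)
{
	longjmp(toy_panic_env, 1);
}

/* Flow handler; mark_state selects the behaviour with or without the patch. */
static void toy_handle_percpu_irq(struct toy_irq *irq, void (*handler)(void),
				  bool mark_state)
{
	irq->hw_active = true;           /* acknowledging makes the interrupt active */
	if (mark_state)
		irq->inprogress = true;
	handler();
	if (mark_state)
		irq->inprogress = false;
	irq->hw_active = false;          /* models the EOI on the normal path */
}

/* Crash path: only interrupts marked in progress receive an EOI. */
static void toy_kexec_mask(struct toy_irq *irq)
{
	if (irq->inprogress) {
		irq->hw_active = false;
		irq->inprogress = false;
	}
}

/* Returns true if the interrupt is still active when the capture kernel starts. */
static bool toy_stuck_after_panic(bool mark_state)
{
	/* static, so its contents stay well defined across the longjmp */
	static struct toy_irq irq;

	irq.inprogress = false;
	irq.hw_active = false;
	if (setjmp(toy_panic_env) == 0)
		toy_handle_percpu_irq(&irq, toy_panicking_handler, mark_state);
	toy_kexec_mask(&irq);
	return irq.hw_active;
}
```

In this model, `toy_stuck_after_panic(false)` leaves the interrupt active because the crash path has nothing to key on, while `toy_stuck_after_panic(true)` lets the crash path issue the EOI.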



Fwd: [RFD]:Is there any method to snapshot all information from both tasks on cpus and cpus when kernel crash occurs

2014-10-24 Thread Liu hua
Hi all,

On the arm32 platform, I want to know whether the kernel already has a
mechanism that can snapshot task and CPU information as follows. If not,
could such a mechanism be accepted by the kernel?



...





TASK: [pid 1], [tgid 1], [linuxrc] [state S] [policy 0] [cpu 3]

 EIP address: [<401945a4>] 0x401945a4

   r0 = ;   r1 = 

   r2 = ;  r3 = 

   r4 = ;  r5 = 

   r6 = ;   r7 = 0072

   r8 = 0050;  r9 = 

   r10 = beec1e34; fp = 

   ip = 00095414;  sp = beec1bc8

   lr = 00069f2c;pc = 401945a4

   cpsr = 6210;

[] (schedule+0x4fc/0x61c)

[] (do_wait+0x1c0/0x22c)

[] (sys_wait4+0xa0/0xc0)

[] (ret_fast_syscall+0x0/0x3c)

TASK: [pid 2], [tgid 2], [kthreadd] [state S] [policy 0] [cpu 1]

 EIP address: [] kthreadd+0x8c/0x14c

[] (schedule+0x4fc/0x61c)

[] (kthreadd+0x8c/0x14c)

[] (kernel_thread_exit+0x0/0x8)

...





TASK: [pid 29666], [tgid 29666], [kstop/1] [state R] [policy 1] [cpu 1]

[] (ksnapshot_unwind_backtrace.part.2+0x50/0x154)

[] (us_dump_stack+0x38/0x44 [rtos_snapshot])

[] (snapshot_cpu_info+0x60/0x4a8 [rtos_snapshot])

[] (ksnapshot_taskinfo_buffer+0x320/0x4a8 [rtos_snapshot])

[] (fiq_callcack_handler+0x18/0x30 [test_ks])

[] (fiq_real_handle+0x24/0x30)

[] (do_IPI+0x64/0x1c0)

[] (__irq_svc+0x44/0xe0)

[] (stop_cpu+0x104/0x124)

[] (worker_thread+0x1a4/0x248)

[] (kthread+0x78/0x84)

[] (kernel_thread_exit+0x0/0x8)



...



...



Thanks
Yan













[PATCH] Documentation: correct parameter error for dma_mapping_error

2014-09-17 Thread Liu Hua
dma_mapping_error() takes two parameters, but some of the examples
in Documentation/DMA-API-HOWTO.txt pass only one. Correct them.

Signed-off-by: Liu Hua 
---
 Documentation/DMA-API-HOWTO.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt
index dcbbe36..0f7afb2 100644
--- a/Documentation/DMA-API-HOWTO.txt
+++ b/Documentation/DMA-API-HOWTO.txt
@@ -531,7 +531,7 @@ To map a single region, you do:
size_t size = buffer->len;
 
dma_handle = dma_map_single(dev, addr, size, direction);
-   if (dma_mapping_error(dma_handle)) {
+   if (dma_mapping_error(dev, dma_handle)) {
/*
 * reduce current DMA mapping usage,
 * delay and try again later or
@@ -588,7 +588,7 @@ Specifically:
size_t size = buffer->len;
 
dma_handle = dma_map_page(dev, page, offset, size, direction);
-   if (dma_mapping_error(dma_handle)) {
+   if (dma_mapping_error(dev, dma_handle)) {
/*
 * reduce current DMA mapping usage,
 * delay and try again later or
@@ -689,7 +689,7 @@ to use the dma_sync_*() interfaces.
dma_addr_t mapping;
 
mapping = dma_map_single(cp->dev, buffer, len, DMA_FROM_DEVICE);
-   if (dma_mapping_error(dma_handle)) {
+   if (dma_mapping_error(cp->dev, dma_handle)) {
/*
 * reduce current DMA mapping usage,
 * delay and try again later or
-- 
1.9.0



Re: [PATCH V2 1/1] GIC: introduce method to deactivate interrupts

2014-08-06 Thread Liu hua
On 2014/8/4 17:43, Marc Zyngier wrote:
> Hi Liu,
> 
> On 04/08/14 05:17, Liu Hua wrote:
>> When using kdump on an ARM platform, if the kernel panics in an interrupt
>> handler (maybe a PPI), the capture kernel cannot receive certain
>> interrupts and fails to boot.
>>
>> In this situation, we have read register GICC_IAR, but we have had no
>> chance to make the matching write to register GICC_EOIR (the kernel
>> panicked first). So the state of this interrupt remains active, and that
>> makes the GIC not deliver this interrupt to the CPU interface.
>>
>> So we should not assume that all GIC interrupt states are inactive when
>> the kernel initializes the GIC. This patch identifies such interrupts
>> and deactivates them.
>>
>> Signed-off-by: Liu Hua 
>> ---
>>  drivers/irqchip/irq-gic.c | 26 ++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>> index b2648fc..7708df1 100644
>> --- a/drivers/irqchip/irq-gic.c
>> +++ b/drivers/irqchip/irq-gic.c
>> @@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>>  return mask;
>>  }
>>  
>> +void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
>> +{
>> +int bit = -1;
>> +
>> +for_each_set_bit(bit, (unsigned long *)&active, 32)
>> +writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
>> +}
>> +
>> +void gic_dist_clear_active(void __iomem *dist_base,
>> +void __iomem *cpu_base, int gic_irqs)
>> +{
>> +int irq, offset;
>> +u32 active;
>> +
>> +for (irq = 0; irq < gic_irqs; irq += 32) {
>> +offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
>> +active = readl_relaxed(dist_base + offset);
>> +if (!active)
>> +continue;
>> +gic_eois(active, irq, cpu_base);
>> +}
>> +}
>> +
>> +
>>  static void __init gic_dist_init(struct gic_chip_data *gic)
>>  {
>>  unsigned int i;
>>  u32 cpumask;
>>  unsigned int gic_irqs = gic->gic_irqs;
>>  void __iomem *base = gic_data_dist_base(gic);
>> +void __iomem *cpu_base = gic_data_cpu_base(gic);
>>  
>>  writel_relaxed(0, base + GIC_DIST_CTRL);
>>  
>> @@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>>  
>>  gic_dist_config(base, gic_irqs, NULL);
>>  
>> +gic_dist_clear_active(base, cpu_base, gic_irqs);
>>  writel_relaxed(1, base + GIC_DIST_CTRL);
>>  }
> 
> So while this is solving a real issue, I don't think you can just fix it
> for the UP case. You'll have to fix the same thing for secondary CPUs
> (shouldn't be too hard to split things between local and global interrupts).
Hi Marc,

Thanks very much for your reply!

When I tried to implement your idea, I found that, with kdump deployed
and without my patch:

(1) after a panic in a PPI handler, the capture kernel cannot boot;
(2) after a panic in an SPI handler, the capture kernel boots normally.

I was confused and thought there might be something I had missed. I looked
at the kdump code and found the function machine_kexec_mask_interrupts(). It
clears the GIC active state only if the IRQD_IRQ_INPROGRESS bit in
d->state_use_accessors is set.

The PPI handler does not set this flag. So there are two ways to solve this
problem:

 (1) treat this problem as a general one, as you and I thought before, and
     also fix it for secondary CPUs;

 (2) just set the IRQD_IRQ_INPROGRESS flag in the PPI flow handler, with a
     patch like this:

-(2) patch start---
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index a2b28a2..0a5dfe0 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -677,10 +677,18 @@ void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
if (chip->irq_ack)
chip->irq_ack(&desc->irq_data);

+ raw_spin_lock(&desc->lock);
+ irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
+ raw_spin_unlock(&desc->lock);
+
trace_irq_handler_entry(irq, action);
res = action->handler(irq, dev_id);
trace_irq_handler_exit(irq, action, res);

+ raw_spin_lock(&desc->lock);
+ irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
+ raw_spin_unlock(&desc->lock);
+
if (chip->irq_eoi)
chip->irq_eoi(&desc->irq_data);
 }
-(2) patch end---

Way 2 seems to be needed anyway.
For way 1, I have not found another situation in which GIC interrupt states
remain active when the kernel boots.
And for the kdump case, way 2 is enough.

What do you think?

Thanks,
Liu Hua

> Thanks,
> 
>   M.
> 




Re: [PATCH v3 2/2] ARM : change fixmap mapping region to support 32 CPUs

2014-08-05 Thread Liu hua
On 2014/8/6 10:51, Kees Cook wrote:
> On Fri, May 30, 2014 at 12:25 PM, Nicolas Pitre wrote:
>> On Fri, 30 May 2014, Rob Herring wrote:
>>
>>> There's work in flight to support early_ioremap, early console, and RO
>>> text patching which all use the fixmap region.
>>>
>>> There's a couple of options to solve this:
>>>
>>> - Only support up to 16 cpus. It could be anywhere between 17-31, but
>>> that seems somewhat unlikely. Are we really ever going to see 32-bit
>>> 32 core systems?
>>
>> I wouldn't rule that out.  I've seen 16-core ARM chips in 2008 (although
>> they didn't go into production).  Silly limitations like that always
>> come back to bite you.  And we have better alternatives.
>>
>>> - Reduce KM_TYPE_NR from 16 to 15. Based on the comment for it, we
>>> probably don't want to do that. Is increasing it to the default of 20
>>> worthwhile? Some of the options here would allow doing that.
>>> - Add 0xffe00000-0xfff00000 to the fixmap region. This would make
>>> fixmap span 2 PMDs with the top PMD having a mixture of uses like we
>>> had before.
>>
>> That would be my preferred approach.  Note here it could be
>> 0xffe00000-0xfffe0000 to include the whole of the previous fixmap area
>> currently unused.
>>
>>> - push the PCI i/o space down to 0xfec00000 and make fixmap 4MB. This
>>> is a cleaner solution as the 2 PMDs are only used for fixmap. This may
>>> require some static mapping adjustments on some platforms.
>>
>> No need.  With the latest changes, the fixmap area is between 0xffc00000
>> and 0xffe00000 (there is apparently a mistake in
>> Documentation/arm/memory.txt).  So currently 0xff000000-0xffc00000 is
>> free, which makes the fixmap area far away from the PCI i/o area with
>> plenty of space in between.
> 
> So, it seems there is something wrong with this patch series. I had to
> revert "ARM: 8031/2: change fixmap mapping region to support 32 CPUs"
> to make other fixmap changes work correctly. I think this is due to
> the non-highmem config case moving the fixmap to a location where
> there is no page table entry...
Hi Kees,

Does this patch conflict with other patches now, or will it in the future?

As Rob said, "There's work in flight to support early_ioremap, early console,
and RO text patching which all use the fixmap region." So if this patch blocks
new kernel features, should we write a new patch to change it rather than
revert it? There are ARM platforms with more than 14 CPUs.

Thanks,
Liu Hua
> 
> -Kees
> 
>>
>>> - Same as previous option, but convert the PCI i/o space to fixmap
>>> entries. We don't really need all 2MB for PCI.
>>
>> See above.
>>
>>> Also, there is an error in the documentation below:
>>>
>>>>
>>>> Signed-off-by: Liu Hua 
>>>> ---
>>>>  Documentation/arm/memory.txt   |  2 +-
>>
>> Yep, good that you spotted it as well.  I failed to catch it during my
>> review so I'll send a patch.
>>
>>
>> Nicolas
>>
>> ___
>> linux-arm-kernel mailing list
>> linux-arm-ker...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> 
> 
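
The CPU limits discussed in this thread follow from simple arithmetic, sketched below. The assumptions are loud ones: 4 KiB pages, one fixmap page per kmap slot, KM_TYPE_NR slots per CPU (16, as quoted above), a new fixmap range of 0xffc00000-0xffe00000 as read from Nicolas's message, and an old range of 0xfff00000-0xfffe0000, which is inferred rather than stated in the thread. `fixmap_max_cpus` is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed 4 KiB pages */

/*
 * Number of CPUs whose kmap slots fit in [start, end),
 * with slots_per_cpu pages reserved for each CPU.
 */
static unsigned int fixmap_max_cpus(uint32_t start, uint32_t end,
				    unsigned int slots_per_cpu)
{
	uint32_t pages = (end - start) / TOY_PAGE_SIZE;

	return pages / slots_per_cpu;
}
```

Under these assumptions a 2 MiB region holds 512 pages, enough for 32 CPUs at 16 slots each, while the assumed old range holds 224 pages, which gives the limit of 14 CPUs mentioned in the reply. Raising KM_TYPE_NR to 20 would reduce the 2 MiB figure to 25 CPUs.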




[PATCH V2 0/1] irqchip: GIC: check and clear GIC interrupt active state

2014-08-03 Thread Liu Hua
With the current GIC code, the kernel assumes that every GIC interrupt
is inactive, so it does not check this when booting.

This is not a problem in most situations. But when kdump is deployed
and a panic occurs while an interrupt (maybe a PPI) is being handled,
we have no chance to make the matching write to GICC_EOIR, so the
interrupt remains active. The GIC will then not deliver this interrupt
to the CPU interface, and the capture kernel may fail to boot for lack
of certain interrupts (such as the timer interrupt).

I have tested this patch on arma9el (GIC v1), arma15el and arma15eb
(GIC v2) platforms, and the tests passed.

Changes from V1:

  - use for_each_set_bit instead of find_next_bit
  - remove the GIC version identification code
  - use a single way to deactivate GIC interrupt states for all GIC versions

Liu Hua (1):
  GIC: introduce method to deactivate interrupts

 drivers/irqchip/irq-gic.c | 26 ++
 1 file changed, 26 insertions(+)

-- 
1.9.0



[PATCH V2 1/1] GIC: introduce method to deactivate interrupts

2014-08-03 Thread Liu Hua
When using kdump on an ARM platform, if the kernel panics in an interrupt
handler (maybe a PPI), the capture kernel cannot receive certain
interrupts and fails to boot.

In this situation, we have read register GICC_IAR, but we have had no
chance to make the matching write to register GICC_EOIR (the kernel
panicked first). So the state of this interrupt remains active, and that
makes the GIC not deliver this interrupt to the CPU interface.

So we should not assume that all GIC interrupt states are inactive when
the kernel initializes the GIC. This patch identifies such interrupts
and deactivates them.

Signed-off-by: Liu Hua 
---
 drivers/irqchip/irq-gic.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index b2648fc..7708df1 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
return mask;
 }
 
+void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
+{
+   int bit = -1;
+
+   for_each_set_bit(bit, (unsigned long *)&active, 32)
+   writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
+}
+
+void gic_dist_clear_active(void __iomem *dist_base,
+   void __iomem *cpu_base, int gic_irqs)
+{
+   int irq, offset;
+   u32 active;
+
+   for (irq = 0; irq < gic_irqs; irq += 32) {
+   offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
+   active = readl_relaxed(dist_base + offset);
+   if (!active)
+   continue;
+   gic_eois(active, irq, cpu_base);
+   }
+}
+
+
 static void __init gic_dist_init(struct gic_chip_data *gic)
 {
unsigned int i;
u32 cpumask;
unsigned int gic_irqs = gic->gic_irqs;
void __iomem *base = gic_data_dist_base(gic);
+   void __iomem *cpu_base = gic_data_cpu_base(gic);
 
writel_relaxed(0, base + GIC_DIST_CTRL);
 
@@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
gic_dist_config(base, gic_irqs, NULL);
 
+   gic_dist_clear_active(base, cpu_base, gic_irqs);
writel_relaxed(1, base + GIC_DIST_CTRL);
 }
 
-- 
1.9.0
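
The scan performed by gic_dist_clear_active() and gic_eois() can be modelled in user space as follows. This is a sketch, not the driver: `toy_gic`, `toy_write_eoi` and `toy_clear_active` are hypothetical names, the register file is a plain array, and an EOI write is assumed to drop the active bit of the interrupt ID written.

```c
#include <stdint.h>

#define TOY_MAX_IRQS 128

/* Toy register file: one 32-bit "active" word per 32 interrupts (hypothetical). */
struct toy_gic {
	uint32_t active[TOY_MAX_IRQS / 32];
	int eoi_log[TOY_MAX_IRQS];   /* interrupt IDs written to the EOI register */
	int eoi_count;
};

/* Models a write to the EOI register: record the ID and drop its active bit. */
static void toy_write_eoi(struct toy_gic *gic, int irq)
{
	gic->eoi_log[gic->eoi_count++] = irq;
	gic->active[irq / 32] &= ~(UINT32_C(1) << (irq % 32));
}

/* Walk the active words 32 interrupts at a time, as the patch does. */
static void toy_clear_active(struct toy_gic *gic, int nr_irqs)
{
	for (int irq = 0; irq < nr_irqs; irq += 32) {
		/* Byte offset of this word from the first one is irq * 4 / 32. */
		uint32_t word = gic->active[(irq * 4 / 32) / 4];

		if (!word)
			continue;
		/* Equivalent of for_each_set_bit() over a 32-bit word. */
		for (int bit = 0; bit < 32; bit++)
			if (word & (UINT32_C(1) << bit))
				toy_write_eoi(gic, irq + bit);
	}
}
```

The interrupt ID written is the bit position plus the base of the word, which is why the V2 patch passes `irq_off` down to the EOI helper.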



Re: [RFC PATCH 0/2] irqchip: GIC: check and clear GIC interrupt active status

2014-07-13 Thread Liu hua
On 2014/7/11 20:35, Will Deacon wrote:
> [adding Marc]
> 
> On Fri, Jul 11, 2014 at 07:46:15AM +0100, Liu Hua wrote:
>> With the current GIC code, the kernel assumes that every GIC interrupt
>> is inactive, so it does not check this when booting.
>>
>> This is not a problem in most situations. But when kdump is deployed
>> and a panic occurs while an interrupt (maybe a PPI or an SPI) is being
>> handled, we have no chance to make the matching write to GICC_EOIR, so
>> the interrupt remains active. The GIC will then not deliver this
>> interrupt to the CPU interface, and the capture kernel may fail to boot
>> for lack of certain interrupts (such as the timer interrupt).
>>
>>
>> I looked through the GIC Architecture Specification but did not find a
>> simple way to deactivate all interrupts at once. On GICv1, I can only
>> handle one stale active interrupt at a time. On GICv2, I can deactivate
>> 32 at a time.
>>
>>
>> So, does anyone know a better way to do this?
> 
> What happens if, in the crash kernel, you disable the CPU interfaces
> (GICC_CTLR.ENABLE) then disable the distributor (GICD_CTLR.ENABLE) before
> enabling everything again in the reverse order? Is that enough to cause the
> GIC to drop any active states? It's not clear to me from a quick look at
> the TRM.
> 
Hi Will,

Thanks for your reply!

I did what you suggested at the beginning of gic_dist_init(). The active state
remained (after a panic in the local timer interrupt, a PPI) and the kernel
failed to boot. Did I do that in the wrong place?

---
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index b6b0a81..94d6352 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -454,6 +455,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
void __iomem *base = gic_data_dist_base(gic);
void __iomem *cpu_base = gic_data_cpu_base(gic);

+ writel_relaxed(0, base + GIC_CPU_CTRL);
writel_relaxed(0, base + GIC_DIST_CTRL);

/*


As far as I can tell from the GIC Architecture Specification, GICC_CTLR.ENABLE
and GICD_CTLR.ENABLE only control the delivery of interrupts, not their active
states.

The GIC manual says: "For every read of a valid Interrupt ID from the GICC_IAR,
the connected processor must perform a matching write to the GICC_EOIR". So
unless these active states are meant to persist by design, we should find a way
to drop them when booting.

Thanks,
Liu Hua




> Will
> 
> .
> 
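
The behaviour described in this reply can be captured in a toy model. It encodes only what the message claims (enable bits gate delivery, and an acknowledged interrupt stays active until the matching EOI); it is not derived from the specification, and `toy_cpuif`, `toy_ack`, `toy_eoi` and `toy_set_enable` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy single-CPU interrupt controller tracking 32 interrupt IDs (hypothetical). */
struct toy_cpuif {
	bool enabled;       /* models the ENABLE bits */
	uint32_t pending;
	uint32_t active;
};

/* Models reading GICC_IAR: returns an ID and makes it active, or -1 if none. */
static int toy_ack(struct toy_cpuif *g)
{
	if (!g->enabled)
		return -1;
	for (int id = 0; id < 32; id++) {
		uint32_t bit = UINT32_C(1) << id;

		/* An interrupt that is still active is not delivered again. */
		if ((g->pending & bit) && !(g->active & bit)) {
			g->pending &= ~bit;
			g->active |= bit;
			return id;
		}
	}
	return -1;
}

/* Models the matching write to GICC_EOIR. */
static void toy_eoi(struct toy_cpuif *g, int id)
{
	g->active &= ~(UINT32_C(1) << id);
}

/* Toggling the enable bit gates delivery only; the active state is untouched. */
static void toy_set_enable(struct toy_cpuif *g, bool on)
{
	g->enabled = on;
}
```

In this model, acknowledging interrupt 29, skipping the EOI, and then toggling the enable bit off and on still leaves interrupt 29 undeliverable; only the EOI write makes it deliverable again.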




[RFC PATCH 0/2] irqchip: GIC: check and clear GIC interrupt active status

2014-07-11 Thread Liu Hua
With the current GIC code, the kernel assumes that every GIC interrupt
is inactive, so it does not check this when booting.

This is not a problem in most situations. But when kdump is deployed
and a panic occurs while an interrupt (maybe a PPI or an SPI) is being
handled, we have no chance to make the matching write to GICC_EOIR, so
the interrupt remains active. The GIC will then not deliver this
interrupt to the CPU interface, and the capture kernel may fail to boot
for lack of certain interrupts (such as the timer interrupt).


I looked through the GIC Architecture Specification but did not find a
simple way to deactivate all interrupts at once. On GICv1, I can only
handle one stale active interrupt at a time. On GICv2, I can deactivate
32 at a time.


So, does anyone know a better way to do this?


Liu Hua (2):
  irqchip: gic: introduce ICPIDR2 register interface
  irqchip: GIC: introduce method to deactivate interrupts

 drivers/irqchip/irq-gic.c   | 57 +
 include/linux/irqchip/arm-gic.h |  1 +
 2 files changed, 58 insertions(+)

-- 
1.9.0



[RFC PATCH 1/2] irqchip: GIC: introduce ICPIDR2 register interface

2014-07-11 Thread Liu Hua
The Peripheral ID2 Register provides a four-bit, architecturally defined
architecture revision field, so we can identify the GIC version from
this register. This is sometimes useful.

Signed-off-by: Liu Hua 
---
 include/linux/irqchip/arm-gic.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 45e2d8c..872a562 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -38,6 +38,7 @@
 #define GIC_DIST_SOFTINT   0xf00
 #define GIC_DIST_SGI_PENDING_CLEAR 0xf10
 #define GIC_DIST_SGI_PENDING_SET   0xf20
+#define GIC_DIST_ICPIDR2   0xfe8
 
 #define GICH_HCR   0x0
 #define GICH_VTR   0x4
-- 
1.9.0
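
Reading the revision field out of a Peripheral ID2 value is a shift and a mask, as the companion patch does with `(ArchRev >> 4) & 0xF`. The helper name `gic_arch_rev` below is hypothetical, and the input values used to exercise it are made-up samples, not values read from hardware.

```c
#include <stdint.h>

/* Extract the four-bit architecture revision field, bits [7:4], from ICPIDR2. */
static unsigned int gic_arch_rev(uint32_t icpidr2)
{
	return (icpidr2 >> 4) & 0xF;
}
```

For example, a sample value of 0x1b yields revision 1 and 0x2b yields revision 2; bits outside [7:4] do not affect the result.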



[RFC PATCH 2/2] irqchip: GIC: introduce method to deactivate interrupts

2014-07-11 Thread Liu Hua
When using kdump on an ARM platform, if the kernel panics in an interrupt
handler (maybe a PPI or an SPI), the capture kernel cannot receive certain
interrupts and fails to boot.

In this situation, we have read register GICC_IAR, but we have had no
chance to make the matching write to register GICC_EOIR (the kernel
panicked first). So the state of this interrupt remains active, and that
makes the GIC not deliver this interrupt to the CPU interface.

So we should not assume that all GIC interrupt states are inactive when
the kernel initializes the GIC. This patch identifies such interrupts
and deactivates them.

Signed-off-by: Liu Hua 
---
 drivers/irqchip/irq-gic.c | 57 +++
 1 file changed, 57 insertions(+)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index ddee133..d8620cf 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -352,12 +352,68 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
return mask;
 }
 
+void gic_v1_eois(u32 active, void __iomem *cpu_base)
+{
+   int bit = -1;
+
+   while ((bit = find_next_bit((unsigned long *) &active,
+   32, bit + 1)) < 32) {
+   writel_relaxed(bit, cpu_base + GIC_CPU_EOI);
+   }
+}
+
+void gic_v1_clear_active(void __iomem *dist_base,
+   void __iomem *cpu_base, int gic_irqs)
+{
+   int irq, offset;
+   u32 active;
+
+   for (irq = 0; irq < gic_irqs; irq += 32) {
+   offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
+   active = readl_relaxed(dist_base + offset);
+   if (!active)
+   continue;
+   gic_v1_eois(active, cpu_base);
+   }
+}
+
+void gic_v2_clear_active(void __iomem *dist_base, int gic_irqs)
+{
+   int irq, offset;
+   u32 active;
+
+   for (irq = 0; irq < gic_irqs; irq += 32) {
+   offset = irq * 4 / 32 + GIC_DIST_ACTIVE_SET;
+   active = readl_relaxed(dist_base + offset);
+   if (!active)
+   continue;
+   offset = irq * 4 / 32 + GIC_DIST_ACTIVE_CLEAR;
+   writel_relaxed(active, dist_base + offset);
+   }
+}
+
+void __init gic_dist_clear_active(void __iomem *dist_base,
+   void __iomem *cpu_base, int gic_irqs)
+{
+   u32 ArchRev;
+
+   ArchRev = readl_relaxed(dist_base + GIC_DIST_ICPIDR2);
+   ArchRev = (ArchRev >> 4) & 0xF;
+
+   if (ArchRev == 0x1) {
+   gic_v1_clear_active(dist_base, cpu_base, gic_irqs);
+   } else {
+   gic_v2_clear_active(dist_base, gic_irqs);
+   }
+}
+
 static void __init gic_dist_init(struct gic_chip_data *gic)
 {
unsigned int i;
u32 cpumask;
unsigned int gic_irqs = gic->gic_irqs;
void __iomem *base = gic_data_dist_base(gic);
+   void __iomem *cpu_base = gic_data_cpu_base(gic);
 
writel_relaxed(0, base + GIC_DIST_CTRL);
 
@@ -371,6 +427,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
 
gic_dist_config(base, gic_irqs, NULL);
+   gic_dist_clear_active(base, cpu_base, gic_irqs);
 
writel_relaxed(1, base + GIC_DIST_CTRL);
 }
-- 
1.9.0
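
The GICv2 path in this patch reads a word of active bits and writes the same value to the corresponding clear register, which deactivates up to 32 interrupts with one write. The sketch below models that pattern, assuming write-one-to-clear semantics as the read-then-write-back sequence implies; `toy_active_reg` and its helpers are hypothetical names.

```c
#include <stdint.h>

/* Toy pair of set/clear registers backed by one state word (hypothetical). */
struct toy_active_reg {
	uint32_t state;
};

/* Reading the "set" register returns the current active bits. */
static uint32_t toy_read_active_set(const struct toy_active_reg *r)
{
	return r->state;
}

/*
 * Writing the "clear" register deactivates every interrupt whose bit is 1;
 * bits written as 0 are left alone.
 */
static void toy_write_active_clear(struct toy_active_reg *r, uint32_t val)
{
	r->state &= ~val;
}

/* One read and one write deactivate up to 32 interrupts. Returns the bits cleared. */
static uint32_t toy_clear_word(struct toy_active_reg *r)
{
	uint32_t active = toy_read_active_set(r);

	if (active)
		toy_write_active_clear(r, active);
	return active;
}
```

By contrast, the GICv1 path in the patch issues one EOI write per active interrupt, which is the "one at a time" limitation mentioned in the cover letter.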



Re: Should Pstore(ramoops) records customized information?

2014-06-27 Thread Liu hua
于 2014/6/26 8:57, Zhang, Yanmin 写道:
> 
> On 2014/6/25 21:08, Liu hua wrote:
>> 于 2014/6/25 8:41, Zhang, Yanmin 写道:
>>> On 2014/6/20 18:47, Liu hua wrote:
>>>> On 2014/6/20 7:42, Luck, Tony wrote:
>>>>
>>>>>> BTW, I note that "extern struct pstore_info *psinfo" locates in
>>>>>> fs/pstore/internal.h. So users out of directory "fs/pstore/" can not use 
>>>>>> pstore to
>>>>>> record messages. We do not want other kernel users to use pstore, right? 
>>>>>>  And can we
>>>>>> break this?
>>>>> Yes we can make some interface visible to the rest of the kernel ... 
>>>>> probably
>>>>> not the raw "*psinfo" though. Perhaps the pstore_alloc_ring_buffer() and
>>>>> pstore_write_ring_buffer() functions should be the ones exported to the
>>>>> rest of the kernel.
>>>>>
>>>>>> ditoo.. Since other backends like efi and erst may can not privide such 
>>>>>> ring buffer.
>>>>>> So pstore_alloc_ring_buffer should be a funciton pointer of pstore_info 
>>>>>> struct.
>>>>> Yes - that allows less capable backend like ERST and efivars to not 
>>>>> provide the
>>>>> service.  Since it becomes internal, you can drop the "pstore_" prefix.  
>>>>> E.g.
>>>>> something like:
>>>>>
>>>>> int pstore_alloc_ring_buffer(char *name, int size)
>>>>> {
>>>>>  return psinfo->alloc_ring_buffer(name, size);
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(pstore_alloc_ring_buffer);
>>>>>
>>>>> ... and you have to find/make some global header for the "extern"
>>>>> declaration.
>>>> I will make these RFC patch series according to our discussion. Thank you
>>>> very much for the valuable advice.
>>> Sorry for seeing your email late. We have already worked out some patches to
>>> restructure pstore. Would you like to try patchset
>>> http://article.gmane.org/gmane.linux.kernel/1697680/?
>>>
>>> We have more patches available to add some flags to disable/enable specific
>>> zones.
>> That's great! I have tried your patches. BTW, your patches did not work on
>> the ARM platform until I changed the linker scripts;
> 
> Initially, we just implemented it on x86. It's easy to extend it to ARM:
> mostly, change the arm vmlinux.lds.S to add the sections. Please also change
> setup_arch to allocate memory blocks for pstore.
> In the patchset, there is an example patch, including reserved memory and
> zone examples. Please refer to it.
> 
>>   And can we use this method in modules (I failed to do that)?
> 
> It's a good question. There are many approaches to support modules.
> 1) Define the zone in built-in files and export it. Then you can use it in
> a module.
> 2) Define the zone and new tracer functions in built-in files and export
> the tracer functions.
> 
>>
>> After a quick glance and try, I think my idea is a little different from
>> yours. I will reply to you later.
> 
> Please share your opinions. We are improving pstore to make it easier to
> use.
> 

This feature can be used in real products (we have actually done that), because
a few megabytes of RAM is usually enough and it is very useful for fault
location. So I want pstore to be usable in products, not just in labs.

It seems that there are at least two ways to make pstore visible to other
kernel users:


(1) static allocation (your way; maybe my description is not good, please
correct me):

When the kernel image is built, the zones are fixed. So if modules want to use
a zone, we must define it in the kernel source before compiling and export it.

  (a) advantage:

Once this method has worked the first time, it will rarely fail afterwards.

  (b) disadvantage:

An engineer must change the kernel source code to get a zone to record
something. For a lab this is good enough; but different products that want
to record different messages may each need different kernel source code,
which is very expensive.

(2) dynamic allocation (ring buffers, similar to your zones):

(1) We should introduce metadata to describe the ring buffers in the
ramoops backend. Then, when initializing, we just need to read the metadata
to know the information of all ring buffers.
So we can read and manage all 

Re: Should Pstore(ramoops) records customized information?

2014-06-25 Thread Liu hua
On 2014/6/25 8:41, Zhang, Yanmin wrote:
> 
> On 2014/6/20 18:47, Liu hua wrote:
>> On 2014/6/20 7:42, Luck, Tony wrote:
>>
>>>> BTW, I note that "extern struct pstore_info *psinfo" is declared in
>>>> fs/pstore/internal.h, so users outside the "fs/pstore/" directory cannot
>>>> use pstore to record messages. We do not want other kernel users to use
>>>> pstore, right? And can we break this?
>>> Yes we can make some interface visible to the rest of the kernel ...
>>> probably not the raw "*psinfo" though. Perhaps the pstore_alloc_ring_buffer()
>>> and pstore_write_ring_buffer() functions should be the ones exported to the
>>> rest of the kernel.
>>>
>>>> Ditto. Since other backends like efi and erst may not be able to provide
>>>> such a ring buffer, pstore_alloc_ring_buffer should be a function pointer
>>>> in the pstore_info struct.
>>> Yes - that allows less capable backends like ERST and efivars to not
>>> provide the service.  Since it becomes internal, you can drop the
>>> "pstore_" prefix.  E.g. something like:
>>>
>>> int pstore_alloc_ring_buffer(char *name, int size)
>>> {
>>> return psinfo->alloc_ring_buffer(name, size);
>>> }
>>> EXPORT_SYMBOL_GPL(pstore_alloc_ring_buffer);
>>>
>>> ... and you have to find/make some global header for the "extern"
>>> declaration.
>> I will make these RFC patch series according to our discussion. Thank you
>> very much for the valuable advice.
> 
> Sorry for seeing your email late. We have already worked out some patches to
> restructure pstore. Would you like to try patchset
> http://article.gmane.org/gmane.linux.kernel/1697680/?
> 
> We have more patches available to add some flags to disable/enable specific
> zones.

That's great! I have tried your patches. BTW, your patches did not work on
the ARM platform until I changed the linker scripts. And can we use this
method in modules (I failed to do that)?

After a quick glance and try, I think my idea is a little different from
yours. I will reply to you later.


> Yanmin
> 
> 
> .
> 




Re: Should Pstore(ramoops) records customized information?

2014-06-20 Thread Liu hua
On 2014/6/20 7:42, Luck, Tony wrote:

>> BTW, I note that "extern struct pstore_info *psinfo" is declared in
>> fs/pstore/internal.h, so users outside the "fs/pstore/" directory cannot
>> use pstore to record messages. We do not want other kernel users to use
>> pstore, right? And can we break this?
> 
> Yes we can make some interface visible to the rest of the kernel ... probably
> not the raw "*psinfo" though. Perhaps the pstore_alloc_ring_buffer() and
> pstore_write_ring_buffer() functions should be the ones exported to the
> rest of the kernel.
> 
>> Ditto. Since other backends like efi and erst may not be able to provide
>> such a ring buffer, pstore_alloc_ring_buffer should be a function pointer
>> in the pstore_info struct.
> 
> Yes - that allows less capable backends like ERST and efivars to not provide
> the service.  Since it becomes internal, you can drop the "pstore_" prefix.
> E.g. something like:
> 
> int pstore_alloc_ring_buffer(char *name, int size)
> {
>   return psinfo->alloc_ring_buffer(name, size);
> }
> EXPORT_SYMBOL_GPL(pstore_alloc_ring_buffer);
> 
> ... and you have to find/make some global header for the "extern" declaration.

I will make these RFC patch series according to our discussion. Thank you very
much for the valuable advice.

Thanks,
Liu Hua




> 
> -Tony
> 
> 




Re: Should Pstore(ramoops) records customized information?

2014-06-19 Thread Liu hua
On 2014/6/19 1:50, Luck, Tony wrote:

Hi Tony,

Thanks for your reply.

>> (1) The backend (ramoops) provides several memory regions statically. Each
>>  region is a ring buffer that is not tied to a particular PSTORE_TYPE_ID,
>>  so no one can modify or use it before allocation.
>>
>> (2) A pstore user allocates a memory region, and pstore returns a
>> pstore_type_id.
>>
>>     pstore_type_id = alloc_pstore_region()
>>
>> (3) This user records certain messages to this region.
>>
>>     psinfo->write(pstore_type_id, ...)
> 
> Don't you need to match up the number of back-end ring buffer regions
> with the number of users in the kernel that call alloc_pstore_region()?
> 
> Or do you envision that the backend can create these regions on demand?

I have done some experiments on ramoops, and this may be feasible.

This idea comes from the demands of real products. Because RAM is rather cheap
and we do not need to add new hardware, product engineers want to record
several kinds of messages into reserved RAM (including kernel snapshots,
soft lockups, ftrace, panics, even user-space events and so on). Different
products usually care about different messages.

So we implemented a mechanism named "KBOX" to provide ring-buffer allocation
on reserved memory. Kernel users can allocate and use a ring buffer. I think
pstore (ramoops) may also need this feature.

BTW, I note that "extern struct pstore_info *psinfo" is declared in
fs/pstore/internal.h, so users outside the "fs/pstore/" directory cannot use
pstore to record messages. We do not want other kernel users to use pstore,
right? And can we break this?

> 
> Would different users need different sized regions? I think logging of
> console messages might be able to work with a smaller ring buffer
> than the ftrace logger. So perhaps we need a "size" argument when allocating?

Yes, I will add this to my RFC patches.
> 
> Since these "regions" are in fact "ring buffers", the name of the allocation
> routine should make that clear.  So call it "pstore_alloc_ring_buffer()"
Yes, ditto..

> After the system hangs/crashes ... how would you like pstore to
> name these objects in /sys/fs/pstore/ for applications to pick them
> up for analysis?  Maybe pstore_alloc_ring_buffer() needs a "char *name"
> argument as well as a size?
> 
Ditto. Since other backends like efi and erst may not be able to provide such
a ring buffer, pstore_alloc_ring_buffer should be a function pointer in the
pstore_info struct.
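
A minimal user-space sketch of that idea follows: the backend description
carries an optional allocation hook, and a generic wrapper fails cleanly when
a backend does not provide one. The struct mock_pstore_info, the hook
signature and the choice of -ENOSYS are assumptions made for illustration;
none of them is part of the existing pstore code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pstore_info; only the new hook is shown. */
struct mock_pstore_info {
	const char *name;
	/* Optional: backends such as ERST or efivars would leave this NULL. */
	int (*alloc_ring_buffer)(const char *name, size_t size);
};

/* Hypothetical ramoops-like hook: hands out increasing buffer ids. */
static int mock_ram_alloc(const char *name, size_t size)
{
	static int next_id;

	if (name == NULL || size == 0)
		return -EINVAL;
	return next_id++;
}

/* Generic wrapper: forwards to the backend, or fails if it has no hook. */
static int mock_pstore_alloc_ring_buffer(const struct mock_pstore_info *psi,
					 const char *name, size_t size)
{
	assert(psi != NULL);
	if (psi->alloc_ring_buffer == NULL)
		return -ENOSYS;
	return psi->alloc_ring_buffer(name, size);
}
```

With this shape, a caller only has to check the return value; it does not
need to know which backend is registered.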

Thanks very much again. If pstore can accept this feature, it will be great
news for us. We will gradually drop our "KBOX" and use pstore instead. If
necessary, I will try to send a patch series to do this. What do you think
about it?

Thanks,
Liu Hua



> -Tony
> 




Should Pstore(ramoops) records customized information?

2014-06-18 Thread Liu hua
Hi Kees or Colin or Tony or Anton,

We are very interested in pstore, which provides a mechanism to save information
when a machine is about to die. It is much lighter than kdump, so we can deploy
it on our real products. Because our product runs on several platforms (x86,
arm and mips), we prefer ramoops as the pstore backend. With the ramoops backend
we can now save dmesg, console and ftrace information to different memory
regions. It is very good, but can we do something more?

For kmsg_dumper or console, messages are mixed together, so some important
messages may be flooded by dispensable ones. Pstore does not provide a way to
let us determine which messages to record and which to discard.

So can we introduce a customized information recording mechanism?

Something like this:

(1) The backend (ramoops) provides several memory regions statically. Each
 region is a ring buffer that is not tied to a particular PSTORE_TYPE_ID, so
 no one can modify or use it before allocation.

(2) A pstore user allocates a memory region, and pstore returns a pstore_type_id.

    pstore_type_id = alloc_pstore_region()

(3) This user records certain messages to this region.

    psinfo->write(pstore_type_id, ...)


By doing this:

(1) Console and ftrace message recording is still supported; we just need to
call alloc_pstore_region() before saving such messages.

(2) We can build a mechanism like the black box in an aircraft. If we record
each kind of message to its own region, we do not need to worry about other
message types overwriting it, and we can always get the latest messages of
each type.

(3) Anyone in the kernel or in modules can use this mechanism, once they
allocate a region.
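
The three steps proposed above can be sketched in user space as follows. This
is only a mock under stated assumptions: the region count, the region size and
every mock_* name are invented for illustration, and a real backend would
place the regions in reserved RAM rather than in a static array.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NR_REGIONS   4    /* assumed number of statically provided regions */
#define MOCK_REGION_SIZE  16   /* assumed size of each region in bytes */

struct mock_region {
	int in_use;                       /* set once a user has allocated it */
	size_t head;                      /* next write position */
	size_t count;                     /* valid bytes, capped at region size */
	char data[MOCK_REGION_SIZE];
};

static struct mock_region mock_regions[MOCK_NR_REGIONS];

/* Step (2): hand out a free region and return its id, or -1 if none is left. */
static int mock_alloc_pstore_region(void)
{
	for (int id = 0; id < MOCK_NR_REGIONS; id++) {
		if (!mock_regions[id].in_use) {
			mock_regions[id].in_use = 1;
			mock_regions[id].head = 0;
			mock_regions[id].count = 0;
			return id;
		}
	}
	return -1;
}

/* Step (3): append bytes, overwriting the oldest data when the region is full. */
static int mock_region_write(int id, const char *buf, size_t len)
{
	struct mock_region *r;

	if (id < 0 || id >= MOCK_NR_REGIONS || !mock_regions[id].in_use)
		return -1;
	r = &mock_regions[id];
	for (size_t i = 0; i < len; i++) {
		r->data[r->head] = buf[i];
		r->head = (r->head + 1) % MOCK_REGION_SIZE;
		if (r->count < MOCK_REGION_SIZE)
			r->count++;
	}
	return 0;
}

/* Copy the region's contents, oldest byte first, into out; returns the length. */
static size_t mock_region_read(int id, char *out, size_t out_len)
{
	const struct mock_region *r;
	size_t start, n;

	assert(id >= 0 && id < MOCK_NR_REGIONS);
	r = &mock_regions[id];
	n = r->count < out_len ? r->count : out_len;
	start = (r->head + MOCK_REGION_SIZE - r->count) % MOCK_REGION_SIZE;
	for (size_t i = 0; i < n; i++)
		out[i] = r->data[(start + i) % MOCK_REGION_SIZE];
	return n;
}
```

Because each message type writes only to its own region, a burst of one type
can overwrite only its own older data, which is the "black box" property
described in point (2).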


Thanks,
Liu Hua


.






Re: [PATCH v3 2/2] ARM : change fixmap mapping region to support 32 CPUs

2014-06-09 Thread Liu hua
On 2014/5/31 3:25, Nicolas Pitre wrote:
> On Fri, 30 May 2014, Rob Herring wrote:
> 
>> There's work in flight to support early_ioremap, early console, and RO
>> text patching which all use the fixmap region.
>>
>> There's a couple of options to solve this:
>>
>> - Only support up to 16 cpus. It could be anywhere between 17-31, but
>> that seems somewhat unlikely. Are we really ever going to see 32-bit
>> 32 core systems?
> 
> I wouldn't rule that out.  I've seen 16-core ARM chips in 2008 (although 
> they didn't go into production).  Silly limitations like that always 
> come back to bite you.  And we have better alternatives.

Our team is now working on an ARM A15 platform with 16 CPUs.
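
To put rough numbers on the CPU-count limit being discussed, here is a small
sketch of the arithmetic. It assumes one fixmap page per kmap slot,
KM_TYPE_NR slots per CPU (16, as quoted in this thread) and a 4 KiB page
size; the SKETCH_* names and the helper are hypothetical.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT  12   /* assumption: 4 KiB pages */
#define SKETCH_KM_TYPE_NR  16   /* value quoted in the thread */

/* Bytes of fixmap virtual space needed for per-CPU kmap slots. */
static unsigned long kmap_fixmap_bytes(unsigned int nr_cpus)
{
	unsigned long slots = (unsigned long)SKETCH_KM_TYPE_NR * nr_cpus;

	assert(nr_cpus > 0);
	return slots << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions 16 CPUs need 1 MiB of fixmap space for kmap slots and
32 CPUs need 2 MiB, before counting any other fixmap users such as
early_ioremap or an early console.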
> 
>> - Reduce KM_TYPE_NR from 16 to 15. Based on the comment for it, we
>> probably don't want to do that. Is increasing it to the default of 20
>> worthwhile? Some of the options here would allow doing that.
>> - Add 0xffe0-0xfff0 to the fixmap region. This would make
>> fixmap span 2 PMDs with the top PMD having a mixture of uses like we
>> had before.
> 
> That would be my preferred approach.  Note here it could be 
> 0xffe0-0xfffe to include the whole of the previous fixmap area 
> currently unused.
> 
>> - push the PCI i/o space down to 0xfec0 and make fixmap 4MB. This
>> is a cleaner solution as the 2 PMDs are only used for fixmap. This may
>> require some static mapping adjustments on some platforms.
> 
> No need.  With the latest changes, the fixmap area is between 0xffc0 
> and 0xffe0 (there is apparently a mistake in 
> Documentation/arm/memory.txt).  So currently 0xff00-0xffc0 is 
> free, which makes the fixmap area far away from the PCI i/o area with 
> plenty of space in between.
> 
>> - Same as previous option, but convert the PCI i/o space to fixmap
>> entries. We don't really need all 2MB for PCI.
> 
> See above.
> 
>> Also, there is an error in the documentation below:
>>
>>>
>>> Signed-off-by: Liu Hua 
>>> ---
>>>  Documentation/arm/memory.txt   |  2 +-
> 
> Yep, good that you spotted it as well.  I failed to catch it during my 
> review so I'll send a patch.
> 

Very sorry for the mistake and for ignoring this mail. Maybe I should improve
my email client!

Thanks again to Nicolas.

Thanks,
Liu Hua
> 
> Nicolas
> 
> .
> 




Re: [RESEND PATCH] ARM: kdump: Add vmcore_elf64_check_arch

2014-06-09 Thread Liu hua
On 2014/5/6 20:15, Will Deacon wrote:
> On Sat, May 03, 2014 at 02:44:46PM +0100, Liu Hua wrote:
>> Hi Will or Russell,
> 
> Hello,
> 
>> With CONFIG_LPAE=y, memory in 32-bit ARM systems can exceed
>> 4G. So if we use kdump in such systems. The capture kernel
>> should parse 64-bit elf header(parse_crash_elf64_headers).
>>
>> And this process can not pass because ARM linux only provides
>> zero vmcore_elf64_check_arch function by commit 4b3bf7aef.
>>
>> This patch adds check functions related to elf64 header.
> 
> [...]
> 
>> diff --git a/arch/arm/kernel/elf.c b/arch/arm/kernel/elf.c
>> index d0d1e83..452086a 100644
>> --- a/arch/arm/kernel/elf.c
>> +++ b/arch/arm/kernel/elf.c
>> @@ -38,6 +38,39 @@ int elf_check_arch(const struct elf32_hdr *x)
>>  }
>>  EXPORT_SYMBOL(elf_check_arch);
>>  
>> +int elf_check_arch_64(const struct elf64_hdr *x)
>> +{
>> +	unsigned int eflags;
>> +
>> +	/* Make sure it's an ARM executable */
>> +	if (x->e_machine != EM_ARM)
>> +		return 0;
>> +
>> +	/* Make sure the entry address is reasonable */
>> +	if (x->e_entry & 1) {
>> +		if (!(elf_hwcap & HWCAP_THUMB))
>> +			return 0;
>> +	} else if (x->e_entry & 3)
>> +		return 0;
>> +
>> +	eflags = x->e_flags;
>> +	if ((eflags & EF_ARM_EABI_MASK) == EF_ARM_EABI_UNKNOWN) {
>> +		unsigned int flt_fmt;
>> +
>> +		/* APCS26 is only allowed if the CPU supports it */
>> +		if ((eflags & EF_ARM_APCS_26) && !(elf_hwcap & HWCAP_26BIT))
>> +			return 0;
>> +
>> +		flt_fmt = eflags & (EF_ARM_VFP_FLOAT | EF_ARM_SOFT_FLOAT);
>> +
>> +		/* VFP requires the supporting code */
>> +		if (flt_fmt == EF_ARM_VFP_FLOAT && !(elf_hwcap & HWCAP_VFP))
>> +			return 0;
>> +	}
>> +	return 1;
>> +}
>> +EXPORT_SYMBOL(elf_check_arch_64);
> 
Hi Will,

Sorry for replying so late. These days I am working on the kdump feature for
LPAE-enabled kernels, and now I think this patch is not good.

> This function looks identical to elf_check_arch. Why do we need to duplicate
> that code? You could use some pre-processor magic to make the core part of
> the functions agnostic to header type.

At the beginning I thought I should add elf_check_arch_64, mirroring
elf_check_arch, to do the complicated checks.

But on ARM32, elf_check_arch_64 would not be used except for kdump. No program
with an elf64 header can be loaded by a 32-bit ARM kernel. So I think I can
just check "e_machine", as other platforms do. I think the other checks are
useless for kdump.

> In fact, if elf_check_arch could handle both header types then the generic
> definition of vmcore_elf64_check_arch in include/linux/crash_dump.h would
> work out of the box.

As I mentioned, I am afraid this would make the code hard to understand. (Sorry
for my earlier incorrect patch.)

How about this patch?

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index f4b46d3..9424542 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -97,7 +97,7 @@ struct elf32_hdr;
 extern int elf_check_arch(const struct elf32_hdr *);
 #define elf_check_arch elf_check_arch

-#define vmcore_elf64_check_arch(x) (0)
+#define vmcore_elf64_check_arch(x) ((x)->e_machine == EM_ARM)

 extern int arm_elf_read_implies_exec(const struct elf32_hdr *, int);
 #define elf_read_implies_exec(ex,stk) arm_elf_read_implies_exec(&(ex), stk)


Thanks,
Liu Hua



> Will
> 
> .
> 




Re: [RESEND PATCH] kexec : add sparse memory related values to vmcore

2014-06-02 Thread Liu hua
On 2014/5/29 8:13, Simon Horman wrote:
> On Wed, May 28, 2014 at 09:49:56PM +0800, Liu Hua wrote:
>> This patch deals with the sparse memory model.
>>
>> For ARM32 platforms, different vendors may define different
>> SECTION_SIZE_BITS, which we did not write to vmcore.
>>
>> For example:
>>
>>   1 arch/arm/mach-clps711x/include/mach/memory.h
>> #define SECTION_SIZE_BITS 24
>>   2 arch/arm/mach-exynos/include/mach/memory.h
>> #define SECTION_SIZE_BITS 28
>>   3 arch/arm/mach-sa1100/include/mach/memory.h
>> #define SECTION_SIZE_BITS 27
> 
> I wonder if this problem will eventually go away, or at least only
> apply to older platforms, as ARM moves towards multiplatform: a single
> kernel for more than one platform.


>> This is really bad news for user space tools such as
>> makedumpfile and crash, which have to define them as
>> macros. So even for the same architecture, we may need to
>> recompile them to parse vmcores with different
>> SECTION_SIZE_BITS.
>>
>> And if we enable LPAE, MAX_PHYSMEM_BITS can also
>> be variable.
>>
>> This patch adds SECTION_SIZE_BITS and MAX_PHYSMEM_BITS
>> to vmcore, which makes user space tools more compatible.
>>
>> BTW, makedumpfile has queued the related patch.
>>
>> Signed-off-by: Liu Hua 
>> ---
>>  kernel/kexec.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/kexec.c b/kernel/kexec.c
>> index bf0b929e..8b1a193 100644
>> --- a/kernel/kexec.c
>> +++ b/kernel/kexec.c
>> @@ -1577,6 +1577,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>>  VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>>  VMCOREINFO_STRUCT_SIZE(mem_section);
>>  VMCOREINFO_OFFSET(mem_section, section_mem_map);
>> +VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
>> +VMCOREINFO_NUMBER(SECTION_SIZE_BITS);
>>  #endif
>>  VMCOREINFO_STRUCT_SIZE(page);
>>  VMCOREINFO_STRUCT_SIZE(pglist_data);
>> -- 
>> 1.9.0
>>
>>
>>
> 
> 
> .
> 




Re: [RESEND PATCH] kexec : add sparse memory related values to vmcore

2014-05-30 Thread Liu hua
On 2014/5/29 8:13, Simon Horman wrote:
> On Wed, May 28, 2014 at 09:49:56PM +0800, Liu Hua wrote:
>> This patch deals with the sparse memory model.
>>
>> For ARM32 platforms, different vendors may define different
>> SECTION_SIZE_BITS, which we did not write to vmcore.
>>
>> For example:
>>
>>   1 arch/arm/mach-clps711x/include/mach/memory.h
>> #define SECTION_SIZE_BITS 24
>>   2 arch/arm/mach-exynos/include/mach/memory.h
>> #define SECTION_SIZE_BITS 28
>>   3 arch/arm/mach-sa1100/include/mach/memory.h
>> #define SECTION_SIZE_BITS 27
> 
> I wonder if this problem will eventually go away, or at least only
> apply to older platforms, as ARM moves towards multiplatform: a single
> kernel for more than one platform.

For ARM32 platforms, that may take a long time. And glancing over the kernel
commit log, we can see that this macro has changed several times. The user
space tools must handle all of these changes to stay compatible.
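
To illustrate why a tool built with the wrong SECTION_SIZE_BITS misreads a
vmcore, here is a sketch of the section arithmetic as I understand the
SPARSEMEM model: a page frame number is shifted by
(SECTION_SIZE_BITS - PAGE_SHIFT) to find its section, and the number of
sections is 2^(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS). The 4 KiB page size and
the sketch_* names are assumptions.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Section index of a page frame number, as in the SPARSEMEM model. */
static unsigned long sketch_pfn_to_section_nr(unsigned long pfn,
					      unsigned int section_size_bits)
{
	assert(section_size_bits >= SKETCH_PAGE_SHIFT);
	return pfn >> (section_size_bits - SKETCH_PAGE_SHIFT);
}

/* Number of sections needed to cover the physical address space. */
static unsigned long sketch_nr_mem_sections(unsigned int max_physmem_bits,
					    unsigned int section_size_bits)
{
	assert(max_physmem_bits >= section_size_bits);
	return 1UL << (max_physmem_bits - section_size_bits);
}
```

For the page frame at physical address 0x48000000 (pfn 0x48000), the section
index is 4 with SECTION_SIZE_BITS = 28, 9 with 27 and 72 with 24, so a tool
compiled for one value looks up the wrong mem_section entry on a kernel built
with another.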

> 
>> This is really bad news for user space tools such as
>> makedumpfile and crash, which have to define them as
>> macros. So even for the same architecture, we may need to
>> recompile them to parse vmcores with different
>> SECTION_SIZE_BITS.
>>
>> And if we enable LPAE, MAX_PHYSMEM_BITS can also
>> be variable.
>>
>> This patch adds SECTION_SIZE_BITS and MAX_PHYSMEM_BITS
>> to vmcore, which makes user space tools more compatible.
>>
>> BTW, makedumpfile has queued the related patch.
>>
>> Signed-off-by: Liu Hua 
>> ---
>>  kernel/kexec.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/kexec.c b/kernel/kexec.c
>> index bf0b929e..8b1a193 100644
>> --- a/kernel/kexec.c
>> +++ b/kernel/kexec.c
>> @@ -1577,6 +1577,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>>  VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>>  VMCOREINFO_STRUCT_SIZE(mem_section);
>>  VMCOREINFO_OFFSET(mem_section, section_mem_map);
>> +VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
>> +VMCOREINFO_NUMBER(SECTION_SIZE_BITS);
>>  #endif
>>  VMCOREINFO_STRUCT_SIZE(page);
>>  VMCOREINFO_STRUCT_SIZE(pglist_data);
>> -- 
>> 1.9.0
>>
>>
>>
> 
> 
> .
> 




[RESEND PATCH] kexec : add sparse memory related values to vmcore

2014-05-28 Thread Liu Hua
This patch deals with the sparse memory model.

For ARM32 platforms, different vendors may define different
SECTION_SIZE_BITS, which we did not write to vmcore.

For example:

  1 arch/arm/mach-clps711x/include/mach/memory.h
#define SECTION_SIZE_BITS 24
  2 arch/arm/mach-exynos/include/mach/memory.h
#define SECTION_SIZE_BITS 28
  3 arch/arm/mach-sa1100/include/mach/memory.h
#define SECTION_SIZE_BITS 27

This is really bad news for user space tools such as
makedumpfile and crash, which have to define them as
macros. So even for the same architecture, we may need to
recompile them to parse vmcores with different
SECTION_SIZE_BITS.

And if we enable LPAE, MAX_PHYSMEM_BITS can also
be variable.

This patch adds SECTION_SIZE_BITS and MAX_PHYSMEM_BITS
to vmcore, which makes user space tools more compatible.

BTW, makedumpfile has queued the related patch.

Signed-off-by: Liu Hua 
---
 kernel/kexec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index bf0b929e..8b1a193 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1577,6 +1577,8 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
+	VMCOREINFO_NUMBER(SECTION_SIZE_BITS);
 #endif
 	VMCOREINFO_STRUCT_SIZE(page);
 	VMCOREINFO_STRUCT_SIZE(pglist_data);
-- 
1.9.0
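
As a rough illustration of what the two added lines export, the sketch below
formats one vmcoreinfo entry in user space. It assumes that numeric entries
appear as text lines of the form "NUMBER(<name>)=<value>"; the helper name is
hypothetical and this is not the kernel implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Append one "NUMBER(name)=value" line to buf; returns bytes written or -1. */
static int sketch_append_number(char *buf, size_t len,
				const char *name, long value)
{
	int n;

	assert(buf != NULL && name != NULL);
	n = snprintf(buf, len, "NUMBER(%s)=%ld\n", name, value);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Calling it with "SECTION_SIZE_BITS" and 28 produces the line
"NUMBER(SECTION_SIZE_BITS)=28", which a tool such as makedumpfile can read at
run time instead of relying on a compiled-in macro.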



Re: [RFC PATCH 2/2] makedumpfile: get additional information from vmcore

2014-05-14 Thread Liu hua
On 2014/5/14 15:53, Atsushi Kumagai wrote:
>> On 2014/5/13 14:21, Atsushi Kumagai wrote:
>>> Hello Liu,
>>>
>>>> Now we define MAX_PHYSMEM_BITS and SECTION_SIZE_BITS as
>>>> macros. So if we deal with vmcores with different values
>>>> of these two macros. We have to recompile makedumpfile.
>>>
>>> There are other macros which have architecture-specific values
>>> (e.g. __PAGE_OFFSET), and some functions are specific to each
>>> architecture (e.g. vaddr_to_paddr()), so we need recompilation
>>> eventually.
>>>
>>> OTOH, we already don't need recompilation for the same architecture
>>> since the values of such macros are defined for each kernel version
>>> like below:
>>>
>>> #ifdef __x86_64__
>>> ...
>>> #define _MAX_PHYSMEM_BITS_ORIG    (40)
>>> #define _MAX_PHYSMEM_BITS_2_6_26  (44)
>>> #define _MAX_PHYSMEM_BITS_2_6_31  (46)
>>>
>>> So I don't think this patch is valuable.
>>
>> Hi Atsushi,
>>
>> For x86, it is not necessary. But for arm, different vendors
>> may define different SECTION_SIZE_BITS. For example:
>>
>>   1 arch/arm/mach-clps711x/include/mach/memory.h
>> #define SECTION_SIZE_BITS 24
>>   2 arch/arm/mach-exynos/include/mach/memory.h
>> #define SECTION_SIZE_BITS 28
>>   4 arch/arm/mach-hisi/include/mach/memory.h
>> #define SECTION_SIZE_BITS 26
>>   8 arch/arm/mach-sa1100/include/mach/memory.h
>> #define SECTION_SIZE_BITS 27
>>
>> Perhaps we should find another way to let the userspace tools
>> to get the architecture-specific values.
> 
> I see, I think this description is better than the first one.
> 
> Now, makedumpfile can't get appropriate values for the two macros since the
> values are variable even if the architecture and the kernel version are fixed
> (at least for arm), and we can't solve this without *manual code fixing*,
> right?
> 
> In practice, the current code expects that all arm machines adopt Exynos
> processors; this is definitely a problem.
> 
>   #ifdef __arm__
>   #define KVBASE_MASK (0x)
>   #define KVBASE  (SYMBOL(_stext) & ~KVBASE_MASK)
>   #define _SECTION_SIZE_BITS  (28)
>   #define _MAX_PHYSMEM_BITS   (32)
> 
> I think it's better to fix the descriptions to get acceptability,
> but this patch is necessary from the view point of makedumpfile.
> So I recommend you to repost this patch set, then I'll accept it.
> 
OK, thanks for your suggestion. I will repost this patch. So far no one has
replied to my kernel patch related to this issue, named "[RFC PATCH 1/2]
kdump: add sparse memory related values to vmcore". Did I fail to cc
the right people, or is it something else?

BTW, for the patch "[PATCH] makedumpfile: ARM: get correct mem_map offset",
did I explain my idea clearly? If not, I would like to repost it with
more details.

Thanks,
Liu Hua

> 
> Thanks
> Atsushi Kumagai
> 
>>>
>>>> This patch makes makedumpfile get these two values from
>>>> vmcore info, if existing. It makes the makedumpfile more
>>>> compatible to vmcores with different section size.
>>>>
>>>> Signed-off-by: Liu Hua 
>>>> ---
>>>> makedumpfile.c | 17 +
>>>> makedumpfile.h |  2 ++
>>>> 2 files changed, 19 insertions(+)
>>>>
>>>> diff --git a/makedumpfile.c b/makedumpfile.c
>>>> index 6cf6e24..3cdf323 100644
>>>> --- a/makedumpfile.c
>>>> +++ b/makedumpfile.c
>>>> @@ -2111,6 +2111,8 @@ read_vmcoreinfo(void)
>>>>READ_NUMBER("PG_slab", PG_slab);
>>>>READ_NUMBER("PG_buddy", PG_buddy);
>>>>READ_NUMBER("PG_hwpoison", PG_hwpoison);
>>>> +  READ_NUMBER("SECTION_SIZE_BITS", SECTION_SIZE_BITS);
>>>> +  READ_NUMBER("MAX_PHYSMEM_BITS", MAX_PHYSMEM_BITS);
>>>>
>>>>READ_SRCFILE("pud_t", pud_t);
>>>>
>>>> @@ -2998,6 +3000,18 @@ initialize_bitmap_memory(void)
>>>> }
>>>>
>>>> int
>>>> +calibrate_machdep_info(void)
>>>> +{
>>>> +  if (NUMBER(MAX_PHYSMEM_BITS) > 0)
>>>> +  info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
>>>> +
>>>> +  if (NUMBER(SECTION_SIZE_BITS) > 0)
>>>> +  info->section_size_bits = NUMBER(SECTION_SIZE_BITS);
>>>> +
>>>> +  return TRUE;
>>>> +

Re: [RFC PATCH 2/2] makedumpfile: get additional information from vmcore

2014-05-13 Thread Liu hua
On 2014/5/13 14:21, Atsushi Kumagai wrote:
> Hello Liu,
> 
>> Now we define MAX_PHYSMEM_BITS and SECTION_SIZE_BITS as
>> macros. So if we deal with vmcores with different values
>> of these two macros. We have to recompile makedumpfile.
> 
> There are other macros which have architecture-specific values
> (e.g. __PAGE_OFFSET), and some functions are specific to each
> architecture (e.g. vaddr_to_paddr()), so we need recompilation
> eventually.
> 
> OTOH, we already don't need recompilation for the same architecture
> since the values of such macros are defined for each kernel version
> like below:
> 
> #ifdef __x86_64__
> ...
> #define _MAX_PHYSMEM_BITS_ORIG    (40)
> #define _MAX_PHYSMEM_BITS_2_6_26  (44)
> #define _MAX_PHYSMEM_BITS_2_6_31  (46)
> 
> So I don't think this patch is valuable.

Hi Atsushi,

For x86, it is not necessary. But for arm, different vendors
may define different SECTION_SIZE_BITS. For example:

   1 arch/arm/mach-clps711x/include/mach/memory.h
 #define SECTION_SIZE_BITS 24
   2 arch/arm/mach-exynos/include/mach/memory.h
 #define SECTION_SIZE_BITS 28
   4 arch/arm/mach-hisi/include/mach/memory.h
 #define SECTION_SIZE_BITS 26
   8 arch/arm/mach-sa1100/include/mach/memory.h
 #define SECTION_SIZE_BITS 27

Perhaps we should find another way to let the userspace tools
get the architecture-specific values.
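
One possible shape for such a lookup on the tool side is sketched below: read
the value from the vmcoreinfo text when the kernel exported it, and fall back
to the compiled-in default otherwise. It assumes entries of the form
"NUMBER(<name>)=<value>", one per line; sketch_read_number is a hypothetical
helper, not makedumpfile's actual READ_NUMBER machinery.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up "NUMBER(<name>)=<value>" in a vmcoreinfo-style text blob.
 * Returns the parsed value, or fallback when the key is absent.
 */
static long sketch_read_number(const char *vmcoreinfo, const char *name,
			       long fallback)
{
	char key[64];
	const char *line = vmcoreinfo;
	size_t key_len;
	long value;
	int n;

	assert(vmcoreinfo != NULL && name != NULL);
	n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);
	if (n < 0 || (size_t)n >= sizeof(key))
		return fallback;
	key_len = (size_t)n;

	while (line != NULL && *line != '\0') {
		if (strncmp(line, key, key_len) == 0 &&
		    sscanf(line + key_len, "%ld", &value) == 1)
			return value;
		line = strchr(line, '\n');
		if (line != NULL)
			line++;	/* step past the newline to the next entry */
	}
	return fallback;
}
```

A vmcore from an older kernel simply lacks the entry, so the tool keeps its
built-in default and stays backward compatible.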

Liu Hua

> 
> 
> Thanks
> Atsushi Kumagai
> 
>> This patch makes makedumpfile get these two values from
>> vmcore info, if existing. It makes the makedumpfile more
>> compatible to vmcores with different section size.
>>
>> Signed-off-by: Liu Hua 
>> ---
>> makedumpfile.c | 17 +
>> makedumpfile.h |  2 ++
>> 2 files changed, 19 insertions(+)
>>
>> diff --git a/makedumpfile.c b/makedumpfile.c
>> index 6cf6e24..3cdf323 100644
>> --- a/makedumpfile.c
>> +++ b/makedumpfile.c
>> @@ -2111,6 +2111,8 @@ read_vmcoreinfo(void)
>>  READ_NUMBER("PG_slab", PG_slab);
>>  READ_NUMBER("PG_buddy", PG_buddy);
>>  READ_NUMBER("PG_hwpoison", PG_hwpoison);
>> +READ_NUMBER("SECTION_SIZE_BITS", SECTION_SIZE_BITS);
>> +READ_NUMBER("MAX_PHYSMEM_BITS", MAX_PHYSMEM_BITS);
>>
>>  READ_SRCFILE("pud_t", pud_t);
>>
>> @@ -2998,6 +3000,18 @@ initialize_bitmap_memory(void)
>> }
>>
>> int
>> +calibrate_machdep_info(void)
>> +{
>> +if (NUMBER(MAX_PHYSMEM_BITS) > 0)
>> +info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
>> +
>> +if (NUMBER(SECTION_SIZE_BITS) > 0)
>> +info->section_size_bits = NUMBER(SECTION_SIZE_BITS);
>> +
>> +return TRUE;
>> +}
>> +
>> +int
>> initial(void)
>> {
>>  off_t offset;
>> @@ -3214,6 +3228,9 @@ out:
>>  if (debug_info && !get_machdep_info())
>>  return FALSE;
>>
>> +if (debug_info && !calibrate_machdep_info())
>> +return FALSE;
>> +
>>  if (is_xen_memory() && !get_dom0_mapnr())
>>  return FALSE;
>>
>> diff --git a/makedumpfile.h b/makedumpfile.h
>> index eb03688..7acb23a 100644
>> --- a/makedumpfile.h
>> +++ b/makedumpfile.h
>> @@ -1434,6 +1434,8 @@ struct number_table {
>>  longPG_hwpoison;
>>
>>  longPAGE_BUDDY_MAPCOUNT_VALUE;
>> +longSECTION_SIZE_BITS;
>> +longMAX_PHYSMEM_BITS;
>> };
>>
>> struct srcfile_table {
>> --
>> 1.9.0
> 
> .
> 




[RFC PATCH 1/2] kdump: add sparse memory related values to vmcore

2014-05-07 Thread Liu Hua
Different platforms may have different sparse-memory
related values, such as MAX_PHYSMEM_BITS and SECTION_SIZE_BITS.

User tools such as makedumpfile cannot get these values
from the vmcore, so they define them as macros. If we use
makedumpfile to process vmcores with a different SECTION
size, we must recompile it, which is a waste of time.

So this patch adds the related values to the vmcore, so that
user-space tools can deal with this situation.

Signed-off-by: Liu Hua 
---
 kernel/kexec.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index bf0b929e..96f7c5b 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1577,6 +1577,9 @@ static int __init crash_save_vmcoreinfo_init(void)
VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
VMCOREINFO_STRUCT_SIZE(mem_section);
VMCOREINFO_OFFSET(mem_section, section_mem_map);
+   VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
+   VMCOREINFO_NUMBER(SECTION_SIZE_BITS);
+
 #endif
VMCOREINFO_STRUCT_SIZE(page);
VMCOREINFO_STRUCT_SIZE(pglist_data);
-- 
1.9.0



[RFC PATCH 0/2] transfer sparse memory related values

2014-05-07 Thread Liu Hua
Different platforms may have different values of SECTION_SIZE_BITS and
MAX_PHYSMEM_BITS, but Linux passes nothing about them to the
vmcore.

User-space tools such as makedumpfile need these values to process
the vmcore, so they must be defined at compile time. If we then
deal with another vmcore, we may need to recompile the tools.

This patch series makes the kernel pass the related information to
user-space tools via the vmcore, and makedumpfile reads it
when parsing the vmcore. makedumpfile thus becomes
more compatible with vmcores that have different section sizes.
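
For illustration only, here is a minimal user-space sketch of the idea,
not the makedumpfile code itself. It assumes the vmcoreinfo note is
available as a text blob with one "NUMBER(<name>)=<value>" line per
exported number, which is the format VMCOREINFO_NUMBER() emits. The
helper name vmcoreinfo_number() and the fallback argument are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "NUMBER(<name>)=<value>" in a vmcoreinfo text blob.
 * Returns the parsed value, or `fallback` when the key is absent
 * (for example, a vmcore from a kernel without this series).
 * Hypothetical helper, not part of makedumpfile.
 */
static long vmcoreinfo_number(const char *blob, const char *name, long fallback)
{
	char key[128];
	const char *line = blob;
	size_t keylen;

	assert(blob != NULL && name != NULL);

	snprintf(key, sizeof(key), "NUMBER(%s)=", name);
	keylen = strlen(key);

	while (line && *line) {
		/* Keys are only matched at the start of a line. */
		if (strncmp(line, key, keylen) == 0)
			return strtol(line + keylen, NULL, 10);

		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return fallback;
}
```

A caller would pass its compile-time macro as the fallback, e.g.
vmcoreinfo_number(blob, "SECTION_SIZE_BITS", SECTION_SIZE_BITS), so
that older vmcores keep working as before.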


Liu Hua (2):
  kdump: add sparse memory related values to vmcore
  makedumpfile: get additional information from vmcore

 kernel/kexec.c | 3 +++
 makedumpfile.c | 17 +
 makedumpfile.h |  2 ++
 3 files changed, 22 insertions(+)

-- 
1.9.0



[RFC PATCH 2/2] makedumpfile: get additional information from vmcore

2014-05-07 Thread Liu Hua
MAX_PHYSMEM_BITS and SECTION_SIZE_BITS are currently defined as
macros, so to handle vmcores with different values
of these two macros we have to recompile makedumpfile.

This patch makes makedumpfile read these two values from
vmcoreinfo when they are present. This makes makedumpfile more
compatible with vmcores that have different section sizes.

Signed-off-by: Liu Hua 
---
 makedumpfile.c | 17 +
 makedumpfile.h |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/makedumpfile.c b/makedumpfile.c
index 6cf6e24..3cdf323 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -2111,6 +2111,8 @@ read_vmcoreinfo(void)
READ_NUMBER("PG_slab", PG_slab);
READ_NUMBER("PG_buddy", PG_buddy);
READ_NUMBER("PG_hwpoison", PG_hwpoison);
+   READ_NUMBER("SECTION_SIZE_BITS", SECTION_SIZE_BITS);
+   READ_NUMBER("MAX_PHYSMEM_BITS", MAX_PHYSMEM_BITS);
 
READ_SRCFILE("pud_t", pud_t);
 
@@ -2998,6 +3000,18 @@ initialize_bitmap_memory(void)
 }
 
 int
+calibrate_machdep_info(void)
+{
+   if (NUMBER(MAX_PHYSMEM_BITS) > 0)
+   info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+
+   if (NUMBER(SECTION_SIZE_BITS) > 0)
+   info->section_size_bits = NUMBER(SECTION_SIZE_BITS);
+
+   return TRUE;
+}
+
+int
 initial(void)
 {
off_t offset;
@@ -3214,6 +3228,9 @@ out:
if (debug_info && !get_machdep_info())
return FALSE;
 
+   if (debug_info && !calibrate_machdep_info())
+   return FALSE;
+
if (is_xen_memory() && !get_dom0_mapnr())
return FALSE;
 
diff --git a/makedumpfile.h b/makedumpfile.h
index eb03688..7acb23a 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1434,6 +1434,8 @@ struct number_table {
longPG_hwpoison;
 
longPAGE_BUDDY_MAPCOUNT_VALUE;
+   longSECTION_SIZE_BITS;
+   longMAX_PHYSMEM_BITS;
 };
 
 struct srcfile_table {
-- 
1.9.0



Re: [PATCH v3 0/2] change ARM linux memory layout to support 32 CPUs

2014-05-03 Thread Liu hua
On 2014/4/23 7:50, Nicolas Pitre wrote:
> On Tue, 22 Apr 2014, Russell King - ARM Linux wrote:
> 
>> On Tue, Apr 15, 2014 at 07:06:05PM +0800, Liu Hua wrote:
>>> This patch series change fixmap mapping region to suppport 32 CPUs.
>>> Because the "top_pmd" covers 0xfffe - 0x(2M). And part 
>>> is used by vector table. So I move this region down to 0xffc0
>>>  - 0xffd. 
>>
>> Can you explain why you have submitted these patches to my patch tracker
>> with a copy to sta...@vger.kernel.org ?
>>
>> What makes these qualify for stable tree inclusion?  What regression are
>> they fixing?
>>
>> We don't put patches into the stable tree for things that /never/ worked
>> in the past.  We've never supported 32 CPUs so I don't think these
>> qualify.
> 
> Indeed.  The stable qualifier should be dropped.
> 
> 
> Nicolas
> 
> .
> 
I am very sorry for ignoring this mail for ten days! We have a platform with 16
CPUs, for which these patches are necessary (15 CPUs with the old memory layout).
But with these patches alone, ARM Linux still cannot run on it (the GIC part
also needs to be changed).

So you are right. It seems to be a new feature. Sorry for this mistake.
If there is anything I should do, please let me know!

Thanks,
Liu Hua





[RESEND PATCH] ARM: kdump: Add vmcore_elf64_check_arch

2014-05-03 Thread Liu Hua
Hi Will or Russell,

With CONFIG_ARM_LPAE=y, memory in 32-bit ARM systems can exceed
4G. So if we use kdump on such systems, the capture kernel
must parse a 64-bit ELF header (parse_crash_elf64_headers).

This currently fails because, since commit 4b3bf7aef, ARM Linux only
provides a vmcore_elf64_check_arch that always evaluates to zero.

This patch adds the check functions for the elf64 header.

Thanks,
Liu Hua

Signed-off-by: Liu Hua 
---
 arch/arm/include/asm/elf.h |  4 +++-
 arch/arm/kernel/elf.c  | 33 +
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index f4b46d3..8651699 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -90,14 +90,16 @@ typedef struct user_fp elf_fpregset_t;
 extern char elf_platform[];
 
 struct elf32_hdr;
+struct elf64_hdr;
 
 /*
  * This is used to ensure we don't load something for the wrong architecture.
  */
 extern int elf_check_arch(const struct elf32_hdr *);
+extern int elf_check_arch_64(const struct elf64_hdr *);
 #define elf_check_arch elf_check_arch
 
-#define vmcore_elf64_check_arch(x) (0)
+#define vmcore_elf64_check_arch(x) (elf_check_arch_64(x))
 
 extern int arm_elf_read_implies_exec(const struct elf32_hdr *, int);
 #define elf_read_implies_exec(ex,stk) arm_elf_read_implies_exec(&(ex), stk)
diff --git a/arch/arm/kernel/elf.c b/arch/arm/kernel/elf.c
index d0d1e83..452086a 100644
--- a/arch/arm/kernel/elf.c
+++ b/arch/arm/kernel/elf.c
@@ -38,6 +38,39 @@ int elf_check_arch(const struct elf32_hdr *x)
 }
 EXPORT_SYMBOL(elf_check_arch);
 
+int elf_check_arch_64(const struct elf64_hdr *x)
+{
+   unsigned int eflags;
+
+   /* Make sure it's an ARM executable */
+   if (x->e_machine != EM_ARM)
+   return 0;
+
+   /* Make sure the entry address is reasonable */
+   if (x->e_entry & 1) {
+   if (!(elf_hwcap & HWCAP_THUMB))
+   return 0;
+   } else if (x->e_entry & 3)
+   return 0;
+
+   eflags = x->e_flags;
+   if ((eflags & EF_ARM_EABI_MASK) == EF_ARM_EABI_UNKNOWN) {
+   unsigned int flt_fmt;
+
+   /* APCS26 is only allowed if the CPU supports it */
+   if ((eflags & EF_ARM_APCS_26) && !(elf_hwcap & HWCAP_26BIT))
+   return 0;
+
+   flt_fmt = eflags & (EF_ARM_VFP_FLOAT | EF_ARM_SOFT_FLOAT);
+
+   /* VFP requires the supporting code */
+   if (flt_fmt == EF_ARM_VFP_FLOAT && !(elf_hwcap & HWCAP_VFP))
+   return 0;
+   }
+   return 1;
+}
+EXPORT_SYMBOL(elf_check_arch_64);
+
 void elf_set_personality(const struct elf32_hdr *x)
 {
unsigned int eflags = x->e_flags;
-- 
1.9.0



[PATCH v3 0/2] change ARM linux memory layout to support 32 CPUs

2014-04-15 Thread Liu Hua
This patch series changes the fixmap mapping region to support 32 CPUs.
The "top_pmd" covers 0xfffe - 0x(2M), and part of it
is used by the vector table, so I move this region down to 0xffc0
 - 0xffd. 


I have tested the patches on arma9 (2 CPUs) and arma15 (16 CPUs) platforms.

Changes from v2:
---
- Removed two macros: FIX_KMAP_BEGIN and FIX_KMAP_END;
- Left the DMA mapping region documentation unchanged.

Changes from v1:
---
- changed documentation for ARM linux memory layout.
- moved fixmap mapping region, not just extended.


Liu Hua (2):
  ARM : fixmap : remove FIX_KMAP_BEGIN and FIX_KMAP_END
  ARM : change fixmap mapping region to support 32 CPUs

 Documentation/arm/memory.txt   |  2 +-
 arch/arm/include/asm/fixmap.h  | 21 -
 arch/arm/include/asm/highmem.h |  1 +
 arch/arm/mm/highmem.c  | 33 -
 arch/arm/mm/mmu.c  |  4 
 5 files changed, 34 insertions(+), 27 deletions(-)

-- 
1.9.0



[PATCH v3 2/2] ARM : change fixmap mapping region to support 32 CPUs

2014-04-15 Thread Liu Hua
On 32-bit ARM systems, the fixmap mapping region can support
no more than 14 CPUs (total: 896K; one CPU: 64K), yet NR_CPUS
can be configured up to 32. So there is a mismatch.

This patch moves the fixmap mapping region down to
0xffc0-0xffe0, so that it can
support up to 32 CPUs.
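
The arithmetic behind the 14 and 32 CPU limits can be checked with the
small sketch below. It is an illustration, not kernel code. The full
addresses are assumptions: the archived text shows them truncated, and
the values here were chosen to match the 896K and 2M sizes quoted in
this series. The 64K per CPU figure comes from the description above.

```c
#include <assert.h>

/* "one CPU: 64K", as stated in the patch description. */
#define KMAP_BYTES_PER_CPU	(64UL * 1024)

/* Assumed full addresses; the archived text shows them truncated. */
#define OLD_FIXADDR_START	0xfff00000UL
#define OLD_FIXADDR_TOP		0xfffe0000UL	/* 896K region */
#define NEW_FIXADDR_START	0xffc00000UL
#define NEW_FIXADDR_TOP		0xffe00000UL	/* 2M region */

/* Number of CPUs whose per-CPU kmap slots fit in [start, top). */
static unsigned long fixmap_max_cpus(unsigned long start, unsigned long top)
{
	assert(top > start);
	return (top - start) / KMAP_BYTES_PER_CPU;
}
```

Under these assumptions the old region yields 14 and the new one 32,
which matches the upper bound of NR_CPUS mentioned above.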

Signed-off-by: Liu Hua 
---
 Documentation/arm/memory.txt   |  2 +-
 arch/arm/include/asm/fixmap.h  | 16 ++--
 arch/arm/include/asm/highmem.h |  1 +
 arch/arm/mm/highmem.c  | 27 +--
 arch/arm/mm/mmu.c  |  4 
 5 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 4bfb9ff..a9fc59b 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -41,7 +41,7 @@ fffe8000  fffeDTCM mapping area for platforms 
with
 fffe   fffe7fffITCM mapping area for platforms with
ITCM mounted inside the CPU.
 
-fff0   fffdFixmap mapping region.  Addresses provided
+fffc   ffdfFixmap mapping region.  Addresses provided
by fix_to_virt() will be located here.
 
 ffc0   ffefDMA memory mapping region.  Memory returned
diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index be55ebc..74124b0 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -1,20 +1,8 @@
 #ifndef _ASM_FIXMAP_H
 #define _ASM_FIXMAP_H
 
-/*
- * Nothing too fancy for now.
- *
- * On ARM we already have well known fixed virtual addresses imposed by
- * the architecture such as the vector page which is located at 0x,
- * therefore a second level page table is already allocated covering
- * 0xfff0 upwards.
- *
- * The cache flushing code in proc-xscale.S uses the virtual area between
- * 0xfffe and 0xfffe.
- */
-
-#define FIXADDR_START  0xfff0UL
-#define FIXADDR_TOP0xfffeUL
+#define FIXADDR_START  0xffc0UL
+#define FIXADDR_TOP0xffe0UL
 #define FIXADDR_SIZE   (FIXADDR_TOP - FIXADDR_START)
 
 #define FIX_KMAP_NR_PTES   (FIXADDR_SIZE >> PAGE_SHIFT)
diff --git a/arch/arm/include/asm/highmem.h b/arch/arm/include/asm/highmem.h
index 91b99ab..5355795 100644
--- a/arch/arm/include/asm/highmem.h
+++ b/arch/arm/include/asm/highmem.h
@@ -18,6 +18,7 @@
} while (0)
 
 extern pte_t *pkmap_page_table;
+extern pte_t *fixmap_page_table;
 
 extern void *kmap_high(struct page *page);
 extern void kunmap_high(struct page *page);
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index e05e8ad..45aeaac 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -18,6 +18,21 @@
 #include 
 #include "mm.h"
 
+pte_t *fixmap_page_table;
+
+static inline void set_fixmap_pte(int idx, pte_t pte)
+{
+   unsigned long vaddr = __fix_to_virt(idx);
+   set_pte_ext(fixmap_page_table + idx, pte, 0);
+   local_flush_tlb_kernel_page(vaddr);
+}
+
+static inline pte_t get_fixmap_pte(unsigned long vaddr)
+{
+   unsigned long idx = __virt_to_fix(vaddr);
+   return *(fixmap_page_table + idx);
+}
+
 void *kmap(struct page *page)
 {
might_sleep();
@@ -69,14 +84,14 @@ void *kmap_atomic(struct page *page)
 * With debugging enabled, kunmap_atomic forces that entry to 0.
 * Make sure it was indeed properly unmapped.
 */
-   BUG_ON(!pte_none(get_top_pte(vaddr)));
+   BUG_ON(!pte_none(*(fixmap_page_table + idx)));
 #endif
/*
 * When debugging is off, kunmap_atomic leaves the previous mapping
 * in place, so the contained TLB flush ensures the TLB is updated
 * with the new mapping.
 */
-   set_top_pte(vaddr, mk_pte(page, kmap_prot));
+   set_fixmap_pte(idx, mk_pte(page, kmap_prot));
 
return (void *)vaddr;
 }
@@ -95,7 +110,7 @@ void __kunmap_atomic(void *kvaddr)
__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
 #ifdef CONFIG_DEBUG_HIGHMEM
BUG_ON(vaddr != __fix_to_virt(idx));
-   set_top_pte(vaddr, __pte(0));
+   set_fixmap_pte(idx, __pte(0));
 #else
(void) idx;  /* to kill a warning */
 #endif
@@ -119,9 +134,9 @@ void *kmap_atomic_pfn(unsigned long pfn)
idx = type + KM_TYPE_NR * smp_processor_id();
vaddr = __fix_to_virt(idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
-   BUG_ON(!pte_none(get_top_pte(vaddr)));
+   BUG_ON(!pte_none(*(fixmap_page_table + idx)));
 #endif
-   set_top_pte(vaddr, pfn_pte(pfn, kmap_prot));
+   set_fixmap_pte(idx, pfn_pte(pfn, kmap_prot));
 
return (void *)vaddr;
 }
@@ -133,5 +148,5 @@ struct page *kmap_atomic_to_page(const void *ptr)
if (vaddr < FIXADDR_START)
return virt_to_page(ptr);
 
-   return pte_page(get_top_pte(vaddr));
+   return pte_page(g

[PATCH v3 1/2] ARM : fixmap : remove FIX_KMAP_BEGIN and FIX_KMAP_END

2014-04-15 Thread Liu Hua
These two macros do not appear to be used by
architecture-independent code, and on ARM FIX_KMAP_BEGIN
equals zero.

This patch removes the two macros and instead uses
FIX_KMAP_NR_PTES for the number of PTEs belonging to the
fixmap mapping region. The code will become clearer
when I introduce a bugfix for the fixmap mapping region.
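
To show what the index mapping looks like once FIX_KMAP_BEGIN is gone,
here is a user-space sketch, not the kernel header. It assumes 4K pages
and full region addresses of 0xfff00000 to 0xfffe0000 (the archive shows
them truncated). The names idx_to_virt, virt_to_idx and
fix_to_virt_checked are hypothetical, and returning 0 replaces the
kernel's link-time error via __this_fixmap_does_not_exist().

```c
#include <assert.h>

#define PAGE_SHIFT		12		/* assumes 4K pages */
#define FIXADDR_START		0xfff00000UL	/* assumed full address */
#define FIXADDR_TOP		0xfffe0000UL	/* assumed full address */
#define FIXADDR_SIZE		(FIXADDR_TOP - FIXADDR_START)

/* Number of PTEs in the fixmap region; replaces FIX_KMAP_END. */
#define FIX_KMAP_NR_PTES	(FIXADDR_SIZE >> PAGE_SHIFT)

/* Index 0 maps to FIXADDR_START, so no FIX_KMAP_BEGIN offset is needed. */
#define idx_to_virt(x)	(FIXADDR_START + ((unsigned long)(x) << PAGE_SHIFT))

static unsigned long virt_to_idx(unsigned long vaddr)
{
	assert(vaddr >= FIXADDR_START && vaddr < FIXADDR_TOP);
	return (vaddr - FIXADDR_START) >> PAGE_SHIFT;
}

/* Returns 0 for an out-of-range index instead of failing at link time. */
static unsigned long fix_to_virt_checked(unsigned int idx)
{
	if (idx >= FIX_KMAP_NR_PTES)
		return 0;
	return idx_to_virt(idx);
}
```

Because the first index is 0, the same idx can be used directly as the
offset into the fixmap page table.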

Reviewed-by: Nicolas Pitre 
Signed-off-by: Liu Hua 
---
 arch/arm/include/asm/fixmap.h | 5 ++---
 arch/arm/mm/highmem.c | 6 +++---
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index bbae919..be55ebc 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -17,8 +17,7 @@
 #define FIXADDR_TOP0xfffeUL
 #define FIXADDR_SIZE   (FIXADDR_TOP - FIXADDR_START)
 
-#define FIX_KMAP_BEGIN 0
-#define FIX_KMAP_END   (FIXADDR_SIZE >> PAGE_SHIFT)
+#define FIX_KMAP_NR_PTES   (FIXADDR_SIZE >> PAGE_SHIFT)
 
 #define __fix_to_virt(x)   (FIXADDR_START + ((x) << PAGE_SHIFT))
 #define __virt_to_fix(x)   (((x) - FIXADDR_START) >> PAGE_SHIFT)
@@ -27,7 +26,7 @@ extern void __this_fixmap_does_not_exist(void);
 
 static inline unsigned long fix_to_virt(const unsigned int idx)
 {
-   if (idx >= FIX_KMAP_END)
+   if (idx >= FIX_KMAP_NR_PTES)
__this_fixmap_does_not_exist();
return __fix_to_virt(idx);
 }
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 21b9e1b..e05e8ad 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -63,7 +63,7 @@ void *kmap_atomic(struct page *page)
type = kmap_atomic_idx_push();
 
idx = type + KM_TYPE_NR * smp_processor_id();
-   vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+   vaddr = __fix_to_virt(idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
/*
 * With debugging enabled, kunmap_atomic forces that entry to 0.
@@ -94,7 +94,7 @@ void __kunmap_atomic(void *kvaddr)
if (cache_is_vivt())
__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
 #ifdef CONFIG_DEBUG_HIGHMEM
-   BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
+   BUG_ON(vaddr != __fix_to_virt(idx));
set_top_pte(vaddr, __pte(0));
 #else
(void) idx;  /* to kill a warning */
@@ -117,7 +117,7 @@ void *kmap_atomic_pfn(unsigned long pfn)
 
type = kmap_atomic_idx_push();
idx = type + KM_TYPE_NR * smp_processor_id();
-   vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+   vaddr = __fix_to_virt(idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
BUG_ON(!pte_none(get_top_pte(vaddr)));
 #endif
-- 
1.9.0



Re: [PATCH v2 2/2] ARM : change fixmap mapping region to support 32 CPUs

2014-04-15 Thread Liu hua
On 2014/4/14 21:50, Nicolas Pitre wrote:
> On Mon, 14 Apr 2014, Liu hua wrote:
> 
>> Yes, it seems that FIX_KMAP_BEGIN and FIX_KMAP_END are not as important for
>> ARM than that for other architectures (MIPS PowerPC x86), whose 
>> FIX_KMAP_BEGIN
>> is not 0. I will reminder this in my patch. Anyone who need them can get
>> imformantion there.
>>
>>
>> Now the new patchs are following. Maybe I sould resend the patch series
>> with a new tag. If It is time to do that. Can I add this information
>> "Reviewed-by: Nicolas Pitre " ?
>>
>> Thanks,
>> Liu Hua.
>>
>> -patch 1-
>>
>> Subject: [PATCH 1/2] ARM : fixmap : remove FIX_KMAP_BEGIN and FIX_KMAP_END
>>
>> It seems that these two variables are not used by non
>> architecture specific code. And on ARM FIX_KMAP_BEGIN
>> equals zero; FIX_KMAP_END is totally not used by the
>> kernel.
>>
>> This patch removes these two variables. The code will
>> become clear when I introduce a bugfix on fixmap mapping
> 
> s/clear/clearer/
> 
>> region.
>>
>> Signed-off-by: Liu Hua 
>> ---
>>  arch/arm/include/asm/fixmap.h | 12 
>>  arch/arm/mm/highmem.c |  6 +++---
>>  2 files changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
>> index bbae919..8675bb9 100644
>> --- a/arch/arm/include/asm/fixmap.h
>> +++ b/arch/arm/include/asm/fixmap.h
>> @@ -17,9 +17,13 @@
>>  #define FIXADDR_TOP 0xfffeUL
>>  #define FIXADDR_SIZE(FIXADDR_TOP - FIXADDR_START)
>>
>> -#define FIX_KMAP_BEGIN  0
>> -#define FIX_KMAP_END(FIXADDR_SIZE >> PAGE_SHIFT)
>> -
>> +/* Notice : FIX_KMAP_END and FIX_KMAP_BEGIN are removed.
>> + *
>> + * Instead, using FIX_KMAP_NR_PTES to tell the pte number
>> + * belonged to fixmap mapping region.
>> + *
>> + */
> 
> Please move this comment to the commit log instead.  This is not 
> important enough to occupy that much space in the code.
> 
>> +#define FIX_KMAP_NR_PTES(FIXADDR_SIZE >> PAGE_SHIFT)
>>  #define __fix_to_virt(x)(FIXADDR_START + ((x) << PAGE_SHIFT))
>>  #define __virt_to_fix(x)(((x) - FIXADDR_START) >> PAGE_SHIFT)
>>
>> @@ -27,7 +31,7 @@ extern void __this_fixmap_does_not_exist(void);
>>
>>  static inline unsigned long fix_to_virt(const unsigned int idx)
>>  {
>> -if (idx >= FIX_KMAP_END)
>> +if (idx >= FIX_KMAP_NR_PTES)
>>  __this_fixmap_does_not_exist();
>>  return __fix_to_virt(idx);
>>  }
>> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
>> index 21b9e1b..e05e8ad 100644
>> --- a/arch/arm/mm/highmem.c
>> +++ b/arch/arm/mm/highmem.c
>> @@ -63,7 +63,7 @@ void *kmap_atomic(struct page *page)
>>  type = kmap_atomic_idx_push();
>>
>>  idx = type + KM_TYPE_NR * smp_processor_id();
>> -vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
>> +vaddr = __fix_to_virt(idx);
>>  #ifdef CONFIG_DEBUG_HIGHMEM
>>  /*
>>   * With debugging enabled, kunmap_atomic forces that entry to 0.
>> @@ -94,7 +94,7 @@ void __kunmap_atomic(void *kvaddr)
>>  if (cache_is_vivt())
>>  __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
>>  #ifdef CONFIG_DEBUG_HIGHMEM
>> -BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
>> +BUG_ON(vaddr != __fix_to_virt(idx));
>>  set_top_pte(vaddr, __pte(0));
>>  #else
>>  (void) idx;  /* to kill a warning */
>> @@ -117,7 +117,7 @@ void *kmap_atomic_pfn(unsigned long pfn)
>>
>>  type = kmap_atomic_idx_push();
>>  idx = type + KM_TYPE_NR * smp_processor_id();
>> -vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
>> +vaddr = __fix_to_virt(idx);
>>  #ifdef CONFIG_DEBUG_HIGHMEM
>>  BUG_ON(!pte_none(get_top_pte(vaddr)));
>>  #endif
>> -- 
>> 1.9.0
> 
> With the above details fixed you may add:
> 
> Reviewed-by: Nicolas Pitre 
> 
>> -patch 2-
>>
>>
>>
>> Subject: [PATCH 2/2] ARM : change fixmap mapping region to support 32 CPUs
>>
>> In 32-bit ARM systems, the fixmap mapping region can support
>> no more than 14 CPUs(total: 896k; one CPU: 64K). And we can
>> configure NR_CPUS up to 32. So there is a mismatch.
>>
>> This patch moves fixmapping region downwards to region

Re: [PATCH 2/3] ARM : kdump : add arch_crash_save_vmcoreinfo

2014-04-14 Thread Liu hua
On 2014/4/14 19:37, Will Deacon wrote:
> On Thu, Mar 27, 2014 at 08:00:39AM +0000, Liu Hua wrote:
>> For vmcore generated by LPAE enabled kernel, user space
>> utility such as crash needs additional infomation to
>> parse.
>>
>> So this patch add arch_crash_save_vmcoreinfo as what PAE enabled
>> i386 linux does.
> 
> Looks sensible to me:
> 
>   Reviewed-by: Will Deacon 
> 
> Will

Hi Will,

Thanks for your reply. What about the first patch of the
series, "[PATCH 1/3] ARM : kdump : Add LPAE support"?

Currently ARM Linux simply returns an error when parsing the vmcore of an
LPAE-enabled kernel, because commit 4b3bf7ae provides a
vmcore_elf64_check_arch() that is always zero. So if we want to parse an
LPAE-enabled kernel, we need that patch.

Thanks,
Liu Hua
> 
>> Signed-off-by: Liu Hua 
>> To: Russell King 
>> Cc: Stephen Warren  
>> Cc: Will Deacon 
>> Cc: Vijaya Kumar K 
>> Cc: 
>> Cc: 
>> Cc: 
>> ---
>>  arch/arm/kernel/machine_kexec.c | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/arm/kernel/machine_kexec.c 
>> b/arch/arm/kernel/machine_kexec.c
>> index f0d180d..8cf0996 100644
>> --- a/arch/arm/kernel/machine_kexec.c
>> +++ b/arch/arm/kernel/machine_kexec.c
>> @@ -184,3 +184,10 @@ void machine_kexec(struct kimage *image)
>>  
>>  soft_restart(reboot_entry_phys);
>>  }
>> +
>> +void arch_crash_save_vmcoreinfo(void)
>> +{
>> +#ifdef CONFIG_ARM_LPAE
>> +VMCOREINFO_CONFIG(ARM_LPAE);
>> +#endif
>> +}
>> -- 
>> 1.9.0
>>
>>
> 
> .
> 




Re: [PATCH v2 2/2] ARM : change fixmap mapping region to support 32 CPUs

2014-04-14 Thread Liu hua
On 2014/4/14 1:34, Nicolas Pitre wrote:
> On Sun, 13 Apr 2014, Liu hua wrote:
> 
>> Hi Nicolas.
>>
>> Sure, your suggestion made my patch looks better. How about that :
>>
>> Thanks,
>> Liu Hua
> 
> There is something else that bothers me.
> 
>> +unsigned long idx = __virt_to_fix(vaddr);
>> +idx -= FIX_KMAP_BEGIN;
>> +return *(fixmap_page_table + idx);
> 
> FIX_KMAP_BEGIN represents the starting point for mapping fixmap indices 
> to page table entries.  So here you should _add_ the 
> FIX_KMAP_BEGIN offset to the page table index not substract it.  
> Currently FIX_KMAP_BEGIN is defined to 0 but still.
> 
>> +}
>> +
>>  #ifdef CONFIG_DEBUG_HIGHMEM
>> -BUG_ON(!pte_none(get_top_pte(vaddr)));
>> +BUG_ON(!pte_none(*(fixmap_page_table + idx)));
> 
> Ditto here.
> 
> In fact, this FIX_KMAP_BEGIN is only creating more confusing code for 
> nothing as we define it to 0 anyway, and it is not used by non 
> architecture specific code.  So I'd suggest you create a patch to be 
> applied before this one that simply gets rid of FIX_KMAP_BEGIN and 
> FIX_KMAP_END altogether.  We could reintroduce them back if ever they're 
> needed.
> 
> 
> Nicolas
> 
Hi Nicolas,

Yes, it seems that FIX_KMAP_BEGIN and FIX_KMAP_END are not as important for
ARM as they are for other architectures (MIPS, PowerPC, x86), whose FIX_KMAP_BEGIN
is not 0. I will mention this in my patch, so that anyone who needs them can
find the information there.


The new patches follow. Maybe I should resend the patch series
with a new version tag. When it is time to do that, can I add
"Reviewed-by: Nicolas Pitre " ?

Thanks,
Liu Hua.

-patch 1-

Subject: [PATCH 1/2] ARM : fixmap : remove FIX_KMAP_BEGIN and FIX_KMAP_END

It seems that these two variables are not used by non
architecture specific code. And on ARM FIX_KMAP_BEGIN
equals zero; FIX_KMAP_END is totally not used by the
kernel.

This patch removes these two variables. The code will
become clear when I introduce a bugfix on fixmap mapping
region.

Signed-off-by: Liu Hua 
---
 arch/arm/include/asm/fixmap.h | 12 
 arch/arm/mm/highmem.c |  6 +++---
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index bbae919..8675bb9 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -17,9 +17,13 @@
 #define FIXADDR_TOP0xfffeUL
 #define FIXADDR_SIZE   (FIXADDR_TOP - FIXADDR_START)

-#define FIX_KMAP_BEGIN 0
-#define FIX_KMAP_END   (FIXADDR_SIZE >> PAGE_SHIFT)
-
+/* Notice : FIX_KMAP_END and FIX_KMAP_BEGIN are removed.
+ *
+ * Instead, using FIX_KMAP_NR_PTES to tell the pte number
+ * belonged to fixmap mapping region.
+ *
+ */
+#define FIX_KMAP_NR_PTES   (FIXADDR_SIZE >> PAGE_SHIFT)
 #define __fix_to_virt(x)   (FIXADDR_START + ((x) << PAGE_SHIFT))
 #define __virt_to_fix(x)   (((x) - FIXADDR_START) >> PAGE_SHIFT)

@@ -27,7 +31,7 @@ extern void __this_fixmap_does_not_exist(void);

 static inline unsigned long fix_to_virt(const unsigned int idx)
 {
-   if (idx >= FIX_KMAP_END)
+   if (idx >= FIX_KMAP_NR_PTES)
__this_fixmap_does_not_exist();
return __fix_to_virt(idx);
 }
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 21b9e1b..e05e8ad 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -63,7 +63,7 @@ void *kmap_atomic(struct page *page)
type = kmap_atomic_idx_push();

idx = type + KM_TYPE_NR * smp_processor_id();
-   vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+   vaddr = __fix_to_virt(idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
/*
 * With debugging enabled, kunmap_atomic forces that entry to 0.
@@ -94,7 +94,7 @@ void __kunmap_atomic(void *kvaddr)
if (cache_is_vivt())
__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
 #ifdef CONFIG_DEBUG_HIGHMEM
-   BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
+   BUG_ON(vaddr != __fix_to_virt(idx));
set_top_pte(vaddr, __pte(0));
 #else
(void) idx;  /* to kill a warning */
@@ -117,7 +117,7 @@ void *kmap_atomic_pfn(unsigned long pfn)

type = kmap_atomic_idx_push();
idx = type + KM_TYPE_NR * smp_processor_id();
-   vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+   vaddr = __fix_to_virt(idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
BUG_ON(!pte_none(get_top_pte(vaddr)));
 #endif
-- 
1.9.0




-patch 2-



Subject: [PATCH 2/2] ARM : change fixmap mapping region to support 32 CPUs

In 32-bit ARM systems, the fixmap mapping region can support
no more than 14 CPUs(total: 896k; one CPU: 64K). And we c

Re: [PATCH v2 2/2] ARM : change fixmap mapping region to support 32 CPUs

2014-04-13 Thread Liu hua
On 2014/4/12 11:26, Nicolas Pitre wrote:
> On Fri, 11 Apr 2014, Liu Hua wrote:
> 
>> In 32-bit ARM systems, the fixmap mapping region can support
>> no more than 14 CPUs(total: 896k; one CPU: 64K). And we can
>> configure NR_CPUS up to 32. So there is a mismatch.
>>
>> This patch moves fixmapping region downwards to region
>> 0xffc0-0xffe0 . Then the fixmap mapping region can
>> support up to 32 CPUs.
>>
>> Signed-off-by: Liu Hua 
> 
> Comments below.
> 
>> ---
>>  Documentation/arm/memory.txt   |  2 +-
>>  arch/arm/include/asm/fixmap.h  |  4 ++--
>>  arch/arm/include/asm/highmem.h |  1 +
>>  arch/arm/mm/highmem.c  | 10 +-
>>  arch/arm/mm/mm.h   |  7 +++
>>  arch/arm/mm/mmu.c  |  4 
>>  mm/highmem.c   |  1 +
>>  7 files changed, 21 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
>> index 8a361c0..4bca737 100644
>> --- a/Documentation/arm/memory.txt
>> +++ b/Documentation/arm/memory.txt
>> @@ -41,7 +41,7 @@ fffe8000   fffeDTCM mapping area for platforms 
>> with
>>  fffefffe7fffITCM mapping area for platforms with
>>  ITCM mounted inside the CPU.
>>  
>> -fff0fffdFixmap mapping region.  Addresses provided
>> +ffc0ffdfFixmap mapping region.  Addresses provided
>>  by fix_to_virt() will be located here.
>>  
>>  ff00ffbfReserved for future expansion of DMA
>> diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
>> index bbae919..014a70d 100644
>> --- a/arch/arm/include/asm/fixmap.h
>> +++ b/arch/arm/include/asm/fixmap.h
>> @@ -13,8 +13,8 @@
>>   * 0xfffe and 0xfffe.
>>   */
>>  
>> -#define FIXADDR_START   0xfff0UL
>> -#define FIXADDR_TOP 0xfffeUL
>> +#define FIXADDR_START   0xffc0UL
>> +#define FIXADDR_TOP 0xffe0UL
>>  #define FIXADDR_SIZE(FIXADDR_TOP - FIXADDR_START)
>>  
>>  #define FIX_KMAP_BEGIN  0
>> diff --git a/arch/arm/include/asm/highmem.h b/arch/arm/include/asm/highmem.h
>> index 91b99ab..5355795 100644
>> --- a/arch/arm/include/asm/highmem.h
>> +++ b/arch/arm/include/asm/highmem.h
>> @@ -18,6 +18,7 @@
>>  } while (0)
>>  
>>  extern pte_t *pkmap_page_table;
>> +extern pte_t *fixmap_page_table;
>>  
>>  extern void *kmap_high(struct page *page);
>>  extern void kunmap_high(struct page *page);
>> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
>> index 21b9e1b..9bc8988 100644
>> --- a/arch/arm/mm/highmem.c
>> +++ b/arch/arm/mm/highmem.c
>> @@ -69,14 +69,14 @@ void *kmap_atomic(struct page *page)
>>   * With debugging enabled, kunmap_atomic forces that entry to 0.
>>   * Make sure it was indeed properly unmapped.
>>   */
>> -BUG_ON(!pte_none(get_top_pte(vaddr)));
>> +BUG_ON(!pte_none(*(fixmap_page_table + idx)));
>>  #endif
>>  /*
>>   * When debugging is off, kunmap_atomic leaves the previous mapping
>>   * in place, so the contained TLB flush ensures the TLB is updated
>>   * with the new mapping.
>>   */
>> -set_top_pte(vaddr, mk_pte(page, kmap_prot));
>> +set_fixmap_pte(idx, mk_pte(page, kmap_prot));
>>  
>>  return (void *)vaddr;
>>  }
>> @@ -95,7 +95,7 @@ void __kunmap_atomic(void *kvaddr)
>>  __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
>>  #ifdef CONFIG_DEBUG_HIGHMEM
>>  BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
>> -set_top_pte(vaddr, __pte(0));
>> +set_fixmap_pte(idx, __pte(0));
>>  #else
>>  (void) idx;  /* to kill a warning */
>>  #endif
>> @@ -119,9 +119,9 @@ void *kmap_atomic_pfn(unsigned long pfn)
>>  idx = type + KM_TYPE_NR * smp_processor_id();
>>  vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
>>  #ifdef CONFIG_DEBUG_HIGHMEM
>> -BUG_ON(!pte_none(get_top_pte(vaddr)));
>> +BUG_ON(!pte_none(*(fixmap_page_table + idx)));
>>  #endif
>> -set_top_pte(vaddr, pfn_pte(pfn, kmap_prot));
>> +set_fixmap_pte(idx, pfn_pte(pfn, kmap_prot));
>>  
>>  return (void *)vaddr;
>>  }
>> diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
>> index 7ea641b..3460d73 100644
>> --- a/arch/arm/mm/mm.h
>> +++ b/a

Re: [PATCH v2 1/2] ARM : DMA : remove useless information about DMA

2014-04-13 Thread Liu hua
On 2014/4/12 22:32, Nicolas Pitre wrote:
> On Sat, 12 Apr 2014, Liu hua wrote:
> 
>> Hi Nicolas,
>>
>> Your version is better. you tell me this in the former letters.
>> So I am very sorry to forget to check that.
>>
>> May be I should remake this second patch to fit your change. What do
>> you think about that patch?
> 
> You may simply drop your first patch and only keep the other two.
> 
Sure. But your patch and my second one both change Documentation/arm/memory.txt.
I am afraid there will be conflicts if someone tests my patch before
yours gets into mainline.

Maybe I can split the documentation change into a separate patch and send it
after yours is in mainline.

Liu Hua
> 
> Nicolas
> 
> 




Re: [PATCH v2 1/2] ARM : DMA : remove useless information about DMA

2014-04-11 Thread Liu hua
Hi Nicolas,

Your version is better. You told me about it in an earlier mail,
and I am sorry that I forgot to check.

Maybe I should rework my second patch to fit your change. What do
you think about that patch?

Liu Hua

On 2014/4/12 11:12, Nicolas Pitre wrote:
> On Fri, 11 Apr 2014, Liu Hua wrote:
> 
>> Because commit e9da6e9905e6 has remove custom consistent dma
>> region. So the related variable and document should be removed
>>
>> Signed-off-by: Liu Hua 
> 
> Acked-by: Nicolas Pitre 
> 
> Incidentally I sent an identical patch to RMK's patch system:
> 
> http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=8023/1
> 
> Either version is fine with me.
> 
> 
> Nicolas
> 
> 




[PATCH v2 1/2] ARM : DMA : remove useless information about DMA

2014-04-11 Thread Liu Hua
Commit e9da6e9905e6 removed the custom consistent DMA region,
so the related variable and documentation should be removed as well.

Signed-off-by: Liu Hua 
---
 Documentation/arm/memory.txt  | 4 
 arch/arm/include/asm/memory.h | 2 --
 2 files changed, 6 deletions(-)

diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 4bfb9ff..8a361c0 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -44,10 +44,6 @@ fffe fffe7fffITCM mapping area for platforms 
with
 fff0   fffdFixmap mapping region.  Addresses provided
by fix_to_virt() will be located here.
 
-ffc0   ffefDMA memory mapping region.  Memory returned
-   by the dma_alloc_xxx functions will be
-   dynamically mapped here.
-
 ff00   ffbfReserved for future expansion of DMA
mapping region.
 
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 02fa255..2b75146 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -83,8 +83,6 @@
  */
 #define IOREMAP_MAX_ORDER  24
 
-#define CONSISTENT_END (0xffe0UL)
-
 #else /* CONFIG_MMU */
 
 /*
-- 
1.9.0



[PATCH v2 2/2] ARM : change fixmap mapping region to support 32 CPUs

2014-04-11 Thread Liu Hua
On 32-bit ARM systems, the fixmap mapping region can support
no more than 14 CPUs (total: 896K; one CPU: 64K), yet NR_CPUS can
be configured up to 32. So there is a mismatch.

This patch moves the fixmap mapping region down to
0xffc00000-0xffe00000. The fixmap mapping region can then
support up to 32 CPUs.
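
The CPU counts above follow from simple arithmetic. A minimal sketch,
assuming 4K pages and 16 kmap slots per CPU (which gives the 64K per CPU
quoted above), and assuming the full 32-bit addresses of the old and new
regions; fixmap_max_cpus() is a hypothetical helper, not a kernel function:

```c
#define PAGE_SHIFT  12  /* assumption: 4K pages */
#define KM_TYPE_NR  16  /* assumption: 16 slots per CPU, i.e. 64K per CPU */

/* Number of CPUs whose per-CPU kmap windows fit in [start, top). */
static unsigned long fixmap_max_cpus(unsigned long start, unsigned long top)
{
    unsigned long per_cpu = (unsigned long)KM_TYPE_NR << PAGE_SHIFT;

    return (top - start) / per_cpu;
}
```

With the old region (0xfff00000 to 0xfffe0000, 896K) this yields 14, and
with the new region (0xffc00000 to 0xffe00000, 2048K) it yields 32.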

Signed-off-by: Liu Hua 
---
 Documentation/arm/memory.txt   |  2 +-
 arch/arm/include/asm/fixmap.h  |  4 ++--
 arch/arm/include/asm/highmem.h |  1 +
 arch/arm/mm/highmem.c  | 10 +-
 arch/arm/mm/mm.h   |  7 +++
 arch/arm/mm/mmu.c  |  4 
 mm/highmem.c   |  1 +
 7 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 8a361c0..4bca737 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -41,7 +41,7 @@ fffe8000  fffeDTCM mapping area for platforms 
with
 fffe   fffe7fffITCM mapping area for platforms with
ITCM mounted inside the CPU.
 
-fff0   fffdFixmap mapping region.  Addresses provided
+ffc0   ffdfFixmap mapping region.  Addresses provided
by fix_to_virt() will be located here.
 
 ff00   ffbfReserved for future expansion of DMA
diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index bbae919..014a70d 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -13,8 +13,8 @@
  * 0xfffe and 0xfffe.
  */
 
-#define FIXADDR_START  0xfff0UL
-#define FIXADDR_TOP0xfffeUL
+#define FIXADDR_START  0xffc0UL
+#define FIXADDR_TOP0xffe0UL
 #define FIXADDR_SIZE   (FIXADDR_TOP - FIXADDR_START)
 
 #define FIX_KMAP_BEGIN 0
diff --git a/arch/arm/include/asm/highmem.h b/arch/arm/include/asm/highmem.h
index 91b99ab..5355795 100644
--- a/arch/arm/include/asm/highmem.h
+++ b/arch/arm/include/asm/highmem.h
@@ -18,6 +18,7 @@
} while (0)
 
 extern pte_t *pkmap_page_table;
+extern pte_t *fixmap_page_table;
 
 extern void *kmap_high(struct page *page);
 extern void kunmap_high(struct page *page);
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index 21b9e1b..9bc8988 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -69,14 +69,14 @@ void *kmap_atomic(struct page *page)
 * With debugging enabled, kunmap_atomic forces that entry to 0.
 * Make sure it was indeed properly unmapped.
 */
-   BUG_ON(!pte_none(get_top_pte(vaddr)));
+   BUG_ON(!pte_none(*(fixmap_page_table + idx)));
 #endif
/*
 * When debugging is off, kunmap_atomic leaves the previous mapping
 * in place, so the contained TLB flush ensures the TLB is updated
 * with the new mapping.
 */
-   set_top_pte(vaddr, mk_pte(page, kmap_prot));
+   set_fixmap_pte(idx, mk_pte(page, kmap_prot));
 
return (void *)vaddr;
 }
@@ -95,7 +95,7 @@ void __kunmap_atomic(void *kvaddr)
__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
 #ifdef CONFIG_DEBUG_HIGHMEM
BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
-   set_top_pte(vaddr, __pte(0));
+   set_fixmap_pte(idx, __pte(0));
 #else
(void) idx;  /* to kill a warning */
 #endif
@@ -119,9 +119,9 @@ void *kmap_atomic_pfn(unsigned long pfn)
idx = type + KM_TYPE_NR * smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
 #ifdef CONFIG_DEBUG_HIGHMEM
-   BUG_ON(!pte_none(get_top_pte(vaddr)));
+   BUG_ON(!pte_none(*(fixmap_page_table + idx)));
 #endif
-   set_top_pte(vaddr, pfn_pte(pfn, kmap_prot));
+   set_fixmap_pte(idx, pfn_pte(pfn, kmap_prot));
 
return (void *)vaddr;
 }
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index 7ea641b..3460d73 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -1,6 +1,7 @@
 #ifdef CONFIG_MMU
 #include 
 #include 
+#include 
 
 /* the upper-most page table pointer */
 extern pmd_t *top_pmd;
@@ -25,6 +26,12 @@ static inline void set_top_pte(unsigned long va, pte_t pte)
local_flush_tlb_kernel_page(va);
 }
 
+static inline void set_fixmap_pte(int idx, pte_t pte)
+{
+   unsigned long vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+   set_pte_ext(fixmap_page_table + idx, pte, 0);
+   local_flush_tlb_kernel_page(vaddr);
+}
 static inline pte_t get_top_pte(unsigned long va)
 {
pte_t *ptep = pte_offset_kernel(top_pmd, va);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b68c6b2..09c0a16 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "mm.h"
 #include "tcm.h"
@@ -1359,6 +1360,9 @@ static void __init kmap_init(void)
 #ifdef CONFIG_HIGHMEM
pkmap_page_table = early_pte_alloc(

[PATCH v2 0/2] change ARM linux memory layout to support 32 CPUs

2014-04-11 Thread Liu Hua
Hi Nicolas or Russell,

This patch series changes the fixmap mapping region to support 32 CPUs.
"top_pmd" covers 0xffe00000 - 0xffffffff (2M), and part of that range
is used by the vector table. So I move the fixmap region down to
0xffc00000 - 0xffdfffff.


I have tested the patches on ARM A9 (2 CPUs) and ARM A15 (16 CPUs) platforms.

BTW, since NR_CPUS can be configured up to 32, we need 2048K
at most. But for ARM systems with fewer CPUs, this wastes
virtual address space. So should we size the region according to
NR_CPUS, as MIPS Linux does?
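
To make the question concrete, here is a rough sketch of what sizing the
region by CPU count could look like. It assumes 4K pages and 16 kmap slots
per CPU; fixmap_size_for() and fixmap_start_for() are hypothetical names
used only for illustration and are not taken from the ARM or MIPS code:

```c
#define PAGE_SHIFT  12  /* assumption: 4K pages */
#define KM_TYPE_NR  16  /* assumption: 16 kmap slots per CPU */

/* Bytes of fixmap needed for nr_cpus, one 64K kmap window per CPU. */
static unsigned long fixmap_size_for(unsigned int nr_cpus)
{
    return ((unsigned long)nr_cpus * KM_TYPE_NR) << PAGE_SHIFT;
}

/* Lowest address of a region of that size ending at 'top'. */
static unsigned long fixmap_start_for(unsigned long top, unsigned int nr_cpus)
{
    return top - fixmap_size_for(nr_cpus);
}
```

For 32 CPUs this gives the full 2048K; a 4-CPU configuration would need
only 256K of virtual address space.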

Changes from v1:
---
- changed documentation for ARM linux memory layout.
- moved fixmap mapping region, not just extended.

Liu Hua (2):
  ARM : DMA : remove useless information about DMA
  ARM : extend fixmap mapping region to support 32 CPUs

 Documentation/arm/memory.txt   |  8 ++--
 arch/arm/include/asm/fixmap.h  |  4 ++--
 arch/arm/include/asm/highmem.h |  1 +
 arch/arm/include/asm/memory.h  |  2 --
 arch/arm/mm/highmem.c  | 10 +-
 arch/arm/mm/mm.h   |  7 +++
 arch/arm/mm/mmu.c  |  4 
 mm/highmem.c   |  1 +
 8 files changed, 22 insertions(+), 15 deletions(-)

-- 
1.9.0



[PATCH v3] hung_task : check the value of "sysctl_hung_task_timeout_sec"

2014-03-28 Thread Liu Hua
As sysctl_hung_task_timeout_sec is unsigned long, when this value is
larger than LONG_MAX/HZ, the function schedule_timeout_interruptible in
watchdog will return immediately without sleeping and print:

[  205.452934] schedule_timeout: wrong timeout value ff83

The function watchdog will then call schedule_timeout_interruptible
again and again, and the screen will be filled with
"schedule_timeout: wrong timeout value ff83".

This patch adds a range check in sysctl, so that the function
schedule_timeout_interruptible always gets a valid parameter.
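
The intended range check can be sketched in user space as follows. This
is only an illustration: HZ is assumed to be 100, and sketch_set_timeout()
with its 0 / -1 return convention is hypothetical. It is not how the sysctl
handler itself reports an out-of-range write.

```c
#include <limits.h>

#define HZ 100  /* assumption: the real value depends on the kernel config */

/*
 * Store 'secs' in *timeout only if secs * HZ still fits in a long.
 * Returns 0 on success, -1 if the value is out of range (in which
 * case *timeout is left unchanged).
 */
static int sketch_set_timeout(unsigned long *timeout, unsigned long secs)
{
    if (secs > LONG_MAX / HZ)
        return -1;
    *timeout = secs;
    return 0;
}
```

Any accepted value can then be multiplied by HZ without producing a
timeout that schedule_timeout_interruptible would treat as negative.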

Signed-off-by: Liu Hua 
Tested-by: Satoru Takeuchi 
Cc: sta...@vger.kernel.org # 3.4+
---
 Documentation/sysctl/kernel.txt | 1 +
 kernel/sysctl.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index e55124e..855d9b3 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -317,6 +317,7 @@ for more than this value report a warning.
 This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
 
 0: means infinite timeout - no checking done.
+Possible values to set are in range {0..LONG_MAX/HZ}.
 
 ==
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 49e13e1..aae21e8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -144,6 +144,11 @@ static int min_percpu_pagelist_fract = 8;
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
 
+/*this is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs */
+#ifdef CONFIG_DETECT_HUNG_TASK
+static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
+#endif
+
 #ifdef CONFIG_INOTIFY_USER
 #include 
 #endif
@@ -995,6 +1000,7 @@ static struct ctl_table kern_table[] = {
.maxlen = sizeof(unsigned long),
.mode   = 0644,
.proc_handler   = proc_dohung_task_timeout_secs,
+   .extra2 = &hung_task_timeout_max,
},
{
.procname   = "hung_task_warnings",
-- 
1.9.0



[PATCH 3/3] kexec : ARM : add LPAE support

2014-03-27 Thread Liu Hua
For 32-bit ARM systems with CONFIG_ARM_LPAE=y, when the kexec utility
loads the crash kernel, a 32-bit ELF header is not enough if the
physical address exceeds 4G.

This patch checks whether the largest physical address of the system
exceeds 4G. If so, kexec creates a 64-bit ELF header. Otherwise it
creates a 32-bit ELF header.
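
The decision described above can be sketched as a small stand-alone
function. The names below (struct mem_range, pick_elf_class and the
SKETCH_ELFCLASS constants) are hypothetical and only mirror the idea; the
sketch scans every range for the highest end address and compares it with
UINT32_MAX, whereas the patch itself looks at the last range and compares
against ULONG_MAX:

```c
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as ELFCLASS32 / ELFCLASS64 in the ELF specification. */
enum { SKETCH_ELFCLASS32 = 1, SKETCH_ELFCLASS64 = 2 };

struct mem_range {
    uint64_t start;
    uint64_t end;
};

/* Pick the ELF class from the highest physical address in use. */
static int pick_elf_class(const struct mem_range *ranges, size_t n)
{
    uint64_t max_end = 0;

    for (size_t i = 0; i < n; i++)
        if (ranges[i].end > max_end)
            max_end = ranges[i].end;

    /* Anything above 4G cannot be described by 32-bit ELF fields. */
    return max_end > UINT32_MAX ? SKETCH_ELFCLASS64 : SKETCH_ELFCLASS32;
}
```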

Signed-off-by: Liu Hua 
To: Simon Horman 
Cc: Vivek Goyal 
Cc: 
Cc: 
Cc: 
---
 kexec/arch/arm/crashdump-arm.c | 23 ---
 kexec/kexec-iomem.c|  8 
 kexec/kexec.h  |  4 ++--
 3 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/kexec/arch/arm/crashdump-arm.c b/kexec/arch/arm/crashdump-arm.c
index 0cd6935..d1133cd 100644
--- a/kexec/arch/arm/crashdump-arm.c
+++ b/kexec/arch/arm/crashdump-arm.c
@@ -20,6 +20,7 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
+#include 
 #include 
 #include 
 #include 
@@ -75,8 +76,8 @@ unsigned long phys_offset;
  * regions is placed in @crash_memory_nr_ranges.
  */
 static int crash_range_callback(void *UNUSED(data), int UNUSED(nr),
-   char *str, unsigned long base,
-   unsigned long length)
+   char *str, unsigned long long base,
+   unsigned long long length)
 {
struct memory_range *range;
 
@@ -276,6 +277,7 @@ int load_crashdump_segments(struct kexec_info *info, char 
*mod_cmdline)
unsigned long bufsz;
void *buf;
int err;
+   int last_ranges;
 
/*
 * First fetch all the memory (RAM) ranges that we are going to pass to
@@ -292,10 +294,25 @@ int load_crashdump_segments(struct kexec_info *info, char 
*mod_cmdline)
phys_offset = usablemem_rgns.ranges->start;
dbgprintf("phys_offset: %#lx\n", phys_offset);
 
-   err = crash_create_elf32_headers(info, &elf_info,
+   last_ranges = usablemem_rgns.size - 1;
+   if (last_ranges < 0)
+   last_ranges = 0;
+
+   if (crash_memory_ranges[last_ranges].end > ULONG_MAX) {
+
+   /* for support arm LPAE and arm64 */
+   elf_info.class = ELFCLASS64;
+
+   err = crash_create_elf64_headers(info, &elf_info,
 usablemem_rgns.ranges,
 usablemem_rgns.size, &buf, &bufsz,
 ELF_CORE_HEADER_ALIGN);
+   } else {
+   err = crash_create_elf32_headers(info, &elf_info,
+usablemem_rgns.ranges,
+usablemem_rgns.size, &buf, &bufsz,
+ELF_CORE_HEADER_ALIGN);
+   }
if (err)
return err;
 
diff --git a/kexec/kexec-iomem.c b/kexec/kexec-iomem.c
index 0396713..485a2e8 100644
--- a/kexec/kexec-iomem.c
+++ b/kexec/kexec-iomem.c
@@ -26,8 +26,8 @@ int kexec_iomem_for_each_line(char *match,
  int (*callback)(void *data,
  int nr,
  char *str,
- unsigned long base,
- unsigned long length),
+ unsigned long long base,
+ unsigned long long length),
  void *data)
 {
const char *iomem = proc_iomem();
@@ -65,8 +65,8 @@ int kexec_iomem_for_each_line(char *match,
 
 static int kexec_iomem_single_callback(void *data, int nr,
   char *UNUSED(str),
-  unsigned long base,
-  unsigned long length)
+  unsigned long long base,
+  unsigned long long length)
 {
struct memory_range *range = data;
 
diff --git a/kexec/kexec.h b/kexec/kexec.h
index 2bd6e96..ecc4681 100644
--- a/kexec/kexec.h
+++ b/kexec/kexec.h
@@ -279,8 +279,8 @@ int kexec_iomem_for_each_line(char *match,
  int (*callback)(void *data,
  int nr,
  char *str,
- unsigned long base,
- unsigned long length),
+ unsigned long long base,
+ unsigned long long length),
  void *data);
 int parse_iomem_single(char *str, uint64_t *start, uint64_t *end);
 const char * proc_iomem(void);
-- 
1.9.0


[PATCH 2/3] ARM : kdump : add arch_crash_save_vmcoreinfo

2014-03-27 Thread Liu Hua
For a vmcore generated by an LPAE-enabled kernel, user space
utilities such as crash need additional information to
parse it.

So this patch adds arch_crash_save_vmcoreinfo, as PAE-enabled
i386 Linux does.

Signed-off-by: Liu Hua 
To: Russell King 
Cc: Stephen Warren  
Cc: Will Deacon 
Cc: Vijaya Kumar K 
Cc: 
Cc: 
Cc: 
---
 arch/arm/kernel/machine_kexec.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index f0d180d..8cf0996 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -184,3 +184,10 @@ void machine_kexec(struct kimage *image)
 
soft_restart(reboot_entry_phys);
 }
+
+void arch_crash_save_vmcoreinfo(void)
+{
+#ifdef CONFIG_ARM_LPAE
+   VMCOREINFO_CONFIG(ARM_LPAE);
+#endif
+}
-- 
1.9.0



[PATCH 1/3] ARM : kdump : Add LPAE support

2014-03-27 Thread Liu Hua
With CONFIG_ARM_LPAE=y, memory in 32-bit ARM systems can exceed
4G. So if we use kdump on such systems, the capture kernel
should parse a 64-bit ELF header (parse_crash_elf64_headers).

This currently fails because ARM Linux does not
supply the related check function.

This patch adds the check functions for the elf64 header.

Signed-off-by: Liu Hua 
To: Russell King 
Cc: Dan Aloni 
Cc: Catalin Marinas 
Cc: 
Cc: 
Cc: 
---
 arch/arm/include/asm/elf.h |  5 -
 arch/arm/kernel/elf.c  | 33 +
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index f4b46d3..6e02a6d 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -90,14 +90,17 @@ typedef struct user_fp elf_fpregset_t;
 extern char elf_platform[];
 
 struct elf32_hdr;
+struct elf64_hdr;
 
 /*
  * This is used to ensure we don't load something for the wrong architecture.
  */
 extern int elf_check_arch(const struct elf32_hdr *);
+extern int elf_check_arch_64(const struct elf64_hdr *);
 #define elf_check_arch elf_check_arch
 
-#define vmcore_elf64_check_arch(x) (0)
+#define vmcore_elf64_check_arch(x) (elf_check_arch_64(x) || \
+   vmcore_elf_check_arch_cross(x))
 
 extern int arm_elf_read_implies_exec(const struct elf32_hdr *, int);
 #define elf_read_implies_exec(ex,stk) arm_elf_read_implies_exec(&(ex), stk)
diff --git a/arch/arm/kernel/elf.c b/arch/arm/kernel/elf.c
index d0d1e83..452086a 100644
--- a/arch/arm/kernel/elf.c
+++ b/arch/arm/kernel/elf.c
@@ -38,6 +38,39 @@ int elf_check_arch(const struct elf32_hdr *x)
 }
 EXPORT_SYMBOL(elf_check_arch);
 
+int elf_check_arch_64(const struct elf64_hdr *x)
+{
+   unsigned int eflags;
+
+   /* Make sure it's an ARM executable */
+   if (x->e_machine != EM_ARM)
+   return 0;
+
+   /* Make sure the entry address is reasonable */
+   if (x->e_entry & 1) {
+   if (!(elf_hwcap & HWCAP_THUMB))
+   return 0;
+   } else if (x->e_entry & 3)
+   return 0;
+
+   eflags = x->e_flags;
+   if ((eflags & EF_ARM_EABI_MASK) == EF_ARM_EABI_UNKNOWN) {
+   unsigned int flt_fmt;
+
+   /* APCS26 is only allowed if the CPU supports it */
+   if ((eflags & EF_ARM_APCS_26) && !(elf_hwcap & HWCAP_26BIT))
+   return 0;
+
+   flt_fmt = eflags & (EF_ARM_VFP_FLOAT | EF_ARM_SOFT_FLOAT);
+
+   /* VFP requires the supporting code */
+   if (flt_fmt == EF_ARM_VFP_FLOAT && !(elf_hwcap & HWCAP_VFP))
+   return 0;
+   }
+   return 1;
+}
+EXPORT_SYMBOL(elf_check_arch_64);
+
 void elf_set_personality(const struct elf32_hdr *x)
 {
unsigned int eflags = x->e_flags;
-- 
1.9.0



Re: [PATCH v2] hung_task : check the value of "sysctl_hung_task_timeout_sec"

2014-03-26 Thread Liu hua
On 2014/3/26 0:25, Satoru Takeuchi wrote:
> At Tue, 25 Mar 2014 16:58:58 +0800,
> Liu hua wrote:
>>
>> On 2014/3/24 4:50, Satoru Takeuchi wrote:
>>> At Sun, 23 Mar 2014 15:54:04 +0800,
>>> Liu Hua wrote:
>>>>
>>>> As sysctl_hung_task_timeout_sec is unsigned long, when this value is
>>>> larger then LONG_MAX/HZ, the function schedule_timeout_interruptible in
>>>> watchdog will return immediately without sleep and with print :
>>>>
>>>> [  205.452934] schedule_timeout: wrong timeout value ff83
>>>>
>>>> and then the funtion watchdog will call schedule_timeout_interruptible 
>>>> again
>>>> and again. The screen will be filled with
>>>>"schedule_timeout: wrong timeout value ff83"
>>>>
>>>> This patch does some check and correction in timeout_jiffies, to let the
>>>> function schedule_timeout_interruptible allways get the valid parameter.
>>>>
>>>> Cc: 
>>>> Signed-off-by: Liu Hua 
>>>> ---
>>>>  kernel/hung_task.c | 8 ++--
>>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>>>> index 6df6149..f992286 100644
>>>> --- a/kernel/hung_task.c
>>>> +++ b/kernel/hung_task.c
>>>> @@ -174,8 +174,12 @@ static void check_hung_uninterruptible_tasks(unsigned 
>>>> long timeout)
>>>>  
>>>>  static unsigned long timeout_jiffies(unsigned long timeout)
>>>>  {
>>>> -  /* timeout of 0 will disable the watchdog */
>>>> -  return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>>>> +  /* timeout of 0 or >= LONG_MAX/HZ will disable the watchdog */
>>>> +  if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT))
>>>
>>> You should check whether sysctl_hung_task_timeout_sec > 
>>> MAX_SCHEDULE_TIMEOUT/HZ
>>> or not when setting this parameter instead. Then this check ins't necessary 
>>> here.
>>>
>>> # Just FYI, MAX_SCHEDULE_TIMEOUT should be MAX_SCHEDULE_TIMEOUT/HZ here. 
>>>
>>> Thanks,
>>> Satoru
>>
>>  Yes, how about this :
> 
> I confirmed the followings.
> 
>  - 3.14-rc8: system hunged up with "hung_task_timeout_secs > LONG_MAX/HZ".
>  - 3.14-rc8 with your patch: works fine. I can't set the above mentioned 
> value any more.
> 
> Writing possible values (0..LONG_MAX/HZ) in Documentation/sysctl/kernel.txt
> make this patch better.
> 
> Thanks,
> Satoru

Thanks for your attention and suggestion. I reworked the patch as follows.
Is it appropriate to repost it with the tag "PATCH v3"?

Subject: [PATCH v3] hung_task : check the value of 
"sysctl_hung_task_timeout_sec"

As sysctl_hung_task_timeout_sec is unsigned long, when this value is
larger than LONG_MAX/HZ, the function schedule_timeout_interruptible in
watchdog will return immediately without sleeping and print:

[  205.452934] schedule_timeout: wrong timeout value ff83

The function watchdog will then call schedule_timeout_interruptible
again and again, and the screen will be filled with
"schedule_timeout: wrong timeout value ff83".

This patch adds a range check in sysctl, so that the function
schedule_timeout_interruptible always gets a valid parameter.

Signed-off-by: Liu Hua 
Tested-by: Satoru Takeuchi 
---
 Documentation/sysctl/kernel.txt | 1 +
 kernel/sysctl.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index e55124e..855d9b3 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -317,6 +317,7 @@ for more than this value report a warning.
 This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.

 0: means infinite timeout - no checking done.
+Possible values to set are in range {0..LONG_MAX/HZ}.

 ==

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 49e13e1..aae21e8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -144,6 +144,11 @@ static int min_percpu_pagelist_fract = 8;
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;

+/*this is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs */
+#ifdef CONFIG_DETECT_HUNG_TASK
+static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
+#endif
+
 #ifdef CONFIG_INOTIFY_USER
 #include 
 #endif
@@ -995,6 +1000,7 @@ static struct ctl_table kern_table[] = {
.maxlen = sizeof(unsigned long),
.mode   = 0644,
.proc_handler   = proc_dohung_task_timeout_secs,
+   .extra2 = &hung_task_timeout_max,
},
{
.procname   = "hung_task_warnings",
-- 
1.9.0

Thanks,
Liu Hua



Re: [PATCH v2] hung_task : check the value of "sysctl_hung_task_timeout_sec"

2014-03-25 Thread Liu hua
On 2014/3/24 4:50, Satoru Takeuchi wrote:
> At Sun, 23 Mar 2014 15:54:04 +0800,
> Liu Hua wrote:
>>
>> As sysctl_hung_task_timeout_sec is unsigned long, when this value is
>> larger then LONG_MAX/HZ, the function schedule_timeout_interruptible in
>> watchdog will return immediately without sleep and with print :
>>
>> [  205.452934] schedule_timeout: wrong timeout value ff83
>>
>> and then the funtion watchdog will call schedule_timeout_interruptible again
>> and again. The screen will be filled with
>>  "schedule_timeout: wrong timeout value ff83"
>>
>> This patch does some check and correction in timeout_jiffies, to let the
>> function schedule_timeout_interruptible allways get the valid parameter.
>>
>> Cc: 
>> Signed-off-by: Liu Hua 
>> ---
>>  kernel/hung_task.c | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index 6df6149..f992286 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -174,8 +174,12 @@ static void check_hung_uninterruptible_tasks(unsigned 
>> long timeout)
>>  
>>  static unsigned long timeout_jiffies(unsigned long timeout)
>>  {
>> -/* timeout of 0 will disable the watchdog */
>> -return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>> +/* timeout of 0 or >= LONG_MAX/HZ will disable the watchdog */
>> +if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT))
> 
> You should check whether sysctl_hung_task_timeout_sec > 
> MAX_SCHEDULE_TIMEOUT/HZ
> or not when setting this parameter instead. Then this check ins't necessary 
> here.
> 
> # Just FYI, MAX_SCHEDULE_TIMEOUT should be MAX_SCHEDULE_TIMEOUT/HZ here. 
> 
> Thanks,
> Satoru

Yes, how about this:

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 49e13e1..aae21e8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -144,6 +144,11 @@ static int min_percpu_pagelist_fract = 8;
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;

+/*this is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs */
+#ifdef CONFIG_DETECT_HUNG_TASK
+static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
+#endif
+
 #ifdef CONFIG_INOTIFY_USER
 #include 
 #endif
@@ -995,6 +1000,7 @@ static struct ctl_table kern_table[] = {
.maxlen = sizeof(unsigned long),
        .mode   = 0644,
.proc_handler   = proc_dohung_task_timeout_secs,
+   .extra2 = &hung_task_timeout_max,
},
{
.procname   = "hung_task_warnings",
-- 
1.9.0

Thanks
Liu Hua



> 
>> +return MAX_SCHEDULE_TIMEOUT;
>> +
>> +return (timeout * HZ) < MAX_SCHEDULE_TIMEOUT ?
>> +timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>>  }
>>  
>>  /*
>> -- 
>> 1.9.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe stable" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> .
> 




[PATCH RFC] ARM: extend fixmap mapping region to support 30 CPUs

2014-03-24 Thread Liu Hua
On 32-bit ARM systems, the fixmap mapping region can support
no more than 14 CPUs (total: 896K; one CPU: 64K), yet NR_CPUS can
be configured up to 32. So there is a mismatch.

This patch extends the fixmap mapping region downwards to the boundary
of the DMA mapping region (0xffe00000-0xfffe0000). The fixmap
mapping region can then support up to 30 CPUs.

There seems to be no easy way to support 32 CPUs by simply
changing the memory layout on ARM Linux. So I also limit the
maximum number of CPUs one can configure.
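
The relation between the NR_CPUS limit and the region size could also be
expressed as a compile-time check. This is only a sketch, assuming 4K
pages, 16 kmap slots per CPU and the full 32-bit region addresses;
KMAP_BYTES and fixmap_spare_bytes() are hypothetical names and are not
part of this patch:

```c
#define PAGE_SHIFT     12            /* assumption: 4K pages */
#define KM_TYPE_NR     16            /* assumption: 16 kmap slots per CPU */
#define NR_CPUS        30            /* the new upper bound from Kconfig */
#define FIXADDR_START  0xffe00000UL
#define FIXADDR_TOP    0xfffe0000UL
#define FIXADDR_SIZE   (FIXADDR_TOP - FIXADDR_START)

/* Bytes needed so that every CPU gets its own kmap_atomic window. */
#define KMAP_BYTES     (((unsigned long)NR_CPUS * KM_TYPE_NR) << PAGE_SHIFT)

/* Fails the build if NR_CPUS is raised beyond what the region can hold. */
_Static_assert(KMAP_BYTES <= FIXADDR_SIZE,
               "fixmap region too small for NR_CPUS");

/* Unused bytes left in the region; 0 means it is exactly full. */
static unsigned long fixmap_spare_bytes(void)
{
    return FIXADDR_SIZE - KMAP_BYTES;
}
```

With 30 CPUs the region is exactly full (30 * 64K = 1920K), which is why
31 or 32 CPUs cannot fit without moving FIXADDR_START further down.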


Signed-off-by: Liu Hua 
---
 Documentation/arm/memory.txt  | 4 ++--
 arch/arm/Kconfig  | 4 ++--
 arch/arm/include/asm/fixmap.h | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 4bfb9ff..cc31560 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -41,10 +41,10 @@ fffe8000fffeDTCM mapping area for platforms 
with
 fffe   fffe7fffITCM mapping area for platforms with
ITCM mounted inside the CPU.
 
-fff0   fffdFixmap mapping region.  Addresses provided
+ffe0   fffdFixmap mapping region.  Addresses provided
by fix_to_virt() will be located here.
 
-ffc0   ffefDMA memory mapping region.  Memory returned
+ffc0   ffdfDMA memory mapping region.  Memory returned
by the dma_alloc_xxx functions will be
dynamically mapped here.
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e254198..f599040 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1600,8 +1600,8 @@ config PAGE_OFFSET
default 0xC000
 
 config NR_CPUS
-   int "Maximum number of CPUs (2-32)"
-   range 2 32
+   int "Maximum number of CPUs (2-30)"
+   range 2 30
depends on SMP
default "4"
 
diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index bbae919..38c9ffd 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -13,7 +13,7 @@
  * 0xfffe and 0xfffe.
  */
 
-#define FIXADDR_START  0xfff0UL
+#define FIXADDR_START  0xffe0UL
 #define FIXADDR_TOP0xfffeUL
 #define FIXADDR_SIZE   (FIXADDR_TOP - FIXADDR_START)
 
-- 
1.9.0



[PATCH v2] hung_task : check the value of "sysctl_hung_task_timeout_sec"

2014-03-23 Thread Liu Hua
As sysctl_hung_task_timeout_sec is unsigned long, when this value is
larger than LONG_MAX/HZ, the function schedule_timeout_interruptible in
watchdog will return immediately without sleeping and print:

[  205.452934] schedule_timeout: wrong timeout value ff83

The function watchdog will then call schedule_timeout_interruptible again
and again, and the screen will be filled with
"schedule_timeout: wrong timeout value ff83".

This patch adds a check and correction in timeout_jiffies, so that the
function schedule_timeout_interruptible always gets a valid parameter.
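
A user-space sketch of the clamping idea, for illustration only. HZ is
assumed to be 100, and the bound is written as MAX_SCHEDULE_TIMEOUT / HZ
so that the multiplication below can never exceed LONG_MAX; this differs
slightly from the comparison in the diff that follows:

```c
#include <limits.h>

#define HZ                    100       /* assumption: depends on kernel config */
#define MAX_SCHEDULE_TIMEOUT  LONG_MAX

/* Convert seconds to jiffies, clamping anything that would overflow. */
static unsigned long timeout_jiffies(unsigned long timeout)
{
    /* 0 disables the watchdog; too-large values are treated the same way */
    if (timeout == 0 || timeout > MAX_SCHEDULE_TIMEOUT / HZ)
        return MAX_SCHEDULE_TIMEOUT;

    return timeout * HZ;
}
```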

Cc: 
Signed-off-by: Liu Hua 
---
 kernel/hung_task.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 6df6149..f992286 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -174,8 +174,12 @@ static void check_hung_uninterruptible_tasks(unsigned long 
timeout)
 
 static unsigned long timeout_jiffies(unsigned long timeout)
 {
-   /* timeout of 0 will disable the watchdog */
-   return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
+   /* timeout of 0 or >= LONG_MAX/HZ will disable the watchdog */
+   if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT))
+   return MAX_SCHEDULE_TIMEOUT;
+
+   return (timeout * HZ) < MAX_SCHEDULE_TIMEOUT ?
+   timeout * HZ : MAX_SCHEDULE_TIMEOUT;
 }
 
 /*
-- 
1.9.0



Re: [PATCH] ARM: kdump: Avoid overflow when converting pfn to physaddr

2014-03-21 Thread Liu hua
On 2014/3/18 18:48, Russell King - ARM Linux wrote:
> On Tue, Mar 18, 2014 at 06:20:42PM +0800, Liu Hua wrote:
>> When we configure CONFIG_LPAE=y, pfn << PAGE_SHIFT will
>> overflow if pfn >= 0x10 in copy_oldmem_page.
>>
>> So use __pfn_to_phys for converting.
> 
> Yes.  The sad thing is that if you grep the kernel for similar things,
> it's littered with this problem.  I'm not sure whether anyone
> particularly "owns" the crash_dump.c file - Mika Westerberg and
> Olaf Hering were the last two to touch it... I guess put this in my
> patch system please.
> 
> Thanks.
> 

Yes, I found this problem in several places after a quick review. I will
check for it.



[PATCH] ARM: kdump: Avoid overflow when converting pfn to physaddr

2014-03-18 Thread Liu Hua
When we configure CONFIG_LPAE=y, pfn << PAGE_SHIFT will
overflow if pfn >= 0x10 in copy_oldmem_page.

So use __pfn_to_phys for converting.

Signed-off-by: Liu Hua 
---
 arch/arm/kernel/crash_dump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kernel/crash_dump.c b/arch/arm/kernel/crash_dump.c
index 90c50d4..5d1286d 100644
--- a/arch/arm/kernel/crash_dump.c
+++ b/arch/arm/kernel/crash_dump.c
@@ -39,7 +39,7 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
 	if (!csize)
 		return 0;
 
-	vaddr = ioremap(pfn << PAGE_SHIFT, PAGE_SIZE);
+	vaddr = ioremap(__pfn_to_phys(pfn), PAGE_SIZE);
 	if (!vaddr)
 		return -ENOMEM;
 
-- 
1.8.5.5.dirty
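
To illustrate the overflow this patch avoids: on 32-bit ARM, unsigned long
is 32 bits wide, so shifting a pfn left by PAGE_SHIFT drops any bits above
bit 31, whereas widening to a 64-bit type before the shift keeps them. The
sketch below is a user-space approximation, assuming 4 KiB pages;
SKETCH_PAGE_SHIFT and both function names are hypothetical and are not the
kernel's __pfn_to_phys.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed: 4 KiB pages */

/* Shift performed in 32 bits: address bits above bit 31 are lost. */
uint64_t pfn_to_phys_narrow(uint32_t pfn)
{
	uint32_t shifted = pfn << SKETCH_PAGE_SHIFT;

	return shifted;
}

/* Widen first, then shift: the full physical address survives. */
uint64_t pfn_to_phys_wide(uint32_t pfn)
{
	return (uint64_t)pfn << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions, pfn 0x100000 corresponds to physical address
0x100000000 (4 GiB); the narrow version returns 0 for it, while the wide
version returns the full address.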



Re: [ARM]Fixmap mapping region is not enough for system of 14+ CPUs.

2014-03-13 Thread Liu hua
On 2014/3/13 1:08, Nicolas Pitre wrote:
> On Wed, 12 Mar 2014, Liu hua wrote:
> 
>> Hi Russell, Will or Nicolas,
>>
>> (In this mail, we only discuss ARM 32-bit linux.)
>>
>> As we know, the region (0xfff0-0xfffd) is reserved as fixmap
>> mapping region.
>>
>> The function "kmap_atomic" maps highmem pages to this region referring
>> to CPUID and per-cpu variable "__kmap_atomic_idx" via
>>
>>   idx = type + KM_TYPE_NR * smp_processor_id();
>>   vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
>>
>> Size of region used by one cpu is 0x1 (KM_TYPE_NR << PAGE_SHIFT).
>> And the total size of the fixmap mapping region is 0xe.
>> (only support 14 CPUs).
>>
>> So in a system of more than 14 CPUs, this region is not large enough.
>> should we change the memory layout on ARM Linux to support 14+ cpu system ?
>> Or can we do anything else to support that ?
> 
> How many CPUs do you have?
> 
> What about the following patch?  If this doesn't work for you then more 
> intrusive changes will be needed.
> 
> diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
> index 68ea615c2a..254f2df08d 100644
> --- a/arch/arm/include/asm/fixmap.h
> +++ b/arch/arm/include/asm/fixmap.h
> @@ -14,7 +14,17 @@
>   */
>  
>  #define FIXADDR_START0xfff0UL
> +
> +#if !defined(CONFIG_HAVE_TCM) && !defined(CONFIG_CPU_XSCALE)
> +/*
> + * If no TCM nor on on a XScale then enlarge the fixmap area to
> + * accommodate up to 30 CPUs.
> + */
> +#define FIXADDR_END  0xUL
> +#else
>  #define FIXADDR_END  0xfffeUL
One CPU needs a 0x1-size region, so this patch can only support
up to 15 CPUs.
> +#endif
> +
>  #define FIXADDR_TOP  (FIXADDR_END - PAGE_SIZE)
>  
>  enum fixed_addresses {
> 
> 
> Nicolas
> 

We have 16 CPUs in our system, and your patch does not work when
the CPU count exceeds 15.

In addition, we can configure NR_CPUS up to 32. So should we
solve this problem completely?

There is a 1M hole between the DMA mapping region and the fixmap
mapping region (0xffe0-0xfff0). Even if this hole were added
to the fixmap mapping region, it could only support 30 CPUs.

We have already allocated a second-level page table covering
0xffe0 - 0x (2M), and 0x upwards is used for the
vector pages. So if all of the remaining region were used as the fixmap
mapping region (size: 0x1f), up to 31 CPUs could be supported.
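
The CPU counts above follow from dividing the size of the fixmap region by
the per-CPU kmap area. A minimal sketch of that arithmetic is below. The
archive has truncated the hex addresses in this thread, so every constant
here is an assumption chosen to be consistent with the 14, 15, 30 and 31 CPU
figures quoted in the discussion: 4 KiB pages, KM_TYPE_NR = 16, and the
region boundaries passed in by the caller. fixmap_max_cpus is a hypothetical
helper, not a kernel function.

```c
/* Assumed values for illustration only. */
#define SKETCH_PAGE_SHIFT 12UL	/* 4 KiB pages */
#define SKETCH_KM_TYPE_NR 16UL	/* kmap slots per CPU */

/*
 * Number of CPUs whose kmap_atomic slots fit between start and end,
 * where each CPU needs KM_TYPE_NR pages.
 */
unsigned long fixmap_max_cpus(unsigned long start, unsigned long end)
{
	unsigned long per_cpu = SKETCH_KM_TYPE_NR << SKETCH_PAGE_SHIFT;

	return (end - start) / per_cpu;
}
```

Assuming a region of 0xfff00000-0xfffe0000 this gives 14 CPUs; raising the
end to 0xffff0000 gives 15; lowering the start to 0xffe00000 gives 30; and
doing both gives 31.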


The following patch solves our problem (up to 30 CPUs), but it
reduces the DMA region. What do you think about it?
Is it necessary to solve this problem completely (up to 32 CPUs)?


Liu Hua

-
diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h
index 68ea615..8379891 100644
--- a/arch/arm/include/asm/fixmap.h
+++ b/arch/arm/include/asm/fixmap.h
@@ -13,7 +13,7 @@
  * 0xfffe and 0xfffe.
  */

-#define FIXADDR_START  0xfff0UL
+#define FIXADDR_START  0xffe0UL
 #define FIXADDR_END0xfffeUL
 #define FIXADDR_TOP(FIXADDR_END - PAGE_SIZE)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 4afb376..0c40674 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -83,7 +83,7 @@
  */
 #define IOREMAP_MAX_ORDER  24

-#define CONSISTENT_END (0xffe0UL)
+#define CONSISTENT_END   (0xffd0UL)

 #else /* CONFIG_MMU */








[ARM]Fixmap mapping region is not enough for system of 14+ CPUs.

2014-03-12 Thread Liu hua
Hi Russell, Will or Nicolas,

(In this mail, we only discuss ARM 32-bit linux.)

As we know, the region (0xfff0-0xfffd) is reserved as the fixmap
mapping region.

The function "kmap_atomic" maps highmem pages into this region, based on
the CPU ID and the per-cpu variable "__kmap_atomic_idx", via

  idx = type + KM_TYPE_NR * smp_processor_id();
  vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);

The size of the region used by one CPU is 0x1 (KM_TYPE_NR << PAGE_SHIFT),
and the total size of the fixmap mapping region is 0xe, so it supports
only 14 CPUs.

So on a system with more than 14 CPUs, this region is not large enough.
Should we change the memory layout on ARM Linux to support systems with
more than 14 CPUs? Or can we do anything else to support that?


Thanks,

Liu Hua
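
The index formula quoted in this mail can be sketched in user space to show
where the slots run out. This is only an illustration: KM_TYPE_NR = 16 and a
fixmap of 224 page slots (14 CPUs x 16 pages) are assumptions consistent
with the 14-CPU limit described above, and kmap_slot_index and
kmap_slot_fits are hypothetical helpers.

```c
#include <stdbool.h>

/* Assumed values for illustration only. */
#define SKETCH_KM_TYPE_NR 16u	/* kmap slots per CPU */
#define SKETCH_FIX_SLOTS  224u	/* 14 CPUs x 16 pages */

/* Same shape as: idx = type + KM_TYPE_NR * smp_processor_id() */
unsigned int kmap_slot_index(unsigned int type, unsigned int cpu)
{
	return type + SKETCH_KM_TYPE_NR * cpu;
}

/* True when the slot for (type, cpu) lies inside the fixmap region. */
bool kmap_slot_fits(unsigned int type, unsigned int cpu)
{
	return kmap_slot_index(type, cpu) < SKETCH_FIX_SLOTS;
}
```

Under these assumptions the last slot of CPU 13 has index 223 and still
fits, while the first slot of CPU 14 has index 224 and falls outside the
region.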








Re: [PATCH] hung_task : check the value of "sysctl_hung_task_timeout_sec"

2014-03-09 Thread Liu hua
On 2014/3/6 23:35, Paul Gortmaker wrote:
> On 14-03-06 02:19 AM, Liu hua wrote:
>> As sysctl_hung_task_timeout_sec is unsigned long, when this value is
>> larger then LONG_MAX, the function schedule_timeout_interruptible in
>> watchdog will return immediately without sleep :
>>
>> for example (in x86_64 platform):
>>
>> linux# echo 0x > /proc/sys/kernel/hung_task_timeout_secs
>>
>> [   66.798350] schedule_timeout: wrong timeout value ff06
>> [   66.800064] schedule_timeout: wrong timeout value ff06
>> [   66.801774] schedule_timeout: wrong timeout value ff06
>> [   66.803488] schedule_timeout: wrong timeout value ff06
>> [   66.805225] schedule_timeout: wrong timeout value ff06
>>
>> The screen was filled with "schedule_timeout: wrong timeout value
>> ff06" and the system stalled.
>>
>> So I do some check and correction in timeout_jiffies, to let the function
>> schedule_timeout_interruptible allways get the valid parameter.
>>
>> Signed-off-by: Liu Hua 
>> ---
>>  kernel/hung_task.c | 11 ++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index 06bb141..ef96650 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -186,7 +186,16 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>>  static unsigned long timeout_jiffies(unsigned long timeout)
>>  {
>>  /* timeout of 0 will disable the watchdog */
> 
> 
> You are breaking the above functionality/feature by declaring
> zero invalid.
> 
> Paul.
> --
> 
Actually, the patch disables the watchdog whenever the timeout is a value
that schedule_timeout_interruptible treats as illegal, in addition to 0.
Shall I send a new patch that disables the watchdog when the timeout is 0 or
above LONG_MAX, without printing errors?

What do you think?

Liu Hua
>> -	return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>> +	if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT)) {
>> +		pr_err("%s : wrong timeout value %lx\n",
>> +			__func__, timeout);
>> +		pr_err("Timeout value is set to MAX_SCHEDULE_TIMEOUT(%lx) now.\n",
>> +			MAX_SCHEDULE_TIMEOUT);
>> +		return MAX_SCHEDULE_TIMEOUT;
>> +	}
>> +
>> +	return (timeout * HZ) < MAX_SCHEDULE_TIMEOUT ?
>> +		timeout * HZ : MAX_SCHEDULE_TIMEOUT;
>>  }
>>
>>  /*
>>
> 
> .
> 




[PATCH] hung_task : check the value of "sysctl_hung_task_timeout_sec"

2014-03-05 Thread Liu hua
As sysctl_hung_task_timeout_sec is an unsigned long, when its value is
larger than LONG_MAX, the function schedule_timeout_interruptible in the
watchdog returns immediately without sleeping.

For example (on an x86_64 platform):

linux# echo 0x > /proc/sys/kernel/hung_task_timeout_secs

[   66.798350] schedule_timeout: wrong timeout value ff06
[   66.800064] schedule_timeout: wrong timeout value ff06
[   66.801774] schedule_timeout: wrong timeout value ff06
[   66.803488] schedule_timeout: wrong timeout value ff06
[   66.805225] schedule_timeout: wrong timeout value ff06

The screen was filled with "schedule_timeout: wrong timeout value
ff06" and the system stalled.

So I check and correct the value in timeout_jiffies so that
schedule_timeout_interruptible always gets a valid parameter.

Signed-off-by: Liu Hua 
---
 kernel/hung_task.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 06bb141..ef96650 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -186,7 +186,16 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 static unsigned long timeout_jiffies(unsigned long timeout)
 {
 	/* timeout of 0 will disable the watchdog */
-	return timeout ? timeout * HZ : MAX_SCHEDULE_TIMEOUT;
+	if ((timeout == 0) || (timeout > MAX_SCHEDULE_TIMEOUT)) {
+		pr_err("%s : wrong timeout value %lx\n",
+			__func__, timeout);
+		pr_err("Timeout value is set to MAX_SCHEDULE_TIMEOUT(%lx) now.\n",
+			MAX_SCHEDULE_TIMEOUT);
+		return MAX_SCHEDULE_TIMEOUT;
+	}
+
+	return (timeout * HZ) < MAX_SCHEDULE_TIMEOUT ?
+		timeout * HZ : MAX_SCHEDULE_TIMEOUT;
 }

 /*
-- 
1.8.5.5.dirty
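
As background for the "wrong timeout value" messages: schedule_timeout takes
a signed long, so an unsigned long above LONG_MAX is seen as a negative
timeout and rejected. A minimal user-space sketch of that boundary is below;
reads_as_negative is a hypothetical helper used only to illustrate the
check, not a kernel function.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * True when an unsigned long would be negative if reinterpreted as a
 * signed long, i.e. when it is above LONG_MAX.
 */
bool reads_as_negative(unsigned long timeout)
{
	return timeout > (unsigned long)LONG_MAX;
}
```

LONG_MAX itself is still accepted; LONG_MAX + 1 and everything above it,
up to ULONG_MAX, falls on the rejected side.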

