Link up issue on ethernet/realtek/r8169.c

2014-04-03 Thread Jagan Teki
Hi,

I'm using a TP-LINK PCIe NIC on my PCIe Root Complex.

Probing looks fine:
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
PCI: enabling device :00:00.0 (0140 -> 0143)
PCI: enabling device :01:00.0 (0140 -> 0143)
r8169 :01:00.0 eth0: RTL8168e/8111e at 0xf006a000,
c0:4a:00:07:18:d4, XID 0c20 IRQ 89
r8169 :01:00.0 eth0: jumbo features [frames: 9200 bytes, tx
checksumming: ko]

But I have an issue when bringing the link up:

$ ifconfig 192.168.1.10 up
r8169 :01:00.0: Direct firmware load failed with error -2
r8169 :01:00.0: Falling back to user helper
r8169 :01:00.0 eth0: unable to load firmware patch
rtl_nic/rtl8168e-2.fw (-2)
r8169 :01:00.0 eth0: link down

It looks like firmware loading fails in the open call. Can anyone
suggest how to debug this further?

thanks!
-- 
Jagan.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: msync: require either MS_ASYNC or MS_SYNC

2014-04-03 Thread Christoph Hellwig
Guys, I don't really see why you get so worked up about this.  There is
lots and lots of precedent of Linux allowing non-Posix (or non-standard
in general) arguments to system calls.  Even ones that don't have
symbolic names defined for them (the magic 3 open argument for device
files).

Given that we historically allowed the 0 argument to msync we'll have to
keep supporting it to not break existing userspace, and adding warnings
triggered by userspace that the person running the system usually can't
fix for something that is entirely harmless at runtime isn't going to
win you friends either.

A "strictly Posix" environment that catches all this sounds fine to me,
but it's something that should be in the userspace C runtime, not the
kernel.  The kernel has never been about strict Posix implementations,
it sometimes doesn't even make it easy to implement the semantics in
user land, which is a bit unfortunate.


Re: [PATCH] clk: Add driver for Palmas clk32kg and clk32kgaudio clocks

2014-04-03 Thread Peter Ujfalusi
On 04/03/2014 04:49 PM, Nishanth Menon wrote:
> On 04/03/2014 05:52 AM, Peter Ujfalusi wrote:
> [...]
>>  .../devicetree/bindings/clock/clk-palmas.txt   |  35 +++
>>  drivers/clk/Kconfig|   7 +
>>  drivers/clk/Makefile   |   1 +
>>  drivers/clk/clk-palmas.c   | 307 
>> +
>>  include/dt-bindings/mfd/palmas.h   |  18 ++
>>  5 files changed, 368 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/clock/clk-palmas.txt
>>  create mode 100644 drivers/clk/clk-palmas.c
>>  create mode 100644 include/dt-bindings/mfd/palmas.h
> 
> 
> Only complaint i have is based on :
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2a9330010bea5982a5c6593824bc036bf62d67b7

Oh, I see.
I'll wait for comments and resend as a series. Probably it is going to be
better if I rename the documentation file as well to
bindings/clock/ti,palmas-clk.txt
I think having two separate documents for the two clocks is overkill.

> you may want to split the new binding off to a separate patch.
> Otherwise, personally, I think it is an good evolution, thanks for
> doing it.
> 
> 


-- 
Péter


[PATCH V2 10/36] hrtimer: use base->index instead of basenum in switch_hrtimer_base()

2014-04-03 Thread Viresh Kumar
In switch_hrtimer_base() we have a local variable, basenum, which is initialized
to base->index and used in only one place. The code is more readable if we
remove this variable and use base->index directly.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 3b29023..04e4c77 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -202,11 +202,10 @@ switch_hrtimer_base(struct hrtimer *timer, struct hrtimer_clock_base *base,
struct hrtimer_cpu_base *new_cpu_base;
int this_cpu = smp_processor_id();
int cpu = get_nohz_timer_target(pinned);
-   int basenum = base->index;
 
 again:
new_cpu_base = &per_cpu(hrtimer_bases, cpu);
-   new_base = &new_cpu_base->clock_base[basenum];
+   new_base = &new_cpu_base->clock_base[base->index];
 
if (base != new_base) {
/*
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 01/36] hrtimer: replace 'tab' with 'space' after 'comma'

2014-04-03 Thread Viresh Kumar
Currently there is a tab instead of a space after the comma here. Replace it
with a space.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index d55092c..a1120a0 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -128,7 +128,7 @@ static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
base->clock_base[HRTIMER_BASE_TAI].softirq_time =
-   ktime_add(xtim, ktime_set(tai_offset, 0));
+   ktime_add(xtim, ktime_set(tai_offset, 0));
 }
 
 /*
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 02/36] hrtimer: Fix comment mistake over hrtimer_force_reprogram()

2014-04-03 Thread Viresh Kumar
The comment was probably added when there were only two clock bases available
for hrtimers; now there are four.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index a1120a0..3c05140 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -530,8 +530,7 @@ static inline int hrtimer_hres_active(void)
 }
 
 /*
- * Reprogram the event source with checking both queues for the
- * next event
+ * Reprogram the event source with checking all queues for the next event.
  * Called with interrupts disabled and base->lock held
  */
 static void
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 11/36] hrtimer: no need to rewrite '1' to hrtimer_hres_enabled

2014-04-03 Thread Viresh Kumar
The high resolution feature can be enabled or disabled from bootargs with the
string 'highres=' followed by 'on' or 'off'. The default value of
hrtimer_hres_enabled is '1', so when 'on' is passed as a bootarg we don't have
to overwrite the variable with '1'.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 04e4c77..b0fbf12 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -509,9 +509,7 @@ static int __init setup_hrtimer_hres(char *str)
 {
if (!strcmp(str, "off"))
hrtimer_hres_enabled = 0;
-   else if (!strcmp(str, "on"))
-   hrtimer_hres_enabled = 1;
-   else
+   else if (strcmp(str, "on"))
return 0;
return 1;
 }
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 03/36] hrtimer: fix routine names in comments

2014-04-03 Thread Viresh Kumar
As the majority of this code came from the existing kernel/timer.c, a few
comments still had the names of routines from the timers framework. Fix them.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 3c05140..c87e896 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -142,7 +142,7 @@ static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
  * means that all timers which are tied to this base via timer->base are
  * locked, and the base itself is locked too.
  *
- * So __run_timers/migrate_timers can safely modify all timers which could
+ * So __run_hrtimer/migrate_hrtimers can safely modify all timers which could
  * be found on the lists/queues.
  *
  * When the timer's base is locked, and the timer removed from list, it is
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 04/36] hrtimer: remove {} around a single liner 'for' loop in migrate_hrtimers()

2014-04-03 Thread Viresh Kumar
The 'for' loop in migrate_hrtimers() has only one statement in its body and so
doesn't require {} as per the coding guidelines. Remove them.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index c87e896..514b53a 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1712,10 +1712,9 @@ static void migrate_hrtimers(int scpu)
raw_spin_lock(&new_base->lock);
raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
 
-   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
+   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++)
migrate_hrtimer_list(&old_base->clock_base[i],
 &new_base->clock_base[i]);
-   }
 
raw_spin_unlock(&old_base->lock);
raw_spin_unlock(&new_base->lock);
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 12/36] hrtimer: use base->hres_active directly instead of hrtimer_hres_active()

2014-04-03 Thread Viresh Kumar
retrigger_next_event() is defined within #ifdef CONFIG_HIGH_RES_TIMERS and we
already have a pointer to base available. So it makes more sense to simply use
base->hres_active instead of calling hrtimer_hres_active(), which does:

__this_cpu_read(hrtimer_bases.hres_active)

The same reasoning applies to the code in __remove_hrtimer().

There is one more noticeable issue with __remove_hrtimer() without this patch.
We are checking hres_active of *this CPU's* base, whereas it is not guaranteed
at all that __remove_hrtimer() is called on the timer's own CPU. In particular,
when it is called from the migration code, the timer's CPU is already down.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index b0fbf12..ad5b7ba 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -672,7 +672,7 @@ static void retrigger_next_event(void *arg)
 {
struct hrtimer_cpu_base *base = &__get_cpu_var(hrtimer_bases);
 
-   if (!hrtimer_hres_active())
+   if (!base->hres_active)
return;
 
raw_spin_lock(&base->lock);
@@ -897,7 +897,7 @@ static void __remove_hrtimer(struct hrtimer *timer,
if (&timer->node == next_timer) {
 #ifdef CONFIG_HIGH_RES_TIMERS
/* Reprogram the clock event device. if enabled */
-   if (reprogram && hrtimer_hres_active()) {
+   if (reprogram && base->cpu_base->hres_active) {
ktime_t expires;
 
expires = ktime_sub(hrtimer_get_expires(timer),
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 05/36] hrtimer: Coalesce format fragments in printk()

2014-04-03 Thread Viresh Kumar
Breaking format fragments across multiple lines hurts the readability of the
code. Even if a line goes over the 80-column width, it's better to keep them
together.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 514b53a..4843238 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -694,8 +694,8 @@ static int hrtimer_switch_to_hres(void)
 
if (tick_init_highres()) {
local_irq_restore(flags);
-   printk(KERN_WARNING "Could not switch to high resolution "
-   "mode on CPU %d\n", cpu);
+   printk(KERN_WARNING "Could not switch to high resolution mode on CPU %d\n",
+   cpu);
return 0;
}
base->hres_active = 1;
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 07/36] hrtimer: replace sizeof(struct hrtimer) with sizeof(*timer)

2014-04-03 Thread Viresh Kumar
The Linux coding guidelines say:

The preferred form for passing a size of a struct is the following:
p = kmalloc(sizeof(*p), ...);

But __hrtimer_init() wasn't following that. Fix it.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index f6b1968..db580ab 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1166,7 +1166,7 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
struct hrtimer_cpu_base *cpu_base;
int base;
 
-   memset(timer, 0, sizeof(struct hrtimer));
+   memset(timer, 0, sizeof(*timer));
 
cpu_base = &__raw_get_cpu_var(hrtimer_bases);
 
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 15/36] hrtimer: don't emulate notifier call to initialize timer base

2014-04-03 Thread Viresh Kumar
In hrtimers_init() we need to call init_hrtimers_cpu() for the boot CPU. For
this, we currently emulate a call to the hotplug notifier. This was probably
done initially to avoid code redundancy. But this sequence always ends up
calling a single routine, i.e. init_hrtimers_cpu(), so calling that routine
directly would be better. This gets rid of the emulated notifier call, a few
typecasts and the extra steps we take in the notifier callback.

So, this patch calls init_hrtimers_cpu() directly from hrtimers_init().

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 833db9f..5b0cbe7 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1753,8 +1753,7 @@ static struct notifier_block hrtimers_nb = {
 
 void __init hrtimers_init(void)
 {
-   hrtimer_cpu_notify(&hrtimers_nb, (unsigned long)CPU_UP_PREPARE,
- (void *)(long)smp_processor_id());
+   init_hrtimers_cpu(smp_processor_id());
register_cpu_notifier(&hrtimers_nb);
 #ifdef CONFIG_HIGH_RES_TIMERS
open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq);
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 09/36] hrtimer: call hrtimer_set_expires_range() from hrtimer_set_expires_range_ns()

2014-04-03 Thread Viresh Kumar
hrtimer_set_expires_range() and hrtimer_set_expires_range_ns() have almost the
same implementation, so we can call hrtimer_set_expires_range() from
hrtimer_set_expires_range_ns() internally instead of duplicating code.

Signed-off-by: Viresh Kumar 
---
 include/linux/hrtimer.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index e7a8d3f..17c08ca 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -207,8 +207,7 @@ static inline void hrtimer_set_expires_range(struct hrtimer *timer, ktime_t time
 
 static inline void hrtimer_set_expires_range_ns(struct hrtimer *timer, ktime_t time, unsigned long delta)
 {
-   timer->_softexpires = time;
-   timer->node.expires = ktime_add_safe(time, ns_to_ktime(delta));
+   hrtimer_set_expires_range(timer, time, ns_to_ktime(delta));
 }
 
 static inline void hrtimer_set_expires_tv64(struct hrtimer *timer, s64 tv64)
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 08/36] hrtimer: move unlock_hrtimer_base() upwards

2014-04-03 Thread Viresh Kumar
unlock_hrtimer_base() was earlier kept at a separate place because
lock_hrtimer_base() had separate definitions for the SMP and non-SMP cases,
while unlock_hrtimer_base() had only a single definition. So it was probably
meant to sit after the end of the #ifdef/#endif CONFIG_SMP block. But this
#ifdef ends right after lock_hrtimer_base()'s definition (at least in the
current code), so we can move unlock_hrtimer_base() upwards, i.e. closer to its
counterparts. This improves the readability of the code.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index db580ab..3b29023 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -254,6 +254,12 @@ lock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)
 
 #endif /* !CONFIG_SMP */
 
+static inline
+void unlock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)
+{
+   raw_spin_unlock_irqrestore(&timer->base->cpu_base->lock, *flags);
+}
+
 /*
  * Functions for the union type storage format of ktime_t which are
  * too large for inlining:
@@ -805,15 +811,6 @@ static inline void timer_stats_account_hrtimer(struct hrtimer *timer)
 #endif
 }
 
-/*
- * Counterpart to lock_hrtimer_base above:
- */
-static inline
-void unlock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)
-{
-   raw_spin_unlock_irqrestore(&timer->base->cpu_base->lock, *flags);
-}
-
 /**
  * hrtimer_forward - forward the timer expiry
  * @timer: hrtimer to forward
-- 
1.7.12.rc2.18.g61b472e



[PATCH] x86: Try the BIOS reboot method before the PCI reboot method

2014-04-03 Thread Ingo Molnar

* Steven Rostedt  wrote:

> On Thu, 03 Apr 2014 07:10:47 -0700
> "H. Peter Anvin"  wrote:
> 
> > Could you tell which of these modes work on your box:

Basically my thinking is that the patch should be reverted, if my fix 
below does not work.

I distilled your test results into:

  reboot=t   # triple fault  ok
  reboot=k   # keyboard ctrl FAIL
  reboot=b   # BIOS  ok
  reboot=a   # ACPI  FAIL
  reboot=e   # EFI   FAIL   [system has no EFI]
  reboot=p   # PCI 0xcf9 FAIL
 
And I think it's pretty obvious that we should only try 0xcf9 as a 
last resort - if at all. For some reason the 0xcf9 reboot method got 
marked 'safe' - why is that? If only pci_direct_probe() had funny 
extra lines /* like this */ ...

The other observation is that (on this box) we should try the 'BIOS' 
method before the PCI method.

Thirdly, CF9_COND is a total misnomer - it should be something like 
CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ...

[ Plus all the BOOT_ flags are total misnomers as well, why aren't 
  they named REBOOT_ ...? ]

Anyway, the patch below fixes the worst problems:

 - it orders the actual reboot logic to follow the reboot ordering 
   pattern - it was in a pretty random order before for no good 
   reason.

 - it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and 
   BOOT_CF9_SAFE flags to make the code more obvious.

 - it tries the BIOS reboot method before the PCI reboot method.

Only build tested.

Alternatively we could just use the following reboot order:

 * 1) If the FADT has the ACPI reboot register flag set, try it
 * 2) If still alive, write to the keyboard controller
 * 3) If still alive, write to the ACPI reboot register again
 * 4) If still alive, write to the keyboard controller again
 * 5) If still alive, call the EFI runtime service to reboot
 * 6) If still alive, force a triple fault

I.e. eliminate the 'PCI' and 'BIOS' methods from our default sequence, 
as both are documented as being able to hang some boxes.

Thanks,

Ingo

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 654b465..527dbcb 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -114,8 +114,8 @@ EXPORT_SYMBOL(machine_real_restart);
  */
 static int __init set_pci_reboot(const struct dmi_system_id *d)
 {
-   if (reboot_type != BOOT_CF9) {
-   reboot_type = BOOT_CF9;
+   if (reboot_type != BOOT_CF9_FORCE) {
+   reboot_type = BOOT_CF9_FORCE;
pr_info("%s series board detected. Selecting %s-method for reboots.\n",
d->ident, "PCI");
}
@@ -468,10 +468,15 @@ void __attribute__((weak)) mach_reboot_fixups(void)
  * 6) If still alive, write to the PCI IO port 0xCF9 to reboot
  * 7) If still alive, inform BIOS to do a proper reboot
  *
- * If the machine is still alive at this stage, it gives up. We default to
- * following the same pattern, except that if we're still alive after (7) we'll
- * try to force a triple fault and then cycle between hitting the keyboard
- * controller and doing that
+ * If the machine is still alive at this stage, it gives up.
+ *
+ * We default to following the same pattern, except that we try
+ * (7) [BIOS] before (6) [PCI], and we add 8): try to force a
+ * triple fault and then cycle between hitting the keyboard
+ * controller and doing that.
+ *
+ * This means that this function can never return, it can misbehave
+ * by not rebooting properly and hanging.
  */
 static void native_machine_emergency_restart(void)
 {
@@ -492,6 +497,11 @@ static void native_machine_emergency_restart(void)
for (;;) {
/* Could also try the reset bit in the Hammer NB */
switch (reboot_type) {
+   case BOOT_ACPI:
+   acpi_reboot();
+   reboot_type = BOOT_KBD;
+   break;
+
case BOOT_KBD:
mach_reboot_fixups(); /* For board specific fixups */
 
@@ -509,43 +519,29 @@ static void native_machine_emergency_restart(void)
}
break;
 
-   case BOOT_TRIPLE:
-   load_idt(&no_idt);
-   __asm__ __volatile__("int3");
-
-   /* We're probably dead after this, but... */
-   reboot_type = BOOT_KBD;
-   break;
-
-   case BOOT_BIOS:
-   machine_real_restart(MRR_BIOS);
-
-   /* We're probably dead after this, but... */
-   reboot_type = BOOT_TRIPLE;
-   break;
-
-   case BOOT_ACPI:
-   acpi_reboot();
-   reboot_type = BOOT_KBD;
-   break;
-
case BOOT_EFI:

[PATCH V2 18/36] hrtimer: rewrite switch_hrtimer_base() to remove extra indentation level

2014-04-03 Thread Viresh Kumar
The entire bottom part of switch_hrtimer_base() sits inside an 'if' block, so
all the code in that block carries an extra level of indentation. Rewrite it
with an early return to remove this extra indentation level.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 51 ++-
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 58b5e3f..fe13dcf 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -207,33 +207,34 @@ again:
new_cpu_base = &per_cpu(hrtimer_bases, cpu);
new_base = &new_cpu_base->clock_base[base->index];
 
-   if (base != new_base) {
-   /*
-* We are trying to move timer to new_base.
-* However we can't change timer's base while it is running,
-* so we keep it on the same CPU. No hassle vs. reprogramming
-* the event source in the high resolution case. The softirq
-* code will take care of this when the timer function has
-* completed. There is no conflict as we hold the lock until
-* the timer is enqueued.
-*/
-   if (unlikely(hrtimer_callback_running(timer)))
-   return base;
-
-   /* See the comment in lock_timer_base() */
-   timer->base = NULL;
-   raw_spin_unlock(&base->cpu_base->lock);
-   raw_spin_lock(&new_base->cpu_base->lock);
+   if (base == new_base)
+   return base;
 
-   if (cpu != this_cpu && hrtimer_check_target(timer, new_base)) {
-   cpu = this_cpu;
-   raw_spin_unlock(&new_base->cpu_base->lock);
-   raw_spin_lock(&base->cpu_base->lock);
-   timer->base = base;
-   goto again;
-   }
-   timer->base = new_base;
+   /*
+* We are trying to move timer to new_base. However we can't change
+* timer's base while it is running, so we keep it on the same CPU. No
+* hassle vs. reprogramming the event source in the high resolution
+* case. The softirq code will take care of this when the timer function
+* has completed. There is no conflict as we hold the lock until the
+* timer is enqueued.
+*/
+   if (unlikely(hrtimer_callback_running(timer)))
+   return base;
+
+   /* See the comment in lock_timer_base() */
+   timer->base = NULL;
+   raw_spin_unlock(&base->cpu_base->lock);
+   raw_spin_lock(&new_base->cpu_base->lock);
+
+   if (cpu != this_cpu && hrtimer_check_target(timer, new_base)) {
+   cpu = this_cpu;
+   raw_spin_unlock(&new_base->cpu_base->lock);
+   raw_spin_lock(&base->cpu_base->lock);
+   timer->base = base;
+   goto again;
}
+
+   timer->base = new_base;
return new_base;
 }
 
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 14/36] hrtimer: reorder code in __remove_hrtimer()

2014-04-03 Thread Viresh Kumar
This patch reorders code within __remove_hrtimer() to achieve the following:
- There is no need to check if the timer is the next timer to expire when high
  resolution mode isn't configured in the kernel. So, move this check within
  the #ifdef/#endif block.
- Validate 'reprogram' and hrtimer_hres_active() first, as without these we
  don't need to check if 'timer' is the next one to fire.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 476ad5d..833db9f 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -891,19 +891,18 @@ static void __remove_hrtimer(struct hrtimer *timer,
 
next_timer = timerqueue_getnext(&base->active);
timerqueue_del(&base->active, &timer->node);
-   if (&timer->node == next_timer) {
+
 #ifdef CONFIG_HIGH_RES_TIMERS
-   /* Reprogram the clock event device. if enabled */
-   if (reprogram && base->cpu_base->hres_active) {
-   ktime_t expires;
-
-   expires = ktime_sub(hrtimer_get_expires(timer),
-   base->offset);
-   if (base->cpu_base->expires_next.tv64 == expires.tv64)
-   hrtimer_force_reprogram(base->cpu_base, 1);
-   }
-#endif
+   /* Reprogram the clock event device. if enabled */
+   if (reprogram && base->cpu_base->hres_active &&
+   &timer->node == next_timer) {
+   ktime_t expires;
+
+   expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
+   if (base->cpu_base->expires_next.tv64 == expires.tv64)
+   hrtimer_force_reprogram(base->cpu_base, 1);
}
+#endif
if (!timerqueue_getnext(&base->active))
base->cpu_base->active_bases &= ~(1 << base->index);
 out:
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 16/36] hrtimer: Create hrtimer_get_monoexpires()

2014-04-03 Thread Viresh Kumar
The following code is repeated in many places:
ktime_sub(hrtimer_get_expires(timer), base->offset);

and so it makes sense to create a separate inlined routine for this.

Signed-off-by: Viresh Kumar 
---
 include/linux/hrtimer.h |  6 ++
 kernel/hrtimer.c| 11 +--
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 17c08ca..d1836cb 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -233,6 +233,12 @@ static inline ktime_t hrtimer_get_expires(const struct hrtimer *timer)
return timer->node.expires;
 }
 
+static inline ktime_t hrtimer_get_monoexpires(const struct hrtimer *timer,
+   struct hrtimer_clock_base *base)
+{
+   return ktime_sub(hrtimer_get_expires(timer), base->offset);
+}
+
 static inline ktime_t hrtimer_get_softexpires(const struct hrtimer *timer)
 {
return timer->_softexpires;
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 5b0cbe7..1a1fdc0 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -184,7 +184,7 @@ hrtimer_check_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base)
if (!new_base->cpu_base->hres_active)
return 0;
 
-   expires = ktime_sub(hrtimer_get_expires(timer), new_base->offset);
+   expires = hrtimer_get_monoexpires(timer, new_base);
return expires.tv64 <= new_base->cpu_base->expires_next.tv64;
 #else
return 0;
@@ -554,7 +554,7 @@ hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
continue;
timer = container_of(next, struct hrtimer, node);
 
-   expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
+   expires = hrtimer_get_monoexpires(timer, base);
/*
 * clock_was_set() has changed base->offset so the
 * result might be negative. Fix it up to prevent a
@@ -588,7 +588,7 @@ static int hrtimer_reprogram(struct hrtimer *timer,
 struct hrtimer_clock_base *base)
 {
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
-   ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
+   ktime_t expires = hrtimer_get_monoexpires(timer, base);
int res;
 
WARN_ON_ONCE(hrtimer_get_expires_tv64(timer) < 0);
@@ -898,7 +898,7 @@ static void __remove_hrtimer(struct hrtimer *timer,
&timer->node == next_timer) {
ktime_t expires;
 
-   expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
+   expires = hrtimer_get_monoexpires(timer, base);
if (base->cpu_base->expires_next.tv64 == expires.tv64)
hrtimer_force_reprogram(base->cpu_base, 1);
}
@@ -1309,8 +1309,7 @@ retry:
if (basenow.tv64 < hrtimer_get_softexpires_tv64(timer)) {
ktime_t expires;
 
-   expires = ktime_sub(hrtimer_get_expires(timer),
-   base->offset);
+   expires = hrtimer_get_monoexpires(timer, base);
if (expires.tv64 < 0)
expires.tv64 = KTIME_MAX;
if (expires.tv64 < expires_next.tv64)
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 17/36] hrtimer: don't check if timer is queued in __remove_hrtimer()

2014-04-03 Thread Viresh Kumar
__remove_hrtimer() is called from three locations: remove_hrtimer(),
__run_hrtimer() and migrate_hrtimer_list(), and all of them guarantee that the
timer was queued earlier. So there is no need to check whether the timer is
queued in __remove_hrtimer().

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 1a1fdc0..58b5e3f 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -886,8 +886,6 @@ static void __remove_hrtimer(struct hrtimer *timer,
 unsigned long newstate, int reprogram)
 {
struct timerqueue_node *next_timer;
-   if (!(timer->state & HRTIMER_STATE_ENQUEUED))
-   goto out;
 
next_timer = timerqueue_getnext(&base->active);
timerqueue_del(&base->active, &timer->node);
@@ -903,10 +901,9 @@ static void __remove_hrtimer(struct hrtimer *timer,
hrtimer_force_reprogram(base->cpu_base, 1);
}
 #endif
+   timer->state = newstate;
if (!timerqueue_getnext(&base->active))
base->cpu_base->active_bases &= ~(1 << base->index);
-out:
-   timer->state = newstate;
 }
 
 /*
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 20/36] hrtimer: replace base by new_base to get resolution: __hrtimer_start_range_ns()

2014-04-03 Thread Viresh Kumar
This code was added long ago by the following commit:

commit 06027bdd278a32a84b273e41db68a5db8ffd2bb6
Author: Ingo Molnar 
Date:   Tue Feb 14 13:53:15 2006 -0800

[PATCH] hrtimer: round up relative start time on low-res arches

I don't know whether this was a mistake or intentional, but we should probably
use new_base instead of base here to get the resolution. Things have likely
been working smoothly because the resolution is the same for both bases in most
cases.

The commit log of the above commit also says: "This will go away with the GTOD
framework". So, should we remove this code altogether?

Cc: Ingo Molnar 
Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 2ac423d..458b952 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -964,7 +964,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 * timeouts. This will go away with the GTOD framework.
 */
 #ifdef CONFIG_TIME_LOW_RES
-   tim = ktime_add_safe(tim, base->resolution);
+   tim = ktime_add_safe(tim, new_base->resolution);
 #endif
}
 
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 23/36] hrtimer: create for_each_active_base()

2014-04-03 Thread Viresh Kumar
There are many places where we need to iterate over all the currently active
clock bases of a particular cpu_base. Create for_each_active_base() to simplify
the code at those places.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 379d21a..ceadfa5 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -113,6 +113,19 @@ static inline bool base_on_this_cpu(struct hrtimer_clock_base *base)
 }
 
 /*
+ * for_each_active_base: iterate over all active clock bases
+ * @_index: 'int' variable for internal purpose
+ * @_base: holds pointer to a active clock base
+ * @_cpu_base: cpu base to iterate on
+ * @_active_bases: 'unsigned int' variable for internal purpose
+ */
+#define for_each_active_base(_index, _base, _cpu_base, _active_bases)  \
+   for ((_active_bases) = (_cpu_base)->active_bases;   \
+   (_index) = ffs(_active_bases),  \
+   (_base) = (_cpu_base)->clock_base + (_index) - 1, (_index); \
+   (_active_bases) &= ~(1 << ((_index) - 1)))
+
+/*
  * Get the coarse grained time at the softirq based on xtime and
  * wall_to_monotonic.
  */
-- 
1.7.12.rc2.18.g61b472e
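
For readers who want to try the iteration pattern outside the kernel, here is a
minimal userspace sketch of it. It assumes POSIX ffs() from <strings.h>; the
toy_* structures and visit_active_bases() are hypothetical stand-ins, not
kernel code. Unlike the macro in the patch, the sketch sets the base pointer to
NULL when no bit is left, so that it never forms a pointer before the start of
the array.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>	/* ffs() */

#define TOY_MAX_CLOCK_BASES 4	/* stand-in for HRTIMER_MAX_CLOCK_BASES */

struct toy_clock_base {
	int visits;
};

struct toy_cpu_base {
	unsigned int active_bases;	/* bit n set => clock_base[n] has timers */
	struct toy_clock_base clock_base[TOY_MAX_CLOCK_BASES];
};

/*
 * ffs() returns the 1-based position of the lowest set bit, or 0 when no
 * bit is set, so _index doubles as the loop condition. Each pass clears
 * the bit that was just handled from the private copy in _active_bases.
 */
#define for_each_active_base(_index, _base, _cpu_base, _active_bases)	\
	for ((_active_bases) = (_cpu_base)->active_bases;		\
	     (_index) = ffs(_active_bases),				\
	     (_base) = (_index) ?					\
		&(_cpu_base)->clock_base[(_index) - 1] : NULL,		\
	     (_index);							\
	     (_active_bases) &= ~(1U << ((_index) - 1)))

/* Visit every active base once and return how many were visited. */
static int visit_active_bases(struct toy_cpu_base *cpu_base)
{
	struct toy_clock_base *base;
	unsigned int active_bases;
	int index, visited = 0;

	for_each_active_base(index, base, cpu_base, active_bases) {
		/* index is 1-based here, unlike the old 0-based loops */
		assert(base == &cpu_base->clock_base[index - 1]);
		base->visits++;
		visited++;
	}
	return visited;
}
```

Note that the index variable is 1-based inside the loop body, which matters for
any caller that still uses it to index clock_base[] directly.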



[PATCH V2 24/36] hrtimer: Use for_each_active_base() to iterate over active clock bases

2014-04-03 Thread Viresh Kumar
There are various places where we currently run a loop of
HRTIMER_MAX_CLOCK_BASES iterations and simply 'continue;' if no timers are
queued on a clock base. Instead, we can use the new for_each_active_base()
macro to iterate over only the bases which are currently active.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 58 ++--
 1 file changed, 23 insertions(+), 35 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ceadfa5..b3ab19a 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -557,21 +557,17 @@ static inline int hrtimer_hres_active(void)
 static void
 hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
 {
-   int i;
-   struct hrtimer_clock_base *base = cpu_base->clock_base;
+   struct hrtimer_clock_base *base;
+   struct hrtimer *timer;
ktime_t expires, expires_next;
+   unsigned int active_bases;
+   int i;
 
expires_next.tv64 = KTIME_MAX;
 
-   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
-   struct hrtimer *timer;
-   struct timerqueue_node *next;
-
-   next = timerqueue_getnext(&base->active);
-   if (!next)
-   continue;
-   timer = container_of(next, struct hrtimer, node);
-
+   for_each_active_base(i, base, cpu_base, active_bases) {
+   timer = container_of(timerqueue_getnext(&base->active),
+struct hrtimer, node);
expires = hrtimer_get_monoexpires(timer, base);
/*
 * clock_was_set() has changed base->offset so the
@@ -1131,23 +1127,19 @@ EXPORT_SYMBOL_GPL(hrtimer_get_remaining);
 ktime_t hrtimer_get_next_event(void)
 {
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
-   struct hrtimer_clock_base *base = cpu_base->clock_base;
+   struct hrtimer_clock_base *base;
ktime_t delta, mindelta = { .tv64 = KTIME_MAX };
+   struct hrtimer *timer;
+   unsigned int active_bases;
unsigned long flags;
int i;
 
raw_spin_lock_irqsave(&cpu_base->lock, flags);
 
if (!hrtimer_hres_active()) {
-   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
-   struct hrtimer *timer;
-   struct timerqueue_node *next;
-
-   next = timerqueue_getnext(&base->active);
-   if (!next)
-   continue;
-
-   timer = container_of(next, struct hrtimer, node);
+   for_each_active_base(i, base, cpu_base, active_bases) {
+   timer = container_of(timerqueue_getnext(&base->active),
+struct hrtimer, node);
delta.tv64 = hrtimer_get_expires_tv64(timer);
delta = ktime_sub(delta, base->get_time());
if (delta.tv64 < mindelta.tv64)
@@ -1270,7 +1262,9 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 void hrtimer_interrupt(struct clock_event_device *dev)
 {
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+   struct hrtimer_clock_base *base;
ktime_t expires_next, now, entry_time, delta;
+   unsigned int active_bases;
int i, retries = 0;
 
BUG_ON(!cpu_base->hres_active);
@@ -1290,15 +1284,10 @@ retry:
 */
cpu_base->expires_next.tv64 = KTIME_MAX;
 
-   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-   struct hrtimer_clock_base *base;
+   for_each_active_base(i, base, cpu_base, active_bases) {
struct timerqueue_node *node;
ktime_t basenow;
 
-   if (!(cpu_base->active_bases & (1 << i)))
-   continue;
-
-   base = cpu_base->clock_base + i;
basenow = ktime_add(now, base->offset);
 
while ((node = timerqueue_getnext(&base->active))) {
@@ -1468,16 +1457,13 @@ void hrtimer_run_queues(void)
struct timerqueue_node *node;
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
struct hrtimer_clock_base *base;
+   unsigned int active_bases;
int index, gettime = 1;
 
if (hrtimer_hres_active())
return;
 
-   for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
-   base = &cpu_base->clock_base[index];
-   if (!timerqueue_getnext(&base->active))
-   continue;
-
+   for_each_active_base(index, base, cpu_base, active_bases) {
if (gettime) {
hrtimer_get_softirq_time(cpu_base);
gettime = 0;
@@ -1697,6 +1683,8 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 static void migrate_hrtimers(int scpu)
 {
struct hrt

[PATCH V2 26/36] hrtimer: take lock only once for a cpu_base in hrtimer_run_queues()

2014-04-03 Thread Viresh Kumar
We take cpu_base->lock once for every clock base in hrtimer_run_queues(), and
nothing in there prevents us from taking this lock only once. Modify the code
to take the lock only once per cpu_base.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 2d9a7e2..c712960 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1466,8 +1466,8 @@ void hrtimer_run_queues(void)
if (cpu_base->active_bases)
hrtimer_get_softirq_time(cpu_base);
 
+   raw_spin_lock(&cpu_base->lock);
for_each_active_base(index, base, cpu_base, active_bases) {
-   raw_spin_lock(&cpu_base->lock);
while ((node = timerqueue_getnext(&base->active))) {
struct hrtimer *timer;
 
@@ -1478,8 +1478,8 @@ void hrtimer_run_queues(void)
 
__run_hrtimer(timer, &base->softirq_time);
}
-   raw_spin_unlock(&cpu_base->lock);
}
+   raw_spin_unlock(&cpu_base->lock);
 }
 
 /*
-- 
1.7.12.rc2.18.g61b472e
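
The effect of hoisting the lock out of the loop can be seen in a small
userspace sketch. Everything below is hypothetical: the toy lock only counts
acquisitions and is not a spinlock, and "running" a timer is reduced to
decrementing a counter. The sketch also ignores whatever the real callback path
does with the lock while a timer function runs.

```c
#include <assert.h>

#define NR_BASES 4	/* hypothetical stand-in for HRTIMER_MAX_CLOCK_BASES */

/* Toy lock that only counts acquisitions; not a real spinlock. */
struct toy_lock {
	int held;
	int acquisitions;
};

static void toy_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->acquisitions++;
}

static void toy_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct toy_cpu_base {
	struct toy_lock lock;
	unsigned int active_bases;	/* bit n set => base n has timers */
	int pending[NR_BASES];		/* expired timers per base */
};

/* Before: lock and unlock around every active base. */
static int run_queues_lock_per_base(struct toy_cpu_base *cb)
{
	int i, ran = 0;

	for (i = 0; i < NR_BASES; i++) {
		if (!(cb->active_bases & (1U << i)))
			continue;
		toy_acquire(&cb->lock);
		while (cb->pending[i] > 0) {
			cb->pending[i]--;
			ran++;
		}
		toy_release(&cb->lock);
	}
	return ran;
}

/* After: one lock/unlock pair around the whole loop. */
static int run_queues_lock_once(struct toy_cpu_base *cb)
{
	int i, ran = 0;

	toy_acquire(&cb->lock);
	for (i = 0; i < NR_BASES; i++) {
		if (!(cb->active_bases & (1U << i)))
			continue;
		while (cb->pending[i] > 0) {
			cb->pending[i]--;
			ran++;
		}
	}
	toy_release(&cb->lock);
	return ran;
}
```

Both variants run the same timers; only the number of lock round trips differs.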



[PATCH V2 28/36] hrtimer: remove 'base' parameter from remove_hrtimer() and __remove_hrtimer()

2014-04-03 Thread Viresh Kumar
The clock 'base' can easily be obtained from timer->base, and remove_hrtimer()/
__remove_hrtimer() never get anything other than timer->base as this parameter.
So these routines don't require it. Remove it.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 81e0251..a404436 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -895,10 +895,10 @@ static int enqueue_hrtimer(struct hrtimer *timer,
  * reprogram to zero. This is useful, when the context does a reprogramming
  * anyway (e.g. timer interrupt)
  */
-static void __remove_hrtimer(struct hrtimer *timer,
-struct hrtimer_clock_base *base,
-unsigned long newstate, int reprogram)
+static void __remove_hrtimer(struct hrtimer *timer, unsigned long newstate,
+int reprogram)
 {
+   struct hrtimer_clock_base *base = timer->base;
struct timerqueue_node *next_timer;
 
next_timer = timerqueue_getnext(&base->active);
@@ -921,11 +921,8 @@ static void __remove_hrtimer(struct hrtimer *timer,
timer->state = newstate;
 }
 
-/*
- * remove hrtimer, called with base lock held
- */
-static inline int
-remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
+/* remove hrtimer, called with base lock held */
+static inline int remove_hrtimer(struct hrtimer *timer)
 {
unsigned long state;
 
@@ -947,7 +944,7 @@ remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
 * move the timer base in switch_hrtimer_base.
 */
state = timer->state & HRTIMER_STATE_CALLBACK;
-   __remove_hrtimer(timer, base, state, base_on_this_cpu(base));
+   __remove_hrtimer(timer, state, base_on_this_cpu(timer->base));
return 1;
 }
 
@@ -962,7 +959,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
base = lock_hrtimer_base(timer, &flags);
 
/* Remove an active timer from the queue: */
-   ret = remove_hrtimer(timer, base);
+   ret = remove_hrtimer(timer);
 
if (mode & HRTIMER_MODE_REL) {
tim = ktime_add_safe(tim, base->get_time());
@@ -1064,14 +1061,13 @@ EXPORT_SYMBOL_GPL(hrtimer_start);
  */
 int hrtimer_try_to_cancel(struct hrtimer *timer)
 {
-   struct hrtimer_clock_base *base;
unsigned long flags;
int ret = -1;
 
-   base = lock_hrtimer_base(timer, &flags);
+   lock_hrtimer_base(timer, &flags);
 
if (!hrtimer_callback_running(timer))
-   ret = remove_hrtimer(timer, base);
+   ret = remove_hrtimer(timer);
 
unlock_hrtimer_base(timer, &flags);
 
@@ -1223,7 +1219,7 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
WARN_ON(!irqs_disabled());
 
debug_deactivate(timer);
-   __remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
+   __remove_hrtimer(timer, HRTIMER_STATE_CALLBACK, 0);
timer_stats_account_hrtimer(timer);
fn = timer->function;
 
@@ -1660,7 +1656,7 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 * timer could be seen as !active and just vanish away
 * under us on another CPU
 */
-   __remove_hrtimer(timer, old_base, HRTIMER_STATE_MIGRATE, 0);
+   __remove_hrtimer(timer, HRTIMER_STATE_MIGRATE, 0);
timer->base = new_base;
/*
 * Enqueue the timers on the new cpu. This does not
-- 
1.7.12.rc2.18.g61b472e
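
The shape of this cleanup, reduced to a userspace sketch: when every caller
passes timer->base, the helper can read it itself. The toy_* types and
functions are hypothetical and only mirror the pattern, not the kernel
structures.

```c
#include <assert.h>

struct toy_base {
	int nr_queued;
};

struct toy_timer {
	struct toy_base *base;	/* base the timer is currently queued on */
	int queued;
};

/* Before: 'base' is redundant, every caller passes timer->base. */
static void toy_remove_old(struct toy_timer *timer, struct toy_base *base)
{
	assert(base == timer->base);	/* the invariant the patch relies on */
	base->nr_queued--;
	timer->queued = 0;
}

/* After: derive the base from the timer itself. */
static void toy_remove_new(struct toy_timer *timer)
{
	struct toy_base *base = timer->base;

	base->nr_queued--;
	timer->queued = 0;
}

/*
 * Migration must remove the timer while timer->base still points at the
 * old base, and switch the pointer only afterwards.
 */
static void toy_migrate(struct toy_timer *timer, struct toy_base *new_base)
{
	toy_remove_new(timer);
	timer->base = new_base;
	new_base->nr_queued++;
	timer->queued = 1;
}
```

The ordering in toy_migrate() matches migrate_hrtimer_list() in the diff above,
where __remove_hrtimer() runs before timer->base is set to new_base.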



[PATCH V2 29/36] hrtimer: remove 'base' parameter from switch_hrtimer_base()

2014-04-03 Thread Viresh Kumar
The clock 'base' can easily be obtained from timer->base, and
switch_hrtimer_base() never gets anything other than timer->base as this
parameter. So this routine doesn't require it. Remove it.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index a404436..c35dc36 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -212,10 +212,9 @@ hrtimer_check_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base)
  * Switch the timer base to the current CPU when possible.
  */
 static inline struct hrtimer_clock_base *
-switch_hrtimer_base(struct hrtimer *timer, struct hrtimer_clock_base *base,
-   int pinned)
+switch_hrtimer_base(struct hrtimer *timer, int pinned)
 {
-   struct hrtimer_clock_base *new_base;
+   struct hrtimer_clock_base *new_base, *base = timer->base;
struct hrtimer_cpu_base *new_cpu_base;
int this_cpu = smp_processor_id();
int cpu = get_nohz_timer_target(pinned);
@@ -267,7 +266,7 @@ lock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)
return base;
 }
 
-# define switch_hrtimer_base(t, b, p)  (b)
+# define switch_hrtimer_base(t, p) (t->base)
 
 #endif /* !CONFIG_SMP */
 
@@ -978,7 +977,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
hrtimer_set_expires_range_ns(timer, tim, delta_ns);
 
/* Switch the timer base, if necessary: */
-   new_base = switch_hrtimer_base(timer, base, mode & HRTIMER_MODE_PINNED);
+   new_base = switch_hrtimer_base(timer, mode & HRTIMER_MODE_PINNED);
 
timer_stats_hrtimer_set_start_info(timer);
 
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 34/36] hrtimer: make enqueue_hrtimer() return void

2014-04-03 Thread Viresh Kumar
enqueue_hrtimer() is called from three places and only one of them actually
uses its return value. Also, going by its name, enqueue_hrtimer() isn't
supposed to report whether the queued timer is the leftmost. So it makes more
sense to split this routine into two parts: one enqueues a timer and the other
tells whether the timer is the leftmost.

Signed-off-by: Viresh Kumar 
---
 include/linux/hrtimer.h |  5 +
 kernel/hrtimer.c| 10 --
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index d1836cb..435ac4c 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -263,6 +263,11 @@ static inline ktime_t hrtimer_expires_remaining(const struct hrtimer *timer)
return ktime_sub(timer->node.expires, timer->base->get_time());
 }
 
+static inline int hrtimer_is_leftmost(struct hrtimer *timer)
+{
+   return &timer->node == timer->base->active.next;
+}
+
 #ifdef CONFIG_HIGH_RES_TIMERS
 struct clock_event_device;
 
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ea620e5..d62fe32 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -856,7 +856,7 @@ EXPORT_SYMBOL_GPL(hrtimer_forward);
  *
  * Returns 1 when the new timer is the leftmost timer in the tree.
  */
-static int enqueue_hrtimer(struct hrtimer *timer)
+static void enqueue_hrtimer(struct hrtimer *timer)
 {
struct hrtimer_clock_base *base = timer->base;
 
@@ -870,8 +870,6 @@ static int enqueue_hrtimer(struct hrtimer *timer)
 * state of a possibly running callback.
 */
timer->state |= HRTIMER_STATE_ENQUEUED;
-
-   return (&timer->node == base->active.next);
 }
 
 /*
@@ -943,7 +941,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 {
struct hrtimer_clock_base *base;
unsigned long flags;
-   int ret, leftmost;
+   int ret;
 
lock_hrtimer_base(timer, &flags);
base = timer->base;
@@ -972,7 +970,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 
timer_stats_hrtimer_set_start_info(timer);
 
-   leftmost = enqueue_hrtimer(timer);
+   enqueue_hrtimer(timer);
 
/*
 * Only allow reprogramming if the new base is on this CPU.
@@ -980,7 +978,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 *
 * XXX send_remote_softirq() ?
 */
-   if (leftmost && base_on_this_cpu(timer->base)
+   if (hrtimer_is_leftmost(timer) && base_on_this_cpu(timer->base)
&& hrtimer_enqueue_reprogram(timer)) {
if (wakeup) {
/*
-- 
1.7.12.rc2.18.g61b472e
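
The split between "enqueue" and "am I leftmost?" can be illustrated with a toy
queue. This is only a sketch under simplifying assumptions: a sorted singly
linked list stands in for the timerqueue, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_queue;

struct toy_timer {
	long expires;
	struct toy_timer *next;
	struct toy_queue *queue;	/* plays the role of timer->base */
};

struct toy_queue {
	struct toy_timer *head;		/* earliest expiry, i.e. "leftmost" */
};

/* Enqueue only: insert in expiry order and report nothing. */
static void toy_enqueue(struct toy_timer *t)
{
	struct toy_timer **pos;

	assert(t->queue != NULL);
	pos = &t->queue->head;
	/* timers with equal expiry stay in insertion order */
	while (*pos && (*pos)->expires <= t->expires)
		pos = &(*pos)->next;
	t->next = *pos;
	*pos = t;
}

/* Separate query, for the one caller that actually needs the answer. */
static bool toy_is_leftmost(const struct toy_timer *t)
{
	return t->queue->head == t;
}
```

A caller that needs to decide about reprogramming asks toy_is_leftmost() after
toy_enqueue(); callers that don't care simply skip the query.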



[PATCH V2 32/36] hrtimer: make switch_hrtimer_base() return void

2014-04-03 Thread Viresh Kumar
switch_hrtimer_base() always sets timer->base to the right base and so the
caller can obtain it easily. So, this routine doesn't need to return anything.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index fcbabcf..e581bba 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -211,8 +211,7 @@ hrtimer_check_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base)
 /*
  * Switch the timer base to the current CPU when possible.
  */
-static inline struct hrtimer_clock_base *
-switch_hrtimer_base(struct hrtimer *timer, int pinned)
+static inline void switch_hrtimer_base(struct hrtimer *timer, int pinned)
 {
struct hrtimer_clock_base *new_base, *base = timer->base;
struct hrtimer_cpu_base *new_cpu_base;
@@ -224,7 +223,7 @@ again:
new_base = &new_cpu_base->clock_base[base->index];
 
if (base == new_base)
-   return base;
+   return;
 
/*
 * We are trying to move timer to new_base. However we can't change
@@ -235,7 +234,7 @@ again:
 * timer is enqueued.
 */
if (unlikely(hrtimer_callback_running(timer)))
-   return base;
+   return;
 
/* See the comment in lock_timer_base() */
timer->base = NULL;
@@ -251,7 +250,6 @@ again:
}
 
timer->base = new_base;
-   return new_base;
 }
 
 #else /* CONFIG_SMP */
@@ -266,7 +264,7 @@ lock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)
return base;
 }
 
-# define switch_hrtimer_base(t, p) (t->base)
+static inline void switch_hrtimer_base(struct hrtimer *timer, int pinned) {}
 
 #endif /* !CONFIG_SMP */
 
@@ -949,7 +947,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
unsigned long delta_ns, const enum hrtimer_mode mode,
int wakeup)
 {
-   struct hrtimer_clock_base *base, *new_base;
+   struct hrtimer_clock_base *base;
unsigned long flags;
int ret, leftmost;
 
@@ -975,7 +973,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
hrtimer_set_expires_range_ns(timer, tim, delta_ns);
 
/* Switch the timer base, if necessary: */
-   new_base = switch_hrtimer_base(timer, mode & HRTIMER_MODE_PINNED);
+   switch_hrtimer_base(timer, mode & HRTIMER_MODE_PINNED);
 
timer_stats_hrtimer_set_start_info(timer);
 
@@ -987,7 +985,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 *
 * XXX send_remote_softirq() ?
 */
-   if (leftmost && base_on_this_cpu(new_base)
+   if (leftmost && base_on_this_cpu(timer->base)
&& hrtimer_enqueue_reprogram(timer)) {
if (wakeup) {
/*
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 30/36] hrtimer: remove 'base' parameter from enqueue_hrtimer()

2014-04-03 Thread Viresh Kumar
The clock 'base' can easily be obtained from timer->base, and enqueue_hrtimer()
never gets anything other than timer->base as this parameter. So this routine
doesn't require it. Remove it.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index c35dc36..abbf155 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -867,9 +867,10 @@ EXPORT_SYMBOL_GPL(hrtimer_forward);
  *
  * Returns 1 when the new timer is the leftmost timer in the tree.
  */
-static int enqueue_hrtimer(struct hrtimer *timer,
-  struct hrtimer_clock_base *base)
+static int enqueue_hrtimer(struct hrtimer *timer)
 {
+   struct hrtimer_clock_base *base = timer->base;
+
debug_activate(timer);
 
timerqueue_add(&base->active, &timer->node);
@@ -981,7 +982,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 
timer_stats_hrtimer_set_start_info(timer);
 
-   leftmost = enqueue_hrtimer(timer, new_base);
+   leftmost = enqueue_hrtimer(timer);
 
/*
 * Only allow reprogramming if the new base is on this CPU.
@@ -1210,8 +1211,7 @@ EXPORT_SYMBOL_GPL(hrtimer_get_res);
 
 static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 {
-   struct hrtimer_clock_base *base = timer->base;
-   struct hrtimer_cpu_base *cpu_base = base->cpu_base;
+   struct hrtimer_cpu_base *cpu_base = timer->base->cpu_base;
enum hrtimer_restart (*fn)(struct hrtimer *);
int restart;
 
@@ -1240,7 +1240,7 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
 */
if (restart != HRTIMER_NORESTART) {
BUG_ON(timer->state != HRTIMER_STATE_CALLBACK);
-   enqueue_hrtimer(timer, base);
+   enqueue_hrtimer(timer);
}
 
WARN_ON_ONCE(!(timer->state & HRTIMER_STATE_CALLBACK));
@@ -1665,7 +1665,7 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 * sort out already expired timers and reprogram the
 * event device.
 */
-   enqueue_hrtimer(timer, new_base);
+   enqueue_hrtimer(timer);
 
/* Clear the migration state bit */
timer->state &= ~HRTIMER_STATE_MIGRATE;
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 36/36] timer: don't emulate notifier call to initialize timer base

2014-04-03 Thread Viresh Kumar
In init_timers() we need to call init_timers_cpu() for the boot CPU. For this,
we currently emulate a call to the hotplug notifier. This was probably done
initially to avoid code redundancy. But this sequence always ends up calling a
single routine, init_timers_cpu(), so calling that routine directly is better.
It gets rid of the emulated notifier call, a few typecasts and the extra steps
done in the notifier callback.

So, this patch calls init_timers_cpu() directly from init_timers().

Signed-off-by: Viresh Kumar 
---
 kernel/timer.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 4360edc..d13eb56 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1666,15 +1666,9 @@ static struct notifier_block timers_nb = {
 
 void __init init_timers(void)
 {
-   int err;
-
/* ensure there are enough low bits for flags in timer->base pointer */
BUILD_BUG_ON(__alignof__(struct tvec_base) & TIMER_FLAG_MASK);
-
-   err = timer_cpu_notify(&timers_nb, (unsigned long)CPU_UP_PREPARE,
-  (void *)(long)smp_processor_id());
-   BUG_ON(err != NOTIFY_OK);
-
+   BUG_ON(init_timers_cpu(smp_processor_id()));
init_timer_stats();
register_cpu_notifier(&timers_nb);
open_softirq(TIMER_SOFTIRQ, run_timer_softirq);
-- 
1.7.12.rc2.18.g61b472e



[PATCH] topology: Fix compilation warning when not in SMP

2014-04-03 Thread Vincent Stehlé
The topology_##name() macro does not use its argument when CONFIG_SMP is not
set, as it ultimately calls the cpu_data() macro.

So stop keeping a possibly unused `cpu' variable, which avoids the following
compilation warnings:

  drivers/base/topology.c: In function ‘show_physical_package_id’:
  drivers/base/topology.c:103:118: warning: unused variable ‘cpu’ [-Wunused-variable]
   define_id_show_func(physical_package_id);

  drivers/base/topology.c: In function ‘show_core_id’:
  drivers/base/topology.c:106:106: warning: unused variable ‘cpu’ [-Wunused-variable]
   define_id_show_func(core_id);

This can be seen with e.g. x86 defconfig and CONFIG_SMP not set.

Signed-off-by: Vincent Stehlé 
Cc: Greg Kroah-Hartman 
Cc:  # 3.10.x
Cc:  # 3.13.x
Cc:  # 3.14.x
---
 drivers/base/topology.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index ad9d177..c928576 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -39,8 +39,7 @@
 static ssize_t show_##name(struct device *dev, \
struct device_attribute *attr, char *buf)   \
 {  \
-   unsigned int cpu = dev->id; \
-   return sprintf(buf, "%d\n", topology_##name(cpu));  \
+   return sprintf(buf, "%d\n", topology_##name(dev->id));  \
 }
 
 #if defined(topology_thread_cpumask) || defined(topology_core_cpumask) || \
-- 
1.9.1
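
To see why the variable ends up unused, consider a userspace sketch in which a
uniprocessor-style macro drops its argument during preprocessing. All toy_*
names are hypothetical stand-ins and only mimic the shape described in the
commit message; they are not the kernel definitions.

```c
#include <assert.h>
#include <stdio.h>

struct toy_cpuinfo {
	int core_id;
};

static struct toy_cpuinfo toy_boot_cpu_data = { .core_id = 0 };

/* UP-style accessor: the 'cpu' argument disappears when the macro expands. */
#define toy_cpu_data(cpu)		toy_boot_cpu_data
#define toy_topology_core_id(cpu)	(toy_cpu_data(cpu).core_id)

struct toy_device {
	unsigned int id;
};

/*
 * Had this function first done "unsigned int cpu = dev->id;" and passed
 * 'cpu' to the macro, nothing would reference 'cpu' after preprocessing
 * and the compiler could warn about an unused variable. Passing dev->id
 * directly leaves no local variable behind.
 */
static int toy_show_core_id(const struct toy_device *dev, char *buf, size_t len)
{
	assert(len > 0);
	return snprintf(buf, len, "%d\n", toy_topology_core_id(dev->id));
}
```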



[PATCH V2 35/36] timer: simplify CPU_UP_PREPARE notifier code path

2014-04-03 Thread Viresh Kumar
Currently the CPU_UP_PREPARE notifier returns notifier_from_errno() only when
init_timers_cpu() reports an error. notifier_from_errno() already checks for
the no-error case internally, so we can call it directly without checking
whether there was an error.

Reviewed-by: Srivatsa S. Bhat 
Signed-off-by: Viresh Kumar 
---
 kernel/timer.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 1d35dda..4360edc 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1646,9 +1646,7 @@ static int timer_cpu_notify(struct notifier_block *self,
case CPU_UP_PREPARE:
case CPU_UP_PREPARE_FROZEN:
err = init_timers_cpu(cpu);
-   if (err < 0)
-   return notifier_from_errno(err);
-   break;
+   return notifier_from_errno(err);
 #ifdef CONFIG_HOTPLUG_CPU
case CPU_DEAD:
case CPU_DEAD_FROZEN:
-- 
1.7.12.rc2.18.g61b472e
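
The equivalence of the two forms can be checked in userspace. The sketch below
assumes that notifier_from_errno(), notifier_to_errno(), NOTIFY_OK and
NOTIFY_STOP_MASK in include/linux/notifier.h are defined as reproduced here,
and that the old 'break' path ends up returning NOTIFY_OK; the up_prepare_*()
wrappers are hypothetical and take the init_timers_cpu() result as a parameter.

```c
#include <assert.h>
#include <errno.h>

/* Assumed to match include/linux/notifier.h; copied for a userspace test. */
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000

static int notifier_from_errno(int err)
{
	if (err)
		return NOTIFY_STOP_MASK | (NOTIFY_OK - err);

	return NOTIFY_OK;
}

static int notifier_to_errno(int ret)
{
	ret &= ~NOTIFY_STOP_MASK;
	return ret > NOTIFY_OK ? NOTIFY_OK - ret : 0;
}

/* Before: convert only on error, otherwise fall through to NOTIFY_OK. */
static int up_prepare_old(int err)
{
	if (err < 0)
		return notifier_from_errno(err);
	return NOTIFY_OK;
}

/* After: always convert; a zero error maps to NOTIFY_OK anyway. */
static int up_prepare_new(int err)
{
	return notifier_from_errno(err);
}

/* Both forms must agree for success (0) and for negative errnos. */
static void check_equivalent(int err)
{
	assert(err <= 0);
	assert(up_prepare_old(err) == up_prepare_new(err));
}
```

The two forms only agree as long as init_timers_cpu() returns 0 or a negative
errno; a positive return value would be treated differently by them.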



[PATCH V2 31/36] hrtimer: remove 'base' parameter from hrtimer_{enqueue_}reprogram()

2014-04-03 Thread Viresh Kumar
The clock 'base' can easily be obtained from timer->base, and
hrtimer_reprogram() and hrtimer_enqueue_reprogram() never get anything other
than timer->base as this parameter. So these routines don't require it. Remove
it.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index abbf155..fcbabcf 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -597,11 +597,10 @@ hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
  *
  * Called with interrupts disabled and base->cpu_base.lock held
  */
-static int hrtimer_reprogram(struct hrtimer *timer,
-struct hrtimer_clock_base *base)
+static int hrtimer_reprogram(struct hrtimer *timer)
 {
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
-   ktime_t expires = hrtimer_get_monoexpires(timer, base);
+   ktime_t expires = hrtimer_get_monoexpires(timer, timer->base);
int res;
 
WARN_ON_ONCE(hrtimer_get_expires_tv64(timer) < 0);
@@ -661,10 +660,9 @@ static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
  * check happens. The timer gets enqueued into the rbtree. The reprogramming
  * and expiry check is done in the hrtimer_interrupt or in the softirq.
  */
-static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
-   struct hrtimer_clock_base *base)
+static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer)
 {
-   return base->cpu_base->hres_active && hrtimer_reprogram(timer, base);
+   return timer->base->cpu_base->hres_active && hrtimer_reprogram(timer);
 }
 
 static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
@@ -743,8 +741,7 @@ void clock_was_set_delayed(void)
 static inline int hrtimer_hres_active(void) { return 0; }
 static inline int hrtimer_is_hres_enabled(void) { return 0; }
 static inline int hrtimer_switch_to_hres(void) { return 0; }
-static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
-   struct hrtimer_clock_base *base)
+static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer)
 {
return 0;
 }
@@ -991,7 +988,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 * XXX send_remote_softirq() ?
 */
if (leftmost && base_on_this_cpu(new_base)
-   && hrtimer_enqueue_reprogram(timer, new_base)) {
+   && hrtimer_enqueue_reprogram(timer)) {
if (wakeup) {
/*
 * We need to drop cpu_base->lock to avoid a
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 33/36] hrtimer: make lock_hrtimer_base() return void

2014-04-03 Thread Viresh Kumar
lock_hrtimer_base() always returns with the lock held, so timer->base can't
change afterwards. Callers can therefore simply read timer->base to get the
base, and we can make this routine return void.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index e581bba..ea620e5 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -166,9 +166,7 @@ static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
  * possible to set timer->base = NULL and drop the lock: the timer remains
  * locked.
  */
-static
-struct hrtimer_clock_base *lock_hrtimer_base(const struct hrtimer *timer,
-unsigned long *flags)
+static void lock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)
 {
struct hrtimer_clock_base *base;
 
@@ -177,7 +175,7 @@ struct hrtimer_clock_base *lock_hrtimer_base(const struct hrtimer *timer,
if (likely(base != NULL)) {
raw_spin_lock_irqsave(&base->cpu_base->lock, *flags);
if (likely(base == timer->base))
-   return base;
+   return;
/* The timer has migrated to another CPU: */
raw_spin_unlock_irqrestore(&base->cpu_base->lock, *flags);
}
@@ -254,14 +252,10 @@ again:
 
 #else /* CONFIG_SMP */
 
-static inline struct hrtimer_clock_base *
+static inline void
 lock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)
 {
-   struct hrtimer_clock_base *base = timer->base;
-
-   raw_spin_lock_irqsave(&base->cpu_base->lock, *flags);
-
-   return base;
+   raw_spin_lock_irqsave(&timer->base->cpu_base->lock, *flags);
 }
 
 static inline void switch_hrtimer_base(struct hrtimer *timer, int pinned) {}
@@ -951,7 +945,8 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
unsigned long flags;
int ret, leftmost;
 
-   base = lock_hrtimer_base(timer, &flags);
+   lock_hrtimer_base(timer, &flags);
+   base = timer->base;
 
/* Remove an active timer from the queue: */
ret = remove_hrtimer(timer);
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 21/36] hrtimer: create base_on_this_cpu()

2014-04-03 Thread Viresh Kumar
We had this code in two places to find out whether a clock base belongs to the
current CPU:
base->cpu_base == &__get_cpu_var(hrtimer_bases)

Better to have an inlined routine for that.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 458b952..2d5bb9d 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -107,6 +107,10 @@ static inline int hrtimer_clockid_to_base(clockid_t clock_id)
return hrtimer_clock_to_base_table[clock_id];
 }
 
+static inline bool base_on_this_cpu(struct hrtimer_clock_base *base)
+{
+   return base->cpu_base == &__get_cpu_var(hrtimer_bases);
+}
 
 /*
  * Get the coarse grained time at the softirq based on xtime and
@@ -933,8 +937,7 @@ remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
 * move the timer base in switch_hrtimer_base.
 */
state = timer->state & HRTIMER_STATE_CALLBACK;
-   __remove_hrtimer(timer, base, state,
-base->cpu_base == &__get_cpu_var(hrtimer_bases));
+   __remove_hrtimer(timer, base, state, base_on_this_cpu(base));
return 1;
 }
 
@@ -980,7 +983,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 *
 * XXX send_remote_softirq() ?
 */
-   if (leftmost && new_base->cpu_base == &__get_cpu_var(hrtimer_bases)
+   if (leftmost && base_on_this_cpu(new_base)
&& hrtimer_enqueue_reprogram(timer, new_base)) {
if (wakeup) {
/*
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 19/36] hrtimer: rewrite remove_hrtimer() to remove extra indentation level

2014-04-03 Thread Viresh Kumar
The entire bottom part of remove_hrtimer() sits inside an 'if' block, so all
code in that block carries an extra indentation level. Rewrite it to remove
this extra indentation level.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 46 ++
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index fe13dcf..2ac423d 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -913,31 +913,29 @@ static void __remove_hrtimer(struct hrtimer *timer,
 static inline int
 remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
 {
-   if (hrtimer_is_queued(timer)) {
-   unsigned long state;
-   int reprogram;
+   unsigned long state;
 
-   /*
-* Remove the timer and force reprogramming when high
-* resolution mode is active and the timer is on the current
-* CPU. If we remove a timer on another CPU, reprogramming is
-* skipped. The interrupt event on this CPU is fired and
-* reprogramming happens in the interrupt handler. This is a
-* rare case and less expensive than a smp call.
-*/
-   debug_deactivate(timer);
-   timer_stats_hrtimer_clear_start_info(timer);
-   reprogram = base->cpu_base == &__get_cpu_var(hrtimer_bases);
-   /*
-* We must preserve the CALLBACK state flag here,
-* otherwise we could move the timer base in
-* switch_hrtimer_base.
-*/
-   state = timer->state & HRTIMER_STATE_CALLBACK;
-   __remove_hrtimer(timer, base, state, reprogram);
-   return 1;
-   }
-   return 0;
+   if (!hrtimer_is_queued(timer))
+   return 0;
+
+   /*
+* Remove the timer and force reprogramming when high resolution mode is
+* active and the timer is on the current CPU. If we remove a timer on
+* another CPU, reprogramming is skipped. The interrupt event on this
+* CPU is fired and reprogramming happens in the interrupt handler. This
+* is a rare case and less expensive than a smp call.
+*/
+   debug_deactivate(timer);
+   timer_stats_hrtimer_clear_start_info(timer);
+
+   /*
+* We must preserve the CALLBACK state flag here, otherwise we could
+* move the timer base in switch_hrtimer_base.
+*/
+   state = timer->state & HRTIMER_STATE_CALLBACK;
+   __remove_hrtimer(timer, base, state,
+base->cpu_base == &__get_cpu_var(hrtimer_bases));
+   return 1;
 }
 
 int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 22/36] hrtimer: clear active_bases as soon as the timer is removed

2014-04-03 Thread Viresh Kumar
Timers are removed from the red-black trees in __remove_hrtimer(). After
removing the timer from the tree, it calls hrtimer_force_reprogram(), which
might use the value of cpu_base->active_bases. If the timer being removed is
the last one on that clock base, cpu_base->active_bases doesn't hold the right
value: no timers are queued on the base, but active_bases still marks it as
active.

So, clear the entry from active_bases as soon as the timer is removed.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 2d5bb9d..379d21a 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -895,6 +895,9 @@ static void __remove_hrtimer(struct hrtimer *timer,
next_timer = timerqueue_getnext(&base->active);
timerqueue_del(&base->active, &timer->node);
 
+   if (!timerqueue_getnext(&base->active))
+   base->cpu_base->active_bases &= ~(1 << base->index);
+
 #ifdef CONFIG_HIGH_RES_TIMERS
/* Reprogram the clock event device. if enabled */
if (reprogram && base->cpu_base->hres_active &&
@@ -907,8 +910,6 @@ static void __remove_hrtimer(struct hrtimer *timer,
}
 #endif
timer->state = newstate;
-   if (!timerqueue_getnext(&base->active))
-   base->cpu_base->active_bases &= ~(1 << base->index);
 }
 
 /*
-- 
1.7.12.rc2.18.g61b472e
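
The ordering issue described in this patch can be sketched in plain C. This is
only a simplified userspace model, not kernel code: struct cpu_base_model,
model_enqueue() and model_remove() are hypothetical names, and a per-base
counter stands in for the timerqueue. The point it illustrates is that the
bit is cleared before anything that reads active_bases runs.

```c
#define NR_BASES 4	/* assumed number of clock bases for this sketch */

/* Simplified model: a per-base timer count plus the active_bases bitmask. */
struct cpu_base_model {
	unsigned int active_bases;
	unsigned int queued[NR_BASES];
};

/* Queue one timer on the given base and mark the base active. */
static void model_enqueue(struct cpu_base_model *cb, unsigned int index)
{
	cb->queued[index]++;
	cb->active_bases |= 1U << index;
}

/*
 * Remove one timer. The bit is cleared right after the removal, i.e. before
 * the (modelled) reprogram step, so that step never sees an empty base that
 * is still marked active. Returns the mask the reprogram step would observe.
 */
static unsigned int model_remove(struct cpu_base_model *cb, unsigned int index)
{
	if (cb->queued[index] == 0)
		return cb->active_bases;

	cb->queued[index]--;
	if (cb->queued[index] == 0)
		cb->active_bases &= ~(1U << index);

	return cb->active_bases;
}
```

In this model the value returned by model_remove() is what a later consumer
of active_bases would read once the last timer on a base is gone.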



[PATCH V2 27/36] hrtimer: call switch_hrtimer_base() after setting new expiry time

2014-04-03 Thread Viresh Kumar
In switch_hrtimer_base() we are calling hrtimer_check_target() which guarantees
this:

/*
 * With HIGHRES=y we do not migrate the timer when it is expiring
 * before the next event on the target cpu because we cannot reprogram
 * the target cpu hardware and we would cause it to fire late.
 *
 * Called with cpu_base->lock of target cpu held.
 */

But switch_hrtimer_base() is only called from one place, i.e.
__hrtimer_start_range_ns(), and at the point where we call it the expiration
time is not yet known, because this routine is only called later:

hrtimer_set_expires_range_ns()

To fix this, set the new expiry time with hrtimer_set_expires_range_ns()
before calling switch_hrtimer_base().

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index c712960..81e0251 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -964,11 +964,8 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
/* Remove an active timer from the queue: */
ret = remove_hrtimer(timer, base);
 
-   /* Switch the timer base, if necessary: */
-   new_base = switch_hrtimer_base(timer, base, mode & HRTIMER_MODE_PINNED);
-
if (mode & HRTIMER_MODE_REL) {
-   tim = ktime_add_safe(tim, new_base->get_time());
+   tim = ktime_add_safe(tim, base->get_time());
/*
 * CONFIG_TIME_LOW_RES is a temporary way for architectures
 * to signal that they simply return xtime in
@@ -977,12 +974,15 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 * timeouts. This will go away with the GTOD framework.
 */
 #ifdef CONFIG_TIME_LOW_RES
-   tim = ktime_add_safe(tim, new_base->resolution);
+   tim = ktime_add_safe(tim, base->resolution);
 #endif
}
 
hrtimer_set_expires_range_ns(timer, tim, delta_ns);
 
+   /* Switch the timer base, if necessary: */
+   new_base = switch_hrtimer_base(timer, base, mode & HRTIMER_MODE_PINNED);
+
timer_stats_hrtimer_set_start_info(timer);
 
leftmost = enqueue_hrtimer(timer, new_base);
@@ -1000,7 +1000,7 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 * We need to drop cpu_base->lock to avoid a
 * lock ordering issue vs. rq->lock.
 */
-   raw_spin_unlock(&new_base->cpu_base->lock);
+   raw_spin_unlock(&timer->base->cpu_base->lock);
raise_softirq_irqoff(HRTIMER_SOFTIRQ);
local_irq_restore(flags);
return ret;
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 25/36] hrtimer: call hrtimer_get_softirq_time() only if cpu_base->active_bases is set

2014-04-03 Thread Viresh Kumar
We need to call hrtimer_get_softirq_time() only once for a cpu_base from
hrtimer_run_queues(). And it shouldn't be called if there are no timers queued
for that cpu_base.

Currently we are managing this with the help of a variable: gettime. This part
of the code can be simplified by using cpu_base->active_bases instead. With this
we can get rid of the 'if' block from the loop iterating over all clock bases.

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index b3ab19a..2d9a7e2 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1458,19 +1458,16 @@ void hrtimer_run_queues(void)
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
struct hrtimer_clock_base *base;
unsigned int active_bases;
-   int index, gettime = 1;
+   int index;
 
if (hrtimer_hres_active())
return;
 
-   for_each_active_base(index, base, cpu_base, active_bases) {
-   if (gettime) {
-   hrtimer_get_softirq_time(cpu_base);
-   gettime = 0;
-   }
+   if (cpu_base->active_bases)
+   hrtimer_get_softirq_time(cpu_base);
 
+   for_each_active_base(index, base, cpu_base, active_bases) {
raw_spin_lock(&cpu_base->lock);
-
while ((node = timerqueue_getnext(&base->active))) {
struct hrtimer *timer;
 
-- 
1.7.12.rc2.18.g61b472e
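
The control-flow change in this patch can be modelled in a few lines of C.
This is a hedged sketch under stated assumptions: struct run_stats and
run_queues_model() are hypothetical, the loop is a plain scan over bit
positions rather than the for_each_active_base() macro from this series, and
counters stand in for the real calls.

```c
#define SKETCH_MAX_BASES 4	/* assumed number of clock bases */

/* Counters used only to make the control flow observable in this sketch. */
struct run_stats {
	int time_reads;		/* stands in for hrtimer_get_softirq_time() calls */
	int bases_visited;	/* clock bases whose queues would be scanned */
};

static struct run_stats run_queues_model(unsigned int active_bases)
{
	struct run_stats st = { 0, 0 };
	unsigned int index;

	/* Read the time once, and only if some base has timers queued. */
	if (active_bases)
		st.time_reads++;

	/* Visit only bases whose bit is set; no per-iteration 'gettime' check. */
	for (index = 0; index < SKETCH_MAX_BASES; index++) {
		if (!(active_bases & (1U << index)))
			continue;
		st.bases_visited++;
	}

	return st;
}
```

With an empty mask the model reads the time zero times; with any bits set it
reads it exactly once, regardless of how many bases are active.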



[PATCH V2 13/36] hrtimer: don't check state of base->hres_active in hrtimer_switch_to_hres()

2014-04-03 Thread Viresh Kumar
Caller of hrtimer_switch_to_hres(), i.e. hrtimer_run_pending(), has already
verified this by calling hrtimer_hres_active() and so we don't need to do it
again in hrtimer_switch_to_hres().

Signed-off-by: Viresh Kumar 
---
 kernel/hrtimer.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ad5b7ba..476ad5d 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -690,9 +690,6 @@ static int hrtimer_switch_to_hres(void)
struct hrtimer_cpu_base *base = &per_cpu(hrtimer_bases, cpu);
unsigned long flags;
 
-   if (base->hres_active)
-   return 1;
-
local_irq_save(flags);
 
if (tick_init_highres()) {
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 00/36] timers/hrtimers: Cleanups & Fixes

2014-04-03 Thread Viresh Kumar
Hi Thomas,

I know you are not going to look at these before the end of this merge window,
and that you wanted to have a look at V1 before I posted these. But I am
reposting them now for these reasons:
- I need to resend my cpu isolation (cpuset.quiesce) patches, which are based
  on these
- A few patches were dropped/merged/fixed/updated, so not all the patches from
  V1 would have made sense
- There were some new patches as well which I wanted to send

These have gone through a fair bit of testing via the kbuild system maintained
by Fengguang Wu.


These are some minor cleanups and potential bug fixes. They are based on
tip/timers-core-for-linus.

V1 of most of these patches (~28) were posted here:
https://lkml.org/lkml/2014/3/26/107
https://lkml.org/lkml/2014/3/28/148

V1->V2:
- few new patches:
  - patches around for_each_active_base()
  - hrtimer: call switch_hrtimer_base() after setting new expiry time
  - Some other minor cleanups
- few patches are dropped
- few are merged together as they covered same stuff
- rebased all patches and moved the patches removing parameters or return values
  at the bottom, so that others can be applied easily. Though as per my last
  mail, it doesn't look like they are making the 'text' segments any bigger.

Viresh Kumar (36):
  hrtimer: replace 'tab' with 'space' after 'comma'
  hrtimer: Fix comment mistake over hrtimer_force_reprogram()
  hrtimer: fix routine names in comments
  hrtimer: remove {} around a single liner 'for' loop in
migrate_hrtimers()
  hrtimer: Coalesce format fragments in printk()
  hrtimer: remove dummy definition of hrtimer_force_reprogram()
  hrtimer: replace sizeof(struct hrtimer) with sizeof(*timer)
  hrtimer: move unlock_hrtimer_base() upwards
  hrtimer: call hrtimer_set_expires_range() from
hrtimer_set_expires_range_ns()
  hrtimer: use base->index instead of basenum in switch_hrtimer_base()
  hrtimer: no need to rewrite '1' to hrtimer_hres_enabled
  hrtimer: use base->hres_active directly instead of
hrtimer_hres_active()
  hrtimer: don't check state of base->hres_active in
hrtimer_switch_to_hres()
  hrtimer: reorder code in __remove_hrtimer()
  hrtimer: don't emulate notifier call to initialize timer base
  hrtimer: Create hrtimer_get_monoexpires()
  hrtimer: don't check if timer is queued in __remove_hrtimer()
  hrtimer: rewrite switch_hrtimer_base() to remove extra indentation
level
  hrtimer: rewrite remove_hrtimer() to remove extra indentation level
  hrtimer: replace base by new_base to get resolution:
__hrtimer_start_range_ns()
  hrtimer: create base_on_this_cpu()
  hrtimer: clear active_bases as soon as the timer is removed
  hrtimer: create for_each_active_base()
  hrtimer: Use for_each_active_base() to iterate over active clock
bases
  hrtimer: call hrtimer_get_softirq_time() only if
cpu_base->active_bases is set
  hrtimer: take lock only once for a cpu_base in hrtimer_run_queues()
  hrtimer: call switch_hrtimer_base() after setting new expiry time
  hrtimer: remove 'base' parameter from remove_timer() and
__remove_timer()
  hrtimer: remove 'base' parameter from switch_hrtimer_base()
  hrtimer: remove 'base' parameter from enqueue_hrtimer()
  hrtimer: remove 'base' parameter from hrtimer_{enqueue_}reprogram()
  hrtimer: make switch_hrtimer_base() return void
  hrtimer: make lock_hrtimer_base() return void
  hrtimer: make enqueue_hrtimer() return void
  timer: simplify CPU_UP_PREPARE notifier code path
  timer: don't emulate notifier call to initialize timer base

 include/linux/hrtimer.h |  14 +-
 kernel/hrtimer.c| 365 ++--
 kernel/timer.c  |  12 +-
 3 files changed, 179 insertions(+), 212 deletions(-)

-- 
1.7.12.rc2.18.g61b472e



Re: [patch] irqchip/irq-crossbar: not allocating enough memory

2014-04-03 Thread Sricharan R
On Thursday 03 April 2014 12:51 PM, Dan Carpenter wrote:
> We are allocating the size of a pointer and not the size of the data.
> This will lead to memory corruption.
>
> There isn't actually a "cb_device" struct, btw.  The code is only able
> to compile because GCC knows that all pointers are the same size.
>
> Fixes: 96ca848ef7ea ('DRIVERS: IRQCHIP: CROSSBAR: Add support for Crossbar IP')
> Signed-off-by: Dan Carpenter 
>
> diff --git a/drivers/irqchip/irq-crossbar.c b/drivers/irqchip/irq-crossbar.c
> index fc817d2..3d15d16 100644
> --- a/drivers/irqchip/irq-crossbar.c
> +++ b/drivers/irqchip/irq-crossbar.c
> @@ -107,7 +107,7 @@ static int __init crossbar_of_init(struct device_node *node)
>   int i, size, max, reserved = 0, entry;
>   const __be32 *irqsr;
>  
> - cb = kzalloc(sizeof(struct cb_device *), GFP_KERNEL);
> + cb = kzalloc(sizeof(*cb), GFP_KERNEL);
>  
>   if (!cb)
>   return -ENOMEM;
Yes. correct. Thanks for the catch.

Acked-by: Sricharan R 


Regards,
 Sricharan
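
For readers following the fix, here is a minimal userspace sketch of the
pitfall. It rests on assumptions: struct crossbar_like is a hypothetical
stand-in (not the driver's real layout), and calloc() stands in for kzalloc().
It shows why sizeof(*cb) is the safer idiom: it follows the type of the
object being allocated, while sizeof(struct ... *) is only the size of a
pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private data; not the real layout. */
struct crossbar_like {
	unsigned int int_max;
	unsigned int *irq_map;
	void *crossbar_base;
	unsigned int *register_offsets;
};

/* Buggy pattern: sizes the allocation by a pointer type, not the object. */
static size_t buggy_alloc_size(void)
{
	return sizeof(struct crossbar_like *);
}

/* Fixed pattern: sizeof(*cb) always tracks the pointed-to type. */
static struct crossbar_like *alloc_crossbar(void)
{
	struct crossbar_like *cb = calloc(1, sizeof(*cb));

	return cb;	/* NULL on allocation failure, as with kzalloc() */
}
```

Any write to a member beyond the first pointer-sized bytes of a buffer sized
by buggy_alloc_size() would land outside the allocation, which is the memory
corruption the patch description refers to.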


Re: [RFC PATCHC 0/3] sched/idle : find the idlest cpu with cpuidle info

2014-04-03 Thread Len Brown
Hi Daniel,

Interesting idea.

The benefit of this patch is to reduce power.
Have you been able to measure a power reduction, via power meter, or
via built-in RAPL power meter?
(turbostat will show RAPL watts, or if you have constant quantity of
work, use turbostat -J)

thanks,
-Len


[PATCH V2 0/2] FAULT_AROUND_ORDER patchset performance data for powerpc

2014-04-03 Thread Madhavan Srinivasan
Kirill A. Shutemov's faultaround patchset introduced
vm_ops->map_pages() for mapping easily accessible pages around
the fault address, in the hope of reducing the number of minor page faults.

This patchset creates infrastructure to move FAULT_AROUND_ORDER
to arch/ using Kconfig. This will enable architecture maintainers
to decide on a suitable FAULT_AROUND_ORDER value based on
performance data for that architecture. The first patch also adds a
FAULT_AROUND_ORDER Kconfig element for x86. The second patch lists
the performance numbers for powerpc (platform pseries) and
adds a FAULT_AROUND_ORDER Kconfig element for powerpc.

V2 Changes:
  Created Kconfig parameter for FAULT_AROUND_ORDER
  Added check in do_read_fault to handle FAULT_AROUND_ORDER value of 0
  Made changes in commit messages.

Madhavan Srinivasan (2):
  mm: move FAULT_AROUND_ORDER to arch/
  mm: add FAULT_AROUND_ORDER Kconfig paramater for powerpc

 arch/powerpc/platforms/pseries/Kconfig |5 +
 arch/x86/Kconfig   |4 
 include/linux/mm.h |9 +
 mm/memory.c|   12 +---
 4 files changed, 23 insertions(+), 7 deletions(-)

-- 
1.7.10.4



Re: ktap and ebpf integration

2014-04-03 Thread Alexei Starovoitov
On Thu, Apr 3, 2014 at 6:21 PM, Jovi Zhangwei  wrote:
> Hi Alexei,
>
> We talked a lot on ktap and ebpf integration in these days,
> Now I think we can put into deeply to thinking out some
> technical issues in there.
>
> Firstly, I want to make sure you are support this ktap and
> ebpf integration direction, I aware you have ongoing 'bpf filter'
> patch set work, which actually overlapping with ktap integration
> efforts (IMO the interface should be unified and simple for user,
>  so I think filter debugfs file is not a good interface), so please let
> me know your answer about this.

I think the more choices users have the better.
I'll continue with C based filters and you can continue with ktap
syntax. That's ok. We can share all kernel pieces.
Like:
1.
user: C -> llvm -> obj_file
kernel: obj_file -> ibpf_verifier -> ibpf execution engine
2.
user: ktap language -> ktap_compiler -> obj_file
kernel: obj_file -> ibpf_verifier -> ibpf execution engine

> If the answer is yes, then we can go through ebpf core
> improvement, for example:

In the architecture I'm proposing there are three main pieces:
- user facing language and userspace compiler into ibpf
  instruction set stored into object file format like ELF
  or something simpler
- in kernel loader of that object file, license and instruction verifier
- ibpf execution engine

ibpf execution engine can do all requested features already.
It's a matter of loader and verifier to accept them.
For example:

> - support global variable access

From the execution engine's point of view, a global or stack variable
makes no difference. It's a 'ld rY, word ptr [rX]' instruction,
where register rX is pointing to the stack or to some memory location.
In my old patch set 'verifier' was proving correctness of stack
and table accesses only, since I didn't see the need for global
pointers yet, but we can add it.

>   this is mandatory for dynamic tracing, otherwise, there have
>   no possible to run a simple script like get function execution
>   time.

I don't understand the correlation between measuring function
execution time and global variables.
I think userspace should be measuring script execution time.
Time sampling within kernel can be done from ibpf program
by calling ktime_get().

> - support timer in kernel
>   The final solution must need to support kernel timer for profiling,
>   and sampling stack.

we can let programs be executed in kernel by timer events, but
I think it's a userspace task.
If userspace can do it without hurting performance, it probably
should do it.

For example to do systemtap 'iotop.stp' which looks like:
probe vfs.read.return {
reads[execname()] += bytes_read
}
probe vfs.write.return {
writes[execname()] += bytes_written
}
# print top 10 IO processes every 5 seconds
probe timer.s(5) {
foreach (name in writes)
total_io[name] += writes[name]
foreach (name in reads)
total_io[name] += reads[name]
printf ("%16s\t%10s\t%10s\n", "Process", "KB Read", "KB Written")
...
}
The first two probe functions belong in the kernel as two independent
ibpf programs that access the 'reads' and 'writes' tables,
and 'timer.s' really belongs in userspace.
Every 5 seconds it can access the 'reads' and 'writes' tables, sort them,
print them, etc.
The important concept here is a user/kernel shared table.
An ibpf program can read/write it from the kernel.
A userspace component can read/write it in parallel.

Back in September I posted patches for this style of table
access via netlink.
Note that an ibpf program doesn't own memory.
It can call 'bpf_table_update' to store a key/value pair
into a kernel table. Think of it as a small in-kernel database
that an ibpf program can store data to and user space can
read/write data at the same time.
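
To make the shared-table idea concrete, here is a rough userspace model in C.
Everything in it is an assumption for illustration: struct io_table,
table_add() and table_lookup() are hypothetical names and do not reflect the
real bpf_table_update interface or the netlink access path, and the model has
no locking, which a real table shared between kernel and userspace would need.

```c
#include <string.h>

#define TABLE_SLOTS 8	/* assumed fixed capacity for this sketch */
#define NAME_LEN 16

struct io_entry {
	char name[NAME_LEN];	/* key: process name */
	unsigned long bytes;	/* value: running byte total */
	int used;
};

struct io_table {
	struct io_entry slots[TABLE_SLOTS];
};

/* "Probe side": add bytes for a name. Returns 0 on success, -1 if full. */
static int table_add(struct io_table *t, const char *name, unsigned long bytes)
{
	int i, free_slot = -1;

	for (i = 0; i < TABLE_SLOTS; i++) {
		if (t->slots[i].used &&
		    strncmp(t->slots[i].name, name, NAME_LEN - 1) == 0) {
			t->slots[i].bytes += bytes;
			return 0;
		}
		if (!t->slots[i].used && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1;

	strncpy(t->slots[free_slot].name, name, NAME_LEN - 1);
	t->slots[free_slot].name[NAME_LEN - 1] = '\0';
	t->slots[free_slot].bytes = bytes;
	t->slots[free_slot].used = 1;
	return 0;
}

/* "Reader side": fetch the running total for a name, 0 if absent. */
static unsigned long table_lookup(const struct io_table *t, const char *name)
{
	int i;

	for (i = 0; i < TABLE_SLOTS; i++) {
		if (t->slots[i].used &&
		    strncmp(t->slots[i].name, name, NAME_LEN - 1) == 0)
			return t->slots[i].bytes;
	}
	return 0;
}
```

In terms of the iotop example, the two probe programs would play the role of
table_add() callers, and the 5-second userspace loop would play the role of
the table_lookup() caller that sorts and prints.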

> - support register multi-event in one script

I think it should be clear now, that it's already supported.
one ibpf program == one function.
object file may contain multiple programs that attach to
different kprobe events and store key/value pairs into
the same or different tables.
From the verifier's point of view these two programs are disjoint.
They cannot call each other. Verifier checks them
independently.

> - support trace_end

if you mean the final print out of everything,
then it's a userspace task.

Thanks
Alexei


[PATCH V2 1/2] mm: move FAULT_AROUND_ORDER to arch/

2014-04-03 Thread Madhavan Srinivasan
Kirill A. Shutemov's faultaround patchset introduced
vm_ops->map_pages() for mapping easily accessible pages around
the fault address, in the hope of reducing the number of minor page faults.

This patch creates infrastructure to move FAULT_AROUND_ORDER
to arch/ using Kconfig. This will enable architecture maintainers
to decide on a suitable FAULT_AROUND_ORDER value based on
performance data for that architecture. The patch also adds a
FAULT_AROUND_ORDER Kconfig element in arch/x86.

Signed-off-by: Madhavan Srinivasan 
---
 arch/x86/Kconfig   |4 
 include/linux/mm.h |9 +
 mm/memory.c|   12 +---
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9c0a657..5833f22 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1177,6 +1177,10 @@ config DIRECT_GBPAGES
  support it. This can improve the kernel's performance a tiny bit by
  reducing TLB pressure. If in doubt, say "Y".
 
+config FAULT_AROUND_ORDER
+   int
+   default "4"
+
 # Common NUMA Features
 config NUMA
bool "Numa Memory Allocation and Scheduler Support"
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0bd4359..b93c1c3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -26,6 +26,15 @@ struct file_ra_state;
 struct user_struct;
 struct writeback_control;
 
+/*
+ * Fault around order is a control knob to decide the fault around pages.
+ * Default value is set to 0UL (disabled), but the arch can override it as
+ * desired.
+ */
+#ifndef CONFIG_FAULT_AROUND_ORDER
+#define CONFIG_FAULT_AROUND_ORDER 0
+#endif
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES /* Don't use mapnrs, do it properly */
 extern unsigned long max_mapnr;
 
diff --git a/mm/memory.c b/mm/memory.c
index b02c584..22a4a89 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3358,10 +3358,8 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
update_mmu_cache(vma, address, pte);
 }
 
-#define FAULT_AROUND_ORDER 4
-
 #ifdef CONFIG_DEBUG_FS
-static unsigned int fault_around_order = FAULT_AROUND_ORDER;
+static unsigned int fault_around_order = CONFIG_FAULT_AROUND_ORDER;
 
 static int fault_around_order_get(void *data, u64 *val)
 {
@@ -3371,7 +3369,7 @@ static int fault_around_order_get(void *data, u64 *val)
 
 static int fault_around_order_set(void *data, u64 val)
 {
-   BUILD_BUG_ON((1UL << FAULT_AROUND_ORDER) > PTRS_PER_PTE);
+   BUILD_BUG_ON((1UL << CONFIG_FAULT_AROUND_ORDER) > PTRS_PER_PTE);
if (1UL << val > PTRS_PER_PTE)
return -EINVAL;
fault_around_order = val;
@@ -3406,14 +3404,14 @@ static inline unsigned long fault_around_pages(void)
 {
unsigned long nr_pages;
 
-   nr_pages = 1UL << FAULT_AROUND_ORDER;
+   nr_pages = 1UL << CONFIG_FAULT_AROUND_ORDER;
BUILD_BUG_ON(nr_pages > PTRS_PER_PTE);
return nr_pages;
 }
 
 static inline unsigned long fault_around_mask(void)
 {
-   return ~((1UL << (PAGE_SHIFT + FAULT_AROUND_ORDER)) - 1);
+   return ~((1UL << (PAGE_SHIFT + CONFIG_FAULT_AROUND_ORDER)) - 1);
 }
 #endif
 
@@ -3471,7 +3469,7 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 * if page by the offset is not ready to be mapped (cold cache or
 * something).
 */
-   if (vma->vm_ops->map_pages) {
+   if ((vma->vm_ops->map_pages) && (fault_around_pages() > 1)) {
pte = pte_offset_map_lock(mm, pmd, address, &ptl);
do_fault_around(vma, address, pte, pgoff, flags);
if (!pte_same(*pte, orig_pte))
-- 
1.7.10.4
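
The arithmetic behind the order value can be tried out in userspace. The
sketch below mirrors the expressions in the patch but makes assumptions of its
own: SKETCH_PAGE_SHIFT is fixed at 12 (4 KiB pages), and the *_for() helpers
are hypothetical names that take the order as a parameter instead of reading
CONFIG_FAULT_AROUND_ORDER.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Number of pages mapped around the faulting address for a given order. */
static unsigned long fault_around_pages_for(unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long) - SKETCH_PAGE_SHIFT);
	return 1UL << order;
}

/* Mask that aligns an address down to the start of the fault-around window. */
static unsigned long fault_around_mask_for(unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long) - SKETCH_PAGE_SHIFT);
	return ~((1UL << (SKETCH_PAGE_SHIFT + order)) - 1);
}

/* Order 0 means a one-page window, i.e. fault-around is effectively off. */
static int fault_around_enabled(unsigned int order)
{
	return fault_around_pages_for(order) > 1;
}
```

Under these assumptions, order 4 gives a 16-page (64 KiB) window, and order 0
gives a single page, which is the case the new check in do_read_fault() skips.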



[PATCH V2 2/2] mm: add FAULT_AROUND_ORDER Kconfig paramater for powerpc

2014-04-03 Thread Madhavan Srinivasan
Performance data for different FAULT_AROUND_ORDER values from a 4-socket
Power7 system (128 threads and 128GB memory) is below. perf stat with a
repeat of 5 is used to get the stddev values. This patch creates the
FAULT_AROUND_ORDER Kconfig parameter and defaults it to 3 based on the
performance data.

FAULT_AROUND_ORDER  Baseline        1               3               4               5               7

Linux build (make -j64)
minor-faults        7184385         5874015         4567289         4318518         4193815         4159193
times in seconds    61.433776136    60.865935292    59.245368038    60.630675011    60.56587624     59.828271924
stddev for time     ( +-  1.18% )   ( +-  1.78% )   ( +-  0.44% )   ( +-  2.03% )   ( +-  1.66% )   ( +-  1.45% )

Linux rebuild (make -j64)
minor-faults        303018          226392          146170          132480          126878          126236
times in seconds    5.659819172     5.723996942     5.591238319     5.622533357     5.878811995     5.550133096
stddev for time     ( +-  0.71% )   ( +-  0.78% )   ( +-  0.62% )   ( +-  0.45% )   ( +-  1.55% )   ( +-  0.29% )

Two synthetic tests: access every word in file in sequential/random order.
Marginal Performance gains seen for FAO value of 3 when compared to value
of 4.
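
As a worked example of reading the minor-fault rows, the relative reduction
against the baseline can be computed with a small helper. This is only an
illustrative sketch; reduction_percent() is a hypothetical name and is not
part of the patch.

```c
/* Percentage reduction of `value` relative to `baseline` (0 if no baseline). */
static double reduction_percent(unsigned long baseline, unsigned long value)
{
	if (baseline == 0)
		return 0.0;
	return 100.0 * ((double)baseline - (double)value) / (double)baseline;
}
```

Applied to the Linux build row, reduction_percent(7184385, 4567289) for
order 3 comes out at roughly 36%, and reduction_percent(7184385, 4318518) for
order 4 at roughly 40%, while the build times stay within about a second of
each other.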

Sequential access 16GiB file
FAULT_AROUND_ORDER  Baseline        1               3               4               5               7
1 thread
   minor-faults     262302          131192          32873           16486           8291            2351
   times in seconds 53.071497352    52.945826882    52.931417302    52.928577184    52.859285439    53.116800539
   stddev for time  ( +-  0.01% )   ( +-  0.02% )   ( +-  0.02% )   ( +-  0.04% )   ( +-  0.04% )   ( +-  0.01% )
8 threads
   minor-faults     2097314         1051046         263336          131715          66098           16653
   times in seconds 54.385698561    54.603652339    54.771282004    54.488565674    54.496701531    54.962142189
   stddev for time  ( +-  0.05% )   ( +-  0.02% )   ( +-  0.37% )   ( +-  0.08% )   ( +-  0.07% )   ( +-  0.51% )
32 threads
   minor-faults     8389267         4218595         1059961         531319          266463          67271
   times in seconds 60.61715047     60.827964038    60.46412673     60.266045885    60.492398315    60.24531921
   stddev for time  ( +-  0.65% )   ( +-  0.21% )   ( +-  0.25% )   ( +-  0.29% )   ( +-  0.19% )   ( +-  0.35% )
64 threads
   minor-faults     16777455        8485998         2178582         1092106         544302          137693
   times in seconds 86.471334554    84.412415735    85.208303832    84.331473392    85.598793479    84.695469266
   stddev for time  ( +-  0.60% )   ( +-  1.47% )   ( +-  0.74% )   ( +-  1.55% )   ( +-  0.92% )   ( +-  1.16% )
128 threads
   minor-faults     33555267        17734522        4710107         2380821         1182707         292077
   times in seconds 117.535385569   114.291359037   112.593908276   113.081807611   114.358686588   114.491043011
   stddev for time  ( +-  2.24% )   ( +-  0.92% )   ( +-  0.36% )   ( +-  0.53% )   ( +-  0.70% )   ( +-  0.53% )

Random access 1GiB file
FAULT_AROUND_ORDER  Baseline        1               3               4               5               7
1 thread
   minor-faults     16503           8664            2149            1126            610             437
   times in seconds 43.843573808    48.042069805    50.580779682    54.282884593    52.641739876    51.803302129
   stddev for time  ( +-  1.30% )   ( +-  2.25% )   ( +-  2.92% )   ( +-  1.44% )   ( +-  4.49% )   ( +-  3.78% )
8 threads
   minor-faults     131201          70916           17760           8665            4250            1149
   times in seconds 46.262626804    55.942851041    56.629191584    57.97044714     55.417557594    56.019709166
   stddev for time  ( +-  4.66% )   ( +-  1.52% )   ( +-  1.43% )   ( +-  1.61% )   ( +-  0.65% )   ( +-  1.27% )
32 threads
   minor-faults     524959          265980          67282           33601           16930           4316
   times in seconds 67.754175928    69.85012331     71.750338061    71.053074643    68.90728294     71.250103217
   stddev for time  ( +-  3.79% )   ( +-  0.77% )   ( +-  1.15% )   ( +-  1.08% )   ( +-  2.14% )   ( +-  1.17% )
64 threads
   minor-faults     1048831         528829          133256          66700           33428           8776
   times in seconds 96.674025305    93.109961822    87.44115        91.986332028    88.686748472    93.101434306
   stddev for time  ( +-  2.85% )   ( +-  2.42% )   ( +-  0.42% )   ( +-  1.58% )   ( +-  1.29% )   ( +-  2.01% )
128 threads
   minor-faults     2098043         1053224         266271          133702          66966           17276
   times in seconds 156.525792044   152.117971403   147.523673243   148.

Re: [BUG] x86: reboot doesn't reboot

2014-04-03 Thread Ingo Molnar

* H. Peter Anvin  wrote:

> Keep in mind we already tried CF9 in the default flow and it broke 
> things.  I'm willing to wait for reports about production machines, 
> though, but I fully expect them.

Typically there's nothing particularly weird about preproduction Intel 
hardware when it comes to reboot methods, other than people being more 
willing to test it with development kernels.

Is there any system where the triple fault method does not work?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 1/2] usb: doc: udc-xilinx: Add devicetree bindings

2014-04-03 Thread Michal Simek
On 04/03/2014 04:59 PM, Felipe Balbi wrote:
> On Thu, Apr 03, 2014 at 01:05:18PM +0530, Subbaraya Sundeep Bhatta wrote:
>> Add devicetree bindings for Xilinx axi udc driver.
>>
>> Signed-off-by: Subbaraya Sundeep Bhatta 
>> ---
>> Changes for v2:
>>  - replaced xlnx,include-dma with xlnx,has-builtin-dma
>>
>>  .../devicetree/bindings/usb/udc-xilinx.txt |   20 
>> 
>>  1 files changed, 20 insertions(+), 0 deletions(-)
>>  create mode 100644 Documentation/devicetree/bindings/usb/udc-xilinx.txt
>>
>> diff --git a/Documentation/devicetree/bindings/usb/udc-xilinx.txt 
>> b/Documentation/devicetree/bindings/usb/udc-xilinx.txt
>> new file mode 100644
>> index 000..7c24fac
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/usb/udc-xilinx.txt
>> @@ -0,0 +1,20 @@
>> +Xilinx AXI USB2 device controller
>> +
>> +Required properties:
>> +- compatible: Should be "xlnx,axi-usb2-device-4.00.a"
>> +- reg   : Physical base address and size of the Axi USB2
>> +  device registers map.
>> +- interrupts: Property with a value describing the interrupt
>> +  number.
>> +- interrupt-parent  : Must be core interrupt controller
>> +- xlnx,has-builtin-dma  : if DMA is included
> 
> isn't there a configuration register to tell you this ?

I have checked this with Sundeep and there is nothing like that in the HW.

> 
>> +
>> +Example:
>> +axi-usb2-device@42e0 {
>> +compatible = "xlnx,axi-usb2-device-4.00.a";
>> +interrupt-parent = <0x1>;
> 
> why isn't interrupt-parent a phandle ?

Just for the record: using a number here should also be fine, because DTC
converts phandle references to numbers anyway and emits them as
linux,phandle and phandle properties.
[linux-next]$ dtc -O dts -I dtb /tftpboot/devicetree.dtb | less
...
ps7-scugic@f8f01000 {
	#address-cells = <0x2>;
	#interrupt-cells = <0x3>;
	#size-cells = <0x1>;
	compatible = "arm,cortex-a9-gic", "arm,gic";
	interrupt-controller;
	num_cpus = <0x2>;
	num_interrupts = <0x60>;
	reg = <0xf8f01000 0x1000 0xf8f00100 0x100>;
	linux,phandle = <0x3>;
	phandle = <0x3>;
};

ps7-scutimer@f8f00600 {
	clocks = <0x2 0x4>;
	compatible = "arm,cortex-a9-twd-timer";
	interrupt-parent = <0x3>;
	interrupts = <0x1 0xd 0x301>;
	reg = <0xf8f00600 0x20>;
};
...

But anyway, Sundeep will change it to a sensible value such as <&intc>;

Thanks for pointing to it,
Michal


-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform






For review: open_by_handle_at(2) man page [v4]

2014-04-03 Thread Michael Kerrisk (man-pages)
Hello Aneesh,

After integrating review comments from NeilBown, Christoph Hellwig,
and Mike Frysinger, here is draft 4 of a man page I've written for
name_to_handle_at(2) and open_by_handle_at(2). (The changes since
draft 3 are only minor.)

Would you be willing to review it please, and let me know of any
corrections/improvements? (Of course, further comments from anyone
else are also welcome.)

There are some FIXMEs in the page that I would especially like some help with.

Thanks,

Michael


.\" Copyright (c) 2014 by Michael Kerrisk 
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual"
.SH NAME
name_to_handle_at, open_by_handle_at \- obtain handle
for a pathname and open file via a handle
.SH SYNOPSIS
.nf
.B #define _GNU_SOURCE
.B #include 
.B #include 
.B #include 

.BI "int name_to_handle_at(int " dirfd ", const char *" pathname ,
.BI "  struct file_handle *" handle ,
.BI "  int *" mount_id ", int " flags );

.BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle ,
.BI "  int " flags );
.fi
.SH DESCRIPTION
The
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
system calls split the functionality of
.BR openat (2)
into two parts:
.BR name_to_handle_at ()
returns an opaque handle that corresponds to a specified file;
.BR open_by_handle_at ()
opens the file corresponding to a handle returned by a previous call to
.BR name_to_handle_at ()
and returns an open file descriptor.
.\"
.\"
.SS name_to_handle_at()
The
.BR name_to_handle_at ()
system call returns a file handle and a mount ID corresponding to
the file specified by the
.IR dirfd
and
.IR pathname
arguments.
The file handle is returned via the argument
.IR handle ,
which is a pointer to a structure of the following form:

.in +4n
.nf
struct file_handle {
    unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
    int           handle_type;    /* Handle type [out] */
    unsigned char f_handle[0];    /* File identifier (sized by
                                     caller) [out] */
};
.fi
.in
.PP
It is the caller's responsibility to allocate the structure
with a size large enough to hold the handle returned in
.IR f_handle .
Before the call, the
.IR handle_bytes
field should be initialized to contain the allocated size for
.IR f_handle .
(The constant
.BR MAX_HANDLE_SZ ,
defined in
.IR  ,
specifies the maximum possible size for a file handle.)
Upon successful return, the
.IR handle_bytes
field is updated to contain the number of bytes actually written to
.IR f_handle .

The caller can discover the required size for the
.I file_handle
structure by making a call in which
.IR handle->handle_bytes
is zero;
in this case, the call fails with the error
.BR EOVERFLOW
and
.IR handle->handle_bytes
is set to indicate the required size;
the caller can then use this information to allocate a structure
of the correct size (see EXAMPLE below).

Other than the use of the
.IR handle_bytes
field, the caller should treat the
.IR file_handle
structure as an opaque data type: the
.IR handle_type
and
.IR f_handle
fields are needed only by a subsequent call to
.BR open_by_handle_at ().

The
.I flags
argument is a bit mask constructed by ORing together zero or more of
.BR AT_EMPTY_PATH
and
.BR AT_SYMLINK_FOLLOW ,
described below.

Together, the
.I pathname
and
.I dirfd
arguments identify the file for which a handle is to be obtained.
There are four distinct cases:
.IP * 3
If
.I pathname
is a nonempty string containing an absolute pathname,
then a handle is returned for the file referred to by that pathname.
In this case,
.IR dirfd
is ignored.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
has the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working directory of the caller,
and a handle is returned for the file t

Re: [PATCH 1/2] devicetree: Add devicetree bindings documentation for Zynq Quad SPI

2014-04-03 Thread Michal Simek
Hi Mark and Harini,

On 04/04/2014 05:01 AM, Harini Katakam wrote:
> Hi Mark,
> 
> On Fri, Apr 4, 2014 at 2:31 AM, Mark Brown  wrote:
>> On Thu, Apr 03, 2014 at 10:33:06PM +0530, Punnaiah Choudary Kalluri wrote:
>>
>>> +Optional properties:
>>> +- num-cs : Number of chip selects used.
>>
>> What does this translate into?
>>
>>> + num-cs = /bits/ 16 <1>;
>>
>> Why the odd specification in the example - why not just specify it as a
>> number?
> 
> Same as discussed on SPI cadence thread.

I have discussed this briefly with Rob, and it is really up to Mark
whether he wants this to be 16 bits wide or not. I expect "num-cs"
to end up shared across SPI drivers, and maybe in the near future
you will move the "num-cs" handling out of probe into the SPI core;
that is why it should stay 32-bit for easier integration.

I asked Harini some weeks ago to try doing it with just
of_property_read_u16, because then you can set
master->num_chipselect directly and don't need to read the value
as u32 and store it into a u16.
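
A minimal userspace sketch of why the property width matters, assuming
device tree property values are stored big-endian and that a 16-bit read
consumes only the first two bytes of the property. The helpers be16_at()
and be32_at() and the two byte arrays are hypothetical stand-ins written
for this illustration, not kernel API; in a driver the corresponding
calls would be of_property_read_u16() and of_property_read_u32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a big-endian 16-bit value from property bytes. */
static uint16_t be16_at(const uint8_t *p)
{
	assert(p != NULL);
	return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Hypothetical helper: decode a big-endian 32-bit cell from property bytes. */
static uint32_t be32_at(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Bytes as they would appear for: num-cs = /bits/ 16 <1>; */
static const uint8_t num_cs_bits16[2] = { 0x00, 0x01 };

/* Bytes as they would appear for: num-cs = <1>; (one 32-bit cell) */
static const uint8_t num_cs_cell32[4] = { 0x00, 0x00, 0x00, 0x01 };
```

Decoding the 32-bit encoding with the 16-bit helper yields 0 rather
than 1, so under these assumptions the binding's width and the width
the driver reads have to agree.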

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform






[PATCH v2 1/2] media: davinci: vpif capture: upgrade the driver with v4l offerings

2014-04-03 Thread Lad, Prabhakar
From: "Lad, Prabhakar" 

This patch upgrades the vpif capture driver with
v4l helpers. It does the following:

1: initialize the vb2 queue and context at the time of probe
and removes context at remove() callback.
2: uses vb2_ioctl_*() helpers.
3: uses vb2_fop_*() helpers.
4: uses SIMPLE_DEV_PM_OPS.
5: uses vb2_ioctl_*() helpers.
6: vidioc_g/s_priority is now handled by v4l core.
7: removed driver specific fh and now using one provided by v4l.
8: fixes checkpatch warnings.

Signed-off-by: Lad, Prabhakar 
---
 drivers/media/platform/davinci/vpif_capture.c |  931 +++--
 drivers/media/platform/davinci/vpif_capture.h |   32 +-
 2 files changed, 234 insertions(+), 729 deletions(-)

diff --git a/drivers/media/platform/davinci/vpif_capture.c 
b/drivers/media/platform/davinci/vpif_capture.c
index 8dea0b8..e4046f5 100644
--- a/drivers/media/platform/davinci/vpif_capture.c
+++ b/drivers/media/platform/davinci/vpif_capture.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (C) 2009 Texas Instruments Inc
+ * Copyright (C) 2014 Lad, Prabhakar 
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -37,6 +38,8 @@ MODULE_VERSION(VPIF_CAPTURE_VERSION);
 #define vpif_dbg(level, debug, fmt, arg...)\
v4l2_dbg(level, debug, &vpif_obj.v4l2_dev, fmt, ## arg)
 
+#define VPIF_DRIVER_NAME   "vpif_capture"
+
 static int debug = 1;
 static u32 ch0_numbuffers = 3;
 static u32 ch1_numbuffers = 3;
@@ -65,11 +68,25 @@ static struct vpif_config_params config_params = {
.channel_bufsize[1] = 720 * 576 * 2,
 };
 
+static u8 channel_first_int[VPIF_NUMBER_OF_OBJECTS][2] = { {1, 1} };
+
 /* global variables */
 static struct vpif_device vpif_obj = { {NULL} };
 static struct device *vpif_dev;
 static void vpif_calculate_offsets(struct channel_obj *ch);
 static void vpif_config_addr(struct channel_obj *ch, int muxmode);
+static int vpif_check_format(struct channel_obj *ch,
+struct v4l2_pix_format *pixfmt, int update);
+
+/*
+ * Is set to 1 in case of SDTV formats, 2 in case of HDTV formats.
+ */
+static int ycmux_mode;
+
+static inline struct vpif_cap_buffer *to_vpif_buffer(struct vb2_buffer *vb)
+{
+   return container_of(vb, struct vpif_cap_buffer, vb);
+}
 
 /**
  * buffer_prepare :  callback function for buffer prepare
@@ -81,10 +98,8 @@ static void vpif_config_addr(struct channel_obj *ch, int 
muxmode);
  */
 static int vpif_buffer_prepare(struct vb2_buffer *vb)
 {
-   /* Get the file handle object and channel object */
-   struct vpif_fh *fh = vb2_get_drv_priv(vb->vb2_queue);
struct vb2_queue *q = vb->vb2_queue;
-   struct channel_obj *ch = fh->channel;
+   struct channel_obj *ch = vb2_get_drv_priv(q);
struct common_obj *common;
unsigned long addr;
 
@@ -100,7 +115,7 @@ static int vpif_buffer_prepare(struct vb2_buffer *vb)
goto exit;
addr = vb2_dma_contig_plane_dma_addr(vb, 0);
 
-   if (q->streaming) {
+   if (vb2_is_streaming(q)) {
if (!IS_ALIGNED((addr + common->ytop_off), 8) ||
!IS_ALIGNED((addr + common->ybtm_off), 8) ||
!IS_ALIGNED((addr + common->ctop_off), 8) ||
@@ -131,9 +146,7 @@ static int vpif_buffer_queue_setup(struct vb2_queue *vq,
unsigned int *nbuffers, unsigned int *nplanes,
unsigned int sizes[], void *alloc_ctxs[])
 {
-   /* Get the file handle object and channel object */
-   struct vpif_fh *fh = vb2_get_drv_priv(vq);
-   struct channel_obj *ch = fh->channel;
+   struct channel_obj *ch = vb2_get_drv_priv(vq);
struct common_obj *common;
unsigned long size;
 
@@ -141,8 +154,7 @@ static int vpif_buffer_queue_setup(struct vb2_queue *vq,
 
vpif_dbg(2, debug, "vpif_buffer_setup\n");
 
-   /* If memory type is not mmap, return */
-   if (V4L2_MEMORY_MMAP == common->memory) {
+   if (vq->memory == V4L2_MEMORY_MMAP) {
/* Calculate the size of the buffer */
size = config_params.channel_bufsize[ch->channel_id];
/*
@@ -183,11 +195,8 @@ static int vpif_buffer_queue_setup(struct vb2_queue *vq,
  */
 static void vpif_buffer_queue(struct vb2_buffer *vb)
 {
-   /* Get the file handle object and channel object */
-   struct vpif_fh *fh = vb2_get_drv_priv(vb->vb2_queue);
-   struct channel_obj *ch = fh->channel;
-   struct vpif_cap_buffer *buf = container_of(vb,
-   struct vpif_cap_buffer, vb);
+   struct channel_obj *ch = vb2_get_drv_priv(vb->vb2_queue);
+   struct vpif_cap_buffer *buf = to_vpif_buffer(vb);
struct common_obj *common;
unsigned long flags;
 
@@ -210,11 +219,8 @@ static void vpif_buffer_queue(struct vb2_buffer *vb)
  */
 static void vpi

[PATCH] Shiraz has moved

2014-04-03 Thread Viresh Kumar
The shiraz.has...@st.com address no longer exists, as Shiraz has left the
company. Replace the ST address with shiraz.linux.ker...@gmail.com.

Also update the .mailmap file to fix the address for 'git shortlog'.

Cc: Shiraz Hashim 
Signed-off-by: Viresh Kumar 
---
 .mailmap   | 1 +
 MAINTAINERS| 2 +-
 arch/arm/boot/dts/spear320-hmi.dts | 2 +-
 arch/arm/mach-spear/headsmp.S  | 2 +-
 arch/arm/mach-spear/platsmp.c  | 2 +-
 arch/arm/mach-spear/time.c | 2 +-
 drivers/gpio/gpio-spear-spics.c| 4 ++--
 drivers/irqchip/spear-shirq.c  | 2 +-
 drivers/mtd/devices/spear_smi.c| 4 ++--
 drivers/pwm/pwm-spear.c| 4 ++--
 include/linux/mtd/spear_smi.h  | 2 +-
 11 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/.mailmap b/.mailmap
index 658003a..df1baba 100644
--- a/.mailmap
+++ b/.mailmap
@@ -99,6 +99,7 @@ Sachin P Sant 
 Sam Ravnborg 
 Sascha Hauer 
 S.Çağlar Onur 
+Shiraz Hashim  
 Simon Kelley 
 Stéphane Witzmann 
 Stephen Hemminger 
diff --git a/MAINTAINERS b/MAINTAINERS
index edd6139..014e5d6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8143,7 +8143,7 @@ F:include/linux/compiler.h
 
 SPEAR PLATFORM SUPPORT
 M: Viresh Kumar 
-M: Shiraz Hashim 
+M: Shiraz Hashim 
 L: spear-de...@list.st.com
 L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
 W: http://www.st.com/spear
diff --git a/arch/arm/boot/dts/spear320-hmi.dts 
b/arch/arm/boot/dts/spear320-hmi.dts
index 3075d2d..0aa6fef 100644
--- a/arch/arm/boot/dts/spear320-hmi.dts
+++ b/arch/arm/boot/dts/spear320-hmi.dts
@@ -1,7 +1,7 @@
 /*
  * DTS file for SPEAr320 Evaluation Baord
  *
- * Copyright 2012 Shiraz Hashim 
+ * Copyright 2012 Shiraz Hashim 
  *
  * The code contained herein is licensed under the GNU General Public
  * License. You may obtain a copy of the GNU General Public License
diff --git a/arch/arm/mach-spear/headsmp.S b/arch/arm/mach-spear/headsmp.S
index ed85473..c52192d 100644
--- a/arch/arm/mach-spear/headsmp.S
+++ b/arch/arm/mach-spear/headsmp.S
@@ -3,7 +3,7 @@
  *
  * Picked from realview
  * Copyright (c) 2012 ST Microelectronics Limited
- * Shiraz Hashim 
+ * Shiraz Hashim 
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
diff --git a/arch/arm/mach-spear/platsmp.c b/arch/arm/mach-spear/platsmp.c
index 5c4a198..c19751f 100644
--- a/arch/arm/mach-spear/platsmp.c
+++ b/arch/arm/mach-spear/platsmp.c
@@ -4,7 +4,7 @@
  * based upon linux/arch/arm/mach-realview/platsmp.c
  *
  * Copyright (C) 2012 ST Microelectronics Ltd.
- * Shiraz Hashim 
+ * Shiraz Hashim 
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
diff --git a/arch/arm/mach-spear/time.c b/arch/arm/mach-spear/time.c
index d449673..e43516d 100644
--- a/arch/arm/mach-spear/time.c
+++ b/arch/arm/mach-spear/time.c
@@ -2,7 +2,7 @@
  * arch/arm/plat-spear/time.c
  *
  * Copyright (C) 2010 ST Microelectronics
- * Shiraz Hashim
+ * Shiraz Hashim
  *
  * This file is licensed under the terms of the GNU General Public
  * License version 2. This program is licensed "as is" without any
diff --git a/drivers/gpio/gpio-spear-spics.c b/drivers/gpio/gpio-spear-spics.c
index e9a0415..30bcc53 100644
--- a/drivers/gpio/gpio-spear-spics.c
+++ b/drivers/gpio/gpio-spear-spics.c
@@ -2,7 +2,7 @@
  * SPEAr platform SPI chipselect abstraction over gpiolib
  *
  * Copyright (C) 2012 ST Microelectronics
- * Shiraz Hashim 
+ * Shiraz Hashim 
  *
  * This file is licensed under the terms of the GNU General Public
  * License version 2. This program is licensed "as is" without any
@@ -205,6 +205,6 @@ static int __init spics_gpio_init(void)
 }
 subsys_initcall(spics_gpio_init);
 
-MODULE_AUTHOR("Shiraz Hashim ");
+MODULE_AUTHOR("Shiraz Hashim ");
 MODULE_DESCRIPTION("ST Microlectronics SPEAr SPI Chip Select Abstraction");
 MODULE_LICENSE("GPL");
diff --git a/drivers/irqchip/spear-shirq.c b/drivers/irqchip/spear-shirq.c
index 8527743..3fdda3a 100644
--- a/drivers/irqchip/spear-shirq.c
+++ b/drivers/irqchip/spear-shirq.c
@@ -5,7 +5,7 @@
  * Viresh Kumar 
  *
  * Copyright (C) 2012 ST Microelectronics
- * Shiraz Hashim 
+ * Shiraz Hashim 
  *
  * This file is licensed under the terms of the GNU General Public
  * License version 2. This program is licensed "as is" without any
diff --git a/drivers/mtd/devices/spear_smi.c b/drivers/mtd/devices/spear_smi.c
index 4238214..e52c880 100644
--- a/drivers/mtd/devices/spear_smi.c
+++ b/drivers/mtd/devices/spear_smi.c
@@ -6,7 +6,7 @@
  *
  * Copyright © 2010 STMicroelectronics.
  * Ashish Priyadarshi
- * Shiraz Hashim 
+ * Shiraz Hashim 
  *
  * This file is licensed under the terms of the GNU General Public
  * License version 2. This program is licensed "as is" without any
@@ -1091,5 +1091,5 @@ static struct platform_driver spear_smi_driver = {
 module_pla

[PATCH v2 2/2] media: davinci: vpif display: upgrade the driver with v4l offerings

2014-04-03 Thread Lad, Prabhakar
From: "Lad, Prabhakar" 

This patch upgrades the vpif display driver with
v4l helpers, this patch does the following,

1: initialize the vb2 queue and context at the time of probe
and removes context at remove() callback.
2: uses vb2_ioctl_*() helpers.
3: uses vb2_fop_*() helpers.
4: uses SIMPLE_DEV_PM_OPS.
5: uses vb2_ioctl_*() helpers.
6: vidioc_g/s_priority is now handled by v4l core.
7: removed driver specific fh and now using one provided by v4l.
8: fixes checkpatch warnings.

Signed-off-by: Lad, Prabhakar 
---
 drivers/media/platform/davinci/vpif_display.c |  800 +++--
 drivers/media/platform/davinci/vpif_display.h |   31 +-
 2 files changed, 221 insertions(+), 610 deletions(-)

diff --git a/drivers/media/platform/davinci/vpif_display.c 
b/drivers/media/platform/davinci/vpif_display.c
index aed41ed..f3ae946 100644
--- a/drivers/media/platform/davinci/vpif_display.c
+++ b/drivers/media/platform/davinci/vpif_display.c
@@ -3,6 +3,7 @@
  * Display driver for TI DaVinci VPIF
  *
  * Copyright (C) 2009 Texas Instruments Incorporated - http://www.ti.com/
+ * Copyright (C) 2014 Lad, Prabhakar 
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
@@ -34,6 +35,8 @@ MODULE_VERSION(VPIF_DISPLAY_VERSION);
 #define vpif_dbg(level, debug, fmt, arg...)\
v4l2_dbg(level, debug, &vpif_obj.v4l2_dev, fmt, ## arg)
 
+#define VPIF_DRIVER_NAME   "vpif_display"
+
 static int debug = 1;
 static u32 ch2_numbuffers = 3;
 static u32 ch3_numbuffers = 3;
@@ -64,9 +67,21 @@ static struct vpif_config_params config_params = {
 
 static struct vpif_device vpif_obj = { {NULL} };
 static struct device *vpif_dev;
+static u8 channel_first_int[VPIF_NUMOBJECTS][2] = { {1, 1} };
+
+/*
+ * Is set to 1 in case of SDTV formats, 2 in case of HDTV formats.
+ */
+static int ycmux_mode;
+
 static void vpif_calculate_offsets(struct channel_obj *ch);
 static void vpif_config_addr(struct channel_obj *ch, int muxmode);
 
+static inline struct vpif_disp_buffer *to_vpif_buffer(struct vb2_buffer *vb)
+{
+   return container_of(vb, struct vpif_disp_buffer, vb);
+}
+
 /*
  * buffer_prepare: This is the callback function called from vb2_qbuf()
  * function the buffer is prepared and user space virtual address is converted
@@ -74,12 +89,12 @@ static void vpif_config_addr(struct channel_obj *ch, int 
muxmode);
  */
 static int vpif_buffer_prepare(struct vb2_buffer *vb)
 {
-   struct vpif_fh *fh = vb2_get_drv_priv(vb->vb2_queue);
struct vb2_queue *q = vb->vb2_queue;
+   struct channel_obj *ch = vb2_get_drv_priv(q);
struct common_obj *common;
unsigned long addr;
 
-   common = &fh->channel->common[VPIF_VIDEO_INDEX];
+   common = &ch->common[VPIF_VIDEO_INDEX];
if (vb->state != VB2_BUF_STATE_ACTIVE &&
vb->state != VB2_BUF_STATE_PREPARED) {
vb2_set_plane_payload(vb, 0, common->fmt.fmt.pix.sizeimage);
@@ -88,7 +103,7 @@ static int vpif_buffer_prepare(struct vb2_buffer *vb)
goto buf_align_exit;
 
addr = vb2_dma_contig_plane_dma_addr(vb, 0);
-   if (q->streaming &&
+   if (vb2_is_streaming(q) &&
(V4L2_BUF_TYPE_SLICED_VBI_OUTPUT != q->type)) {
if (!ISALIGNED(addr + common->ytop_off) ||
!ISALIGNED(addr + common->ybtm_off) ||
@@ -112,12 +127,11 @@ static int vpif_buffer_queue_setup(struct vb2_queue *vq,
unsigned int *nbuffers, unsigned int *nplanes,
unsigned int sizes[], void *alloc_ctxs[])
 {
-   struct vpif_fh *fh = vb2_get_drv_priv(vq);
-   struct channel_obj *ch = fh->channel;
+   struct channel_obj *ch = vb2_get_drv_priv(vq);
struct common_obj *common = &ch->common[VPIF_VIDEO_INDEX];
unsigned long size;
 
-   if (V4L2_MEMORY_MMAP == common->memory) {
+   if (vq->memory == V4L2_MEMORY_MMAP) {
size = config_params.channel_bufsize[ch->channel_id];
/*
* Checking if the buffer size exceeds the available buffer
@@ -154,15 +168,12 @@ static int vpif_buffer_queue_setup(struct vb2_queue *vq,
  */
 static void vpif_buffer_queue(struct vb2_buffer *vb)
 {
-   struct vpif_fh *fh = vb2_get_drv_priv(vb->vb2_queue);
-   struct vpif_disp_buffer *buf = container_of(vb,
-   struct vpif_disp_buffer, vb);
-   struct channel_obj *ch = fh->channel;
+   struct vpif_disp_buffer *buf = to_vpif_buffer(vb);
+   struct channel_obj *ch = vb2_get_drv_priv(vb->vb2_queue);
struct common_obj *common;
unsigned long flags;
 
common = &ch->common[VPIF_VIDEO_INDEX];
-
/* add the buffer to the DMA queue */
spin_lock_irqsave(&common->irqlock, flags);
list_add_tail(&buf->list, &common->dma_queue);
@@ -175,10 +186,8 @@ static 

[PATCH v2 0/2] DaVinci: VPIF: upgrade with v4l helpers

2014-04-03 Thread Lad, Prabhakar
From: "Lad, Prabhakar" 

Hi All,

This patch series upgrades the vpif capture & display
drivers with all the helpers provided by v4l, which makes
the drivers much simpler and cleaner. It also fixes a few
checkpatch issues.

Sending them as one patch for capture and another for
display; splitting them would have produced a huge number of
small patches.

Changes for v2:
a> Added a copyright.
b> Dropped buf_init() callback from vb2_ops.
c> Fixed enabling & disabling of interrupts in case of HD formats.


Lad, Prabhakar (2):
  media: davinci: vpif capture: upgrade the driver with v4l offerings
  media: davinci: vpif display: upgrade the driver with v4l offerings

 drivers/media/platform/davinci/vpif_capture.c |  931 +++--
 drivers/media/platform/davinci/vpif_capture.h |   32 +-
 drivers/media/platform/davinci/vpif_display.c |  800 ++---
 drivers/media/platform/davinci/vpif_display.h |   31 +-
 4 files changed, 455 insertions(+), 1339 deletions(-)
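
Both patches in the series add a to_vpif_buffer() helper built on
container_of(). A minimal userspace sketch of that idiom follows. The
struct layouts are simplified stand-ins (the real vb2_buffer and
vpif_cap_buffer carry many more fields), the driver_private field is
hypothetical, and container_of is redefined locally with offsetof
because the kernel macro is not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel macro: recover the enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for the vb2 core's buffer structure. */
struct vb2_buffer {
	int index;
};

/* Simplified stand-in for the driver's buffer wrapper. */
struct vpif_cap_buffer {
	int driver_private;      /* hypothetical field, for illustration */
	struct vb2_buffer vb;    /* embedded, as in the driver */
};

/* Map the vb2 core's buffer pointer back to the driver's wrapper. */
static inline struct vpif_cap_buffer *to_vpif_buffer(struct vb2_buffer *vb)
{
	assert(vb != NULL);
	return container_of(vb, struct vpif_cap_buffer, vb);
}
```

Because the vb2_buffer is embedded rather than pointed to, the helper
needs no lookup: it only subtracts the member offset from the pointer
that the vb2 core hands to the callback.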

-- 
1.7.9.5



Re: [GIT PULL] Staging driver patches for 3.15-rc1

2014-04-03 Thread Michal Simek
On 04/03/2014 01:08 AM, Greg KH wrote:
> On Wed, Apr 02, 2014 at 08:52:18PM +, Insop Song wrote:
>> On Wed, April 02, 2014 1:04 PM, Greg KH wrote:
>>> On Wed, Apr 02, 2014 at 10:24:03AM +0200, Paul Bolle wrote:
>>>> On Tue, 2014-04-01 at 11:48 -0700, Greg KH wrote:
>>>>> Staging driver pull request for 3.15-rc1
>>>>>
>>>>> Here's the huge drivers/staging/ update for 3.15-rc1.
>>>>>
>>>>> Loads of cleanup fixes, a few drivers removed, and some new ones
>>>>> added.
>>>>>
>>>>> All have been in linux-next for a while.
>>>>>
>>>>> [...]
>>>>>
>>>>> Insop Song (1):
>>>>>   staging: fpgaboot: Xilinx FPGA firmware download driver
>>>>
>>>> This commit adds checks for CONFIG_B4860G100. Is a patch to add a
>>>> Kconfig symbol B4860G100 perhaps queued somewhere?
>>>
>>> Insop, I thought this config option was coming from some other place, right?
>>>
>>
>> Paul,
>> I didn't include CONFIG_B4860G100 in Kconfig in original patch set,
>> since programming FPGA method can vary in different system, and this
>> was discussed during the review with Greg as well.
>>
>> However, actual fpga programming method is well contained in io.c with
>> ifdef-ed CONFIG_B4860G100, now I think I might better to update
>> Kconfig to include CONFIG_B4860G100.
>>
>> Greg, what do you think? Any harm to add custom board CONFIG* to
>> staging Kconfig? Let me know.
> 
> Let's see what it would look like and we can go from there.

That's an interesting driver. Maybe it is a good time to refresh my work
on the fpga manager:
http://permalink.gmane.org/gmane.linux.kernel/1573330

It is a framework that provides just what you have done there.

I need to finish some other work first; let me look at it next week.
It should be easy for you to switch to it.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform






[tip:x86/hyperv] x86, hyperv: When on Hyper-v use NULL legacy PIC

2014-04-03 Thread tip-bot for K. Y. Srinivasan
Commit-ID:  8df28b82ff0649dd293f0469b97792cfb9ed10ab
Gitweb: http://git.kernel.org/tip/8df28b82ff0649dd293f0469b97792cfb9ed10ab
Author: K. Y. Srinivasan 
AuthorDate: Thu, 3 Apr 2014 18:16:33 -0700
Committer:  H. Peter Anvin 
CommitDate: Thu, 3 Apr 2014 22:00:13 -0700

x86, hyperv: When on Hyper-v use NULL legacy PIC

Use the NULL legacy PIC when on Hyper-V. With this change we can support kexec
even when booting on EFI firmware. This patch has been tested on both EFI as
well as non-EFI firmware stacks on Hyper-V.

This patch is required to support kexec on EFI firmware on Hyper-V. Please
apply.

[ hpa: HyperV in EFI mode doesn't include a legacy PIC, and apparently
doesn't stub it out in a meaningful way.  This becomes an issue
after kexec if the second kernel doesn't know it is EFI-booted.
Since HyperV presumably never actually *needs* the legacy PIC, we
can just disable it. ]

Signed-off-by: K. Y. Srinivasan 
Link: 
http://lkml.kernel.org/r/1396574193-12043-1-git-send-email-...@microsoft.com
Cc: [3.13+]
Signed-off-by: H. Peter Anvin 
---
 arch/x86/kernel/cpu/mshyperv.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 832d05a..b7d82c7 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -93,14 +93,8 @@ static void __init ms_hyperv_init_platform(void)
printk(KERN_INFO "HyperV: LAPIC Timer Frequency: %#x\n",
lapic_timer_frequency);
 
-   /*
-* On Hyper-V, when we are booting off an EFI firmware stack,
-* we do not have many legacy devices including PIC, PIT etc.
-*/
-   if (efi_enabled(EFI_BOOT)) {
-   printk(KERN_INFO "HyperV: Using null_legacy_pic\n");
-   legacy_pic = &null_legacy_pic;
-   }
+   printk(KERN_INFO "HyperV: Using null_legacy_pic\n");
+   legacy_pic = &null_legacy_pic;
}
 #endif
 


[PATCH] dma: fix eDMA driver as a subsys_initcall

2014-04-03 Thread Yuan Yao
Because some drivers depend on DMA, register the eDMA driver earlier by
changing its initcall to subsys_initcall.

Signed-off-by: Yuan Yao 
---
 drivers/dma/fsl-edma.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/fsl-edma.c b/drivers/dma/fsl-edma.c
index 381e793..b396a7f 100644
--- a/drivers/dma/fsl-edma.c
+++ b/drivers/dma/fsl-edma.c
@@ -968,7 +968,17 @@ static struct platform_driver fsl_edma_driver = {
.remove = fsl_edma_remove,
 };
 
-module_platform_driver(fsl_edma_driver);
+static int __init fsl_edma_init(void)
+{
+   return platform_driver_register(&fsl_edma_driver);
+}
+subsys_initcall(fsl_edma_init);
+
+static void __exit fsl_edma_exit(void)
+{
+   platform_driver_unregister(&fsl_edma_driver);
+}
+module_exit(fsl_edma_exit);
 
 MODULE_ALIAS("platform:fsl-edma");
 MODULE_DESCRIPTION("Freescale eDMA engine driver");
-- 
1.8.4
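
The ordering argument can be illustrated with a small userspace model.
It assumes the usual built-in initcall levels, where subsys_initcall()
runs at level 4 and module_init() (which module_platform_driver()
expands to) runs at level 6 as a device initcall. The struct, the
tables, the "dma-client" driver, and the function names below are
invented for the illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LEVEL_SUBSYS 4 /* subsys_initcall() */
#define LEVEL_DEVICE 6 /* device_initcall(), i.e. built-in module_init() */
#define MAX_CALLS    8

struct initcall {
	int level;
	const char *name;
};

/* Record of the order in which the model "ran" each initcall. */
static const char *run_order[MAX_CALLS];
static size_t run_count;

/* Run every registered call, lowest level first, as happens at boot. */
static void run_initcalls(const struct initcall *calls, size_t n)
{
	assert(n <= MAX_CALLS);
	run_count = 0;
	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == level)
				run_order[run_count++] = calls[i].name;
}

/* Position of a name in run_order, or -1 if it never ran. */
static int position_of(const char *name)
{
	for (size_t i = 0; i < run_count; i++)
		if (strcmp(run_order[i], name) == 0)
			return (int)i;
	return -1;
}

/* Before the patch: both drivers share the device level. */
static const struct initcall before_patch[] = {
	{ LEVEL_DEVICE, "dma-client" },
	{ LEVEL_DEVICE, "fsl-edma" },   /* module_platform_driver() */
};

/* After the patch: the eDMA driver registers at the subsys level. */
static const struct initcall after_patch[] = {
	{ LEVEL_DEVICE, "dma-client" },
	{ LEVEL_SUBSYS, "fsl-edma" },   /* subsys_initcall() */
};
```

In the model, calls sharing a level run in table order, which stands in
for link order in a real build. Moving the eDMA driver to the subsys
level makes it register ahead of any device-level client regardless of
that order.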




Re: [PATCH] ipc,shm: disable shmmax and shmall by default

2014-04-03 Thread Davidlohr Bueso
On Thu, 2014-04-03 at 19:39 -0400, KOSAKI Motohiro wrote:
> On Thu, Apr 3, 2014 at 3:50 PM, Davidlohr Bueso  wrote:
> > On Thu, 2014-04-03 at 21:02 +0200, Manfred Spraul wrote:
> >> Hi Davidlohr,
> >>
> >> On 04/03/2014 02:20 AM, Davidlohr Bueso wrote:
> >> > The default size for shmmax is, and always has been, 32Mb.
> >> > Today, in the XXI century, it seems that this value is rather small,
> >> > making users have to increase it via sysctl, which can cause
> >> > unnecessary work and userspace application workarounds[1].
> >> >
> >> > Instead of choosing yet another arbitrary value, larger than 32Mb,
> >> > this patch disables the use of both shmmax and shmall by default,
> >> > allowing users to create segments of unlimited sizes. Users and
> >> > applications that already explicitly set these values through sysctl
> >> > are left untouched, and thus does not change any of the behavior.
> >> >
> >> > So a value of 0 bytes or pages, for shmmax and shmall, respectively,
> >> > implies unlimited memory, as opposed to disabling sysv shared memory.
> >> > This is safe as 0 cannot possibly be used previously as SHMMIN is
> >> > hardcoded to 1 and cannot be modified.
> >
> >> Are we sure that no user space apps uses shmctl(IPC_INFO) and prints a
> >> pretty error message if shmall is too small?
> >> We would break these apps.
> >
> > Good point. 0 bytes/pages would definitely trigger an unexpected error
> > message if users did this. But on the other hand I'm not sure this
> > actually is a _real_ scenario, since upon overflow the value can still
> > end up being 0, which is totally bogus and would cause the same
> > breakage.
> >
> > So I see two possible workarounds:
> > (i) Use ULONG_MAX for the shmmax default instead. This would make shmall
> > default to 1152921504606846720 and 268435456, for 64 and 32bit systems,
> > respectively.
> >
> > (ii) Keep the 0 bytes, but add a new a "transition" tunable that, if set
> > (default off), would allow 0 bytes to be unlimited. With time, users
> > could hopefully update their applications and we could eventually get
> > rid of it. This _seems_ to be the less aggressive way to go.
> 
> Do you mean
> 
> set 0: IPC_INFO return shmmax = 0.
> set 1: IPC_INFO return shmmax = ULONG_MAX.
> 
> ?
> 
> That makes sense.

Well I was mostly referring to:

set 0: leave things as they are now.
set 1: this patch.

I don't think it makes much sense to set unlimited for both 0 and
ULONG_MAX, that would probably just create even more confusion. 

But then again, we shouldn't even care about breaking things with shmmax
or shmall with 0 value, it just makes no sense from a user PoV. shmmax
cannot be 0 unless there's an overflow, which voids any valid cases, and
thus shmall cannot be 0 either as it would go against any values set for
shmmax. I think it's safe to ignore this.
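
For reference, the shmall default quoted in option (i) can be reproduced
with a small userspace sketch. It assumes the uapi definition
SHMALL = SHMMAX / PAGE_SIZE * (SHMMNI / 16) with SHMMNI = 4096, and 4 KiB
pages; the function name shmall_from_shmmax is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SHMMNI 4096u /* assumed system-wide maximum number of segments */

/*
 * Derive shmall (in pages) from shmmax (in bytes), mirroring the assumed
 * SHMALL = SHMMAX / PAGE_SIZE * (SHMMNI / 16) definition.
 */
static uint64_t shmall_from_shmmax(uint64_t shmmax, uint64_t page_size)
{
	assert(page_size != 0);
	return shmmax / page_size * (MODEL_SHMMNI / 16);
}
```

Under these assumptions the current 32Mb shmmax gives 2097152 pages, and a
64-bit ULONG_MAX gives 1152921504606846720, matching the figure above. For
a 32-bit ULONG_MAX the same formula gives 268435200, slightly below the
2^28 figure quoted above.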




Re: WARNING: CPU: 0 PID: 1935 at kernel/timer.c:1621 migrate_timer_list()

2014-04-03 Thread Viresh Kumar
Thanks Fengguang,

On 4 April 2014 08:49, Fengguang Wu  wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> git://git.linaro.org/people/vireshk/linux timer-cleanup-for-tglx
>
> commit 6378cb51af5f4743db0dcb3cbcf862eac5908754
> Author: Viresh Kumar 
> AuthorDate: Thu Mar 20 14:29:02 2014 +0530
> Commit: Viresh Kumar 
> CommitDate: Wed Apr 2 14:54:57 2014 +0530
>
> timer: don't migrate pinned timers
>
> migrate_timer() is called when a CPU goes down and its timers are
> required to be migrated to some other CPU. Its the responsibility of
> the users of the timer to remove it before control reaches to
> migrate_timers().
>
> As these were the pinned timers, the best we can do is: don't migrate
> these and report to the user as well.
>
> That's all this patch does.
>
> Signed-off-by: Viresh Kumar 
>
> ===
> PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT!
> ===
> Attached one more dmesg for the NULL pointer bug in parent commit.
>
> +--------------------------------------------------------+------------+------------+------------+
> |                                                        | 5a8530b7c3 | 6378cb51af | 7caf71f403 |
> +--------------------------------------------------------+------------+------------+------------+
> | boot_successes                                         | 103        | 14         | 10         |
> | boot_failures                                          | 17         | 18         | 13         |
> | BUG:unable_to_handle_kernel_NULL_pointer_dereference   | 16         |            |            |
> | Oops:SMP                                               | 16         |            |            |
> | Kernel_panic-not_syncing:Fatal_exception               | 16         |            |            |
> | backtrace:vfs_read                                     | 16         |            |            |
> | backtrace:SyS_read                                     | 16         |            |            |
> | BUG:kernel_test_crashed                                | 1          |            |            |
> | WARNING:CPU:PID:at_kernel/timer.c:migrate_timer_list() | 0          | 17         | 12         |
> | backtrace:vfs_write                                    | 0          | 17         | 12         |
> | backtrace:SyS_write                                    | 0          | 17         | 12         |
> | BUG:kernel_early_hang_without_any_printk_output        | 0          | 1          | 1          |
> +--------------------------------------------------------+------------+------------+------------+
>
> [   74.242293] Unregister pv shared memory for cpu 1
> [   74.273280] smpboot: CPU 1 is now offline
> [   74.274685] [ cut here ]
> [   74.275524] WARNING: CPU: 0 PID: 1935 at kernel/timer.c:1621 migrate_timer_list+0xd6/0xf0()
> [   74.275524] migrate_timer_list: can't migrate pinned timer: 81f06a60, deactivating it

Hmm, nice. So, my patch hasn't created a bug, but just highlighted it.
I have added this piece of code while migrating timers away:

if (unlikely(WARN(is_pinned,
"%s: can't migrate pinned timer: %p, deactivating it\n",
__func__, timer)))

That is, all timers are migrated to other CPUs when a CPU is going down,
but obviously we can't migrate the pinned timers. It looks like we
actually were doing that before this commit and it went unnoticed.

But just due to this print, we are highlighting an existing issue here.
@Thomas: So, in a sense my patch is doing some good work now :)

Now, we need to fix the code which queued this pinned timer.
@Fengguang: As I don't have the facilities to reproduce this, can you
help me debug this? Probably just change this print message to
print the address of timer->function as well, and then we can find
the name of the routine with the help of objdump.

What we need to do finally is to remove this timer before CPU is
going down. I can fix the driver in question once we know which
driver it is.
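
The debugging idea above can be sketched in userspace as follows. This is
a model only, not the kernel change: struct model_timer,
migrate_timers_model and model_timer_cb are hypothetical names, and the
message format simply extends the one quoted above with the callback
address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a kernel timer. */
struct model_timer {
	void (*function)(unsigned long); /* callback, as in timer->function */
	int pinned;                      /* non-zero if bound to its CPU */
	int active;                      /* non-zero if currently queued */
	int cpu;                         /* CPU the timer is queued on */
};

/* Example callback so that the report has a function address to print. */
void model_timer_cb(unsigned long data)
{
	(void)data;
}

/*
 * Move every active, non-pinned timer to dest_cpu. Pinned timers are
 * reported (including the callback address) and deactivated instead.
 * Returns the number of timers migrated.
 */
static size_t migrate_timers_model(struct model_timer *timers, size_t n,
				   int dest_cpu)
{
	size_t migrated = 0;

	assert(timers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct model_timer *t = &timers[i];

		if (!t->active)
			continue;
		if (t->pinned) {
			fprintf(stderr,
				"%s: can't migrate pinned timer: %p, function: %#llx, deactivating it\n",
				__func__, (void *)t,
				(unsigned long long)(uintptr_t)t->function);
			t->active = 0;
			continue;
		}
		t->cpu = dest_cpu;
		migrated++;
	}
	return migrated;
}
```

Once the callback address appears in the log, it can be looked up in the
symbol table of the matching vmlinux to identify the code that queued the
pinned timer.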

--
viresh


Re: [PATCH] virtio-blk: make the queue depth the max supportable by the hypervisor

2014-04-03 Thread Rusty Russell
Stefan Hajnoczi  writes:
> On Tue, Apr 1, 2014 at 4:27 AM, Theodore Ts'o  wrote:
>> On Mon, Mar 31, 2014 at 02:22:50PM +1030, Rusty Russell wrote:
>>>
>>> It's head of my virtio-next tree.
>>
>> Hey Rusty,
>>
>> While we have your attention --- what's your opinion about adding TRIM
>> support to virtio-blk.  I understand that you're starting an OASIS
>> standardization process for virtio --- what does that mean vis-a-vis a
>> patch to plumb discard support through virtio-blk?
>
> virtio-scsi already supports discard.  But maybe you cannot switch
> away from virtio-blk?
>
> If you need to add discard to virtio-blk then it could be added to the
> standard.  The standards text is kept in a svn repository here:
> https://tools.oasis-open.org/version-control/browse/wsvn/virtio/

It would be trivial to add, and I wouldn't be completely opposed, but we
generally point to virtio-scsi when people want actual features.

Cheers,
Rusty.


Re: [hrtimer] BUG: unable to handle kernel NULL pointer dereference at 00000010

2014-04-03 Thread Viresh Kumar
On 4 April 2014 08:45, Fengguang Wu  wrote:
> [2.258025] BUG: unable to handle kernel NULL pointer dereference at 0010
> [2.258641] IP: [] hrtimer_force_reprogram+0x3d/0xb1
> [2.259151] *pde = 
> [2.259412] Oops:  [#1] DEBUG_PAGEALLOC
> [2.259786] CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.0-rc1-00080-g336ca19 #9
> [2.260388] task: bdde5534 ti: bddda000 task.ti: bddda000
> [2.260830] EIP: 0060:[] EFLAGS: 00210002 CPU: 0
> [2.261278] EIP is at hrtimer_force_reprogram+0x3d/0xb1

Hi,

Thanks, this is already fixed in my latest branch. It was pushed yesterday
probably with all fixes :) ..


RE: [PATCH] staging: fpgaboot: clean up Makefile

2014-04-03 Thread Insop Song

On Thursday, April 03, 2014 10:56 AM, Dan Carpenter wrote:
> 
> Signed-off-by is like signing a legal document, to show you haven't violated
> copyright law or anything while the patch was in your hands.
> You should use Acked-by or Reviewed-by depending on what you mean.
> 

Dan,

Thank you for pointing that out.
This was for reviewing, so I would put this instead, then.

Reviewed-by: Insop Song 

Regards,

ISS


[PATCH v3 2/3] pstore: add seq_ops for norm zone

2014-04-03 Thread Liu ShuoX
Some developers want more flexible output for the pstore record trace.
Add a seq_ops pointer to ramoops_zone so that users can supply their own
output format.

Signed-off-by: Zhang Yanmin 
Signed-off-by: Liu ShuoX 
---
 fs/pstore/inode.c  | 10 --
 include/linux/pstore_ramoops.h |  1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index d463481..a9c9782 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -207,14 +207,20 @@ static ssize_t pstore_file_read(struct file *file, char __user *userbuf,
 static int pstore_file_open(struct inode *inode, struct file *file)
 {
struct pstore_private *ps = inode->i_private;
+   struct ramoops_context *cxt = ps->psi->data;
+   struct ramoops_zone*zones = cxt ? cxt->zones : NULL;
struct seq_file *sf;
int err;
const struct seq_operations *sops = NULL;
 
if (ps->type == PSTORE_TYPE_FTRACE)
sops = &pstore_ftrace_seq_ops;
-   if (ps->type == PSTORE_TYPE_NORM)
-   sops = &pstore_seq_ops;
+   if (ps->type == PSTORE_TYPE_NORM && zones) {
+   if (zones[ps->id].seq_ops)
+   sops = zones[ps->id].seq_ops;
+   else
+   sops = &pstore_seq_ops;
+   }
 
err = seq_open(file, sops);
if (err < 0)
diff --git a/include/linux/pstore_ramoops.h b/include/linux/pstore_ramoops.h
index 423aacc..f021a94 100644
--- a/include/linux/pstore_ramoops.h
+++ b/include/linux/pstore_ramoops.h
@@ -33,6 +33,7 @@ struct ramoops_zone {
int item_size;
void (*print_record)(struct seq_file *s, void *record);
void *(*get_new_record)(struct persistent_ram_zone *prz);
+   const struct seq_operations *seq_ops;
 };
 
 /*
-- 
1.8.3.2
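
The selection logic added to pstore_file_open() can be modelled in
userspace roughly as below. This is a sketch under stated assumptions:
model_seq_ops, model_zone and pick_seq_ops are hypothetical names, and,
unlike the patch, the sketch bounds-checks the zone index and falls back
to the default operations when no zone table is present.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct seq_operations and struct ramoops_zone. */
struct model_seq_ops {
	const char *name;
};

struct model_zone {
	const struct model_seq_ops *seq_ops; /* NULL means "use the default" */
};

/*
 * Return the zone's private seq_ops if it has one, otherwise the fallback.
 * The bounds check and the NULL-table fallback are additions for
 * illustration only.
 */
static const struct model_seq_ops *
pick_seq_ops(const struct model_zone *zones, size_t nzones, size_t id,
	     const struct model_seq_ops *fallback)
{
	assert(fallback != NULL);
	if (zones == NULL || id >= nzones)
		return fallback;
	return zones[id].seq_ops ? zones[id].seq_ops : fallback;
}
```

A zone that sets seq_ops gets its own output format; every other zone
keeps the default one.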



[PATCH v3 3/3] pstore: support current records dump in ramoops

2014-04-03 Thread Liu ShuoX
Dumping the records at runtime is sometimes useful: it lets us check the
records and understand the status of the driver and the device.

Signed-off-by: Zhang Yanmin 
Signed-off-by: Liu ShuoX 
---
 fs/pstore/inode.c  | 39 +++
 fs/pstore/internal.h   |  3 ++-
 fs/pstore/platform.c   | 39 ++-
 fs/pstore/ram.c| 18 ++
 fs/pstore/ram_core.c   | 10 ++
 include/linux/pstore.h |  2 ++
 include/linux/pstore_ram.h |  2 ++
 7 files changed, 95 insertions(+), 18 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index a9c9782..a3b817c15df 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -48,10 +48,11 @@ struct pstore_private {
struct list_head list;
struct pstore_info *psi;
enum pstore_type_id type;
+   int curr;
u64 id;
int count;
ssize_t size;
-   chardata[];
+   char*data;
 };
 
 struct pstore_seq_data {
@@ -210,16 +211,27 @@ static int pstore_file_open(struct inode *inode, struct file *file)
struct ramoops_context *cxt = ps->psi->data;
struct ramoops_zone*zones = cxt ? cxt->zones : NULL;
struct seq_file *sf;
+   char *buf = NULL;
int err;
+   u64 id = ps->id;
const struct seq_operations *sops = NULL;
 
if (ps->type == PSTORE_TYPE_FTRACE)
sops = &pstore_ftrace_seq_ops;
if (ps->type == PSTORE_TYPE_NORM && zones) {
-   if (zones[ps->id].seq_ops)
-   sops = zones[ps->id].seq_ops;
+   if (zones[id].seq_ops)
+   sops = zones[id].seq_ops;
else
sops = &pstore_seq_ops;
+   if (ps->curr) {
+   /*
+* Update size again as current buffer
+* size might be changed.
+*/
+   inode->i_size = ps->size =
+   ps->psi->read_curr(&id, PSTORE_TYPE_NORM,
+   &buf, ps->psi);
+   }
}
 
err = seq_open(file, sops);
@@ -256,12 +268,16 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
 {
struct pstore_private *p = dentry->d_inode->i_private;
 
+   if (p->curr)
+   goto unlink;
if (p->psi->erase)
p->psi->erase(p->type, p->id, p->count,
  dentry->d_inode->i_ctime, p->psi);
else
return -EPERM;
 
+   kfree(p->data);
+unlink:
return simple_unlink(dir, dentry);
 }
 
@@ -358,7 +374,7 @@ int pstore_is_mounted(void)
  */
 int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
  char *data, bool compressed, size_t size,
- struct timespec time, struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi, bool curr)
 {
struct dentry   *root = pstore_sb->s_root;
struct dentry   *dentry;
@@ -374,14 +390,15 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
list_for_each_entry(pos, &allpstore, list) {
if (pos->type == type &&
pos->id == id &&
-   pos->psi == psi) {
+   pos->psi == psi &&
+   pos->curr == curr) {
rc = -EEXIST;
break;
}
}
spin_unlock_irqrestore(&allpstore_lock, flags);
if (rc)
-   return rc;
+   goto fail;
 
rc = -ENOMEM;
inode = pstore_get_inode(pstore_sb);
@@ -389,13 +406,15 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
goto fail;
inode->i_mode = S_IFREG | 0444;
inode->i_fop = &pstore_file_operations;
-   private = kmalloc(sizeof *private + size, GFP_KERNEL);
+   private = kmalloc(sizeof(*private), GFP_KERNEL);
if (!private)
goto fail_alloc;
private->type = type;
private->id = id;
private->count = count;
private->psi = psi;
+   private->curr = curr;
+   private->data = data;
 
switch (type) {
case PSTORE_TYPE_DMESG:
@@ -434,13 +453,15 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
break;
}
 
+   if (curr)
+   strcat(name, "_cur");
+
mutex_lock(&root->d_inode->i_mutex);
 
dentry = d_alloc_name(root, name);
if (!dentry)
goto fail_lockedalloc;
 
-   memcpy(private->data, data, size);
inode->i_size = private->size = size;
 
inode->i_private = private;
@@ -465,6 +486,7 @@ fail_alloc:
iput(inode);
 
 fail:
+   kfree(data);
return rc;
 }
 

[PATCH v3 1/3] pstore: restructure ramoops to support more trace

2014-04-03 Thread Liu ShuoX
From: Zhang Yanmin 

This patch restructures the pstore ramoops code a little to support more
user-defined tracers through ramoops. Here is the reason for enhancing
ramoops: pstore ramoops is a very important debug feature for mobile
development. At present, ramoops supports the kdump, console and ftrace
tracers. Sometimes we need special tracers, for example one that records
commands and data when a driver sends or receives. But at the moment it is
hard to add new tracers to ramoops without touching the pstore core code.
So we restructure ramoops to make it more flexible and easier to extend.

With this, the pstore code is separated from new tracers that are based on
ramoops. A developer can add a new standalone tracer based on ramoops, and
pstore detects it automatically.

Signed-off-by: Zhang Yanmin 
Signed-off-by: Liu ShuoX 
---
 Documentation/ramoops.txt |  70 +-
 arch/x86/kernel/vmlinux.lds.S |   9 +++
 drivers/platform/chrome/chromeos_pstore.c |   2 +-
 fs/pstore/inode.c |  85 +-
 fs/pstore/internal.h  |   1 +
 fs/pstore/platform.c  |  38 ++
 fs/pstore/ram.c   | 114 ++
 fs/pstore/ram_core.c  |  20 ++
 include/linux/pstore.h|   2 +
 include/linux/pstore_ram.h|  21 ++
 include/linux/pstore_ramoops.h|  77 
 11 files changed, 390 insertions(+), 49 deletions(-)
 create mode 100644 include/linux/pstore_ramoops.h

diff --git a/Documentation/ramoops.txt b/Documentation/ramoops.txt
index 69b3cac..b6e8c2a 100644
--- a/Documentation/ramoops.txt
+++ b/Documentation/ramoops.txt
@@ -49,7 +49,7 @@ Setting the ramoops parameters can be done in 2 different manners:
  2. Use a platform device and set the platform data. The parameters can then
  be set through that platform data. An example of doing that is:
 
-#include 
+#include 
 [...]
 
 static struct ramoops_platform_data ramoops_data = {
@@ -117,3 +117,71 @@ file. Here is an example of usage:
  0 811d9c54  8101a7a0  __const_udelay <- native_machine_emergency_restart+0x110/0x1e0
  0 811d9c34  811d9c80  __delay <- __const_udelay+0x30/0x40
  0 811d9d14  811d9c3f  delay_tsc <- __delay+0xf/0x20
+
+6. Persistent record tracing
+
+Persistent record tracing might be useful for debugging software or hardware
+related hangs. It is flexible: it allows a developer to trace a self-defined
+record structure at a self-defined tracepoint. After reboot, the record log is
+stored in a "NAME-ramoops" file. Here is an example of usage:
+
+#include 
+[...]
+
+struct norm_zone_test_record {
+   unsigned long val;
+   char str[32];
+};
+
+static void print_record(struct seq_file *s, void *rec)
+{
+   struct norm_zone_test_record *record = rec;
+   seq_printf(s, "%s: %ld\n",
+   record->str, record->val);
+}
+
+DEFINE_PSTORE_RAMZONE(test_zone) = {
+   .size = 4096,
+   .name = "test_zone",
+   .item_size = sizeof(struct norm_zone_test_record),
+   .print_record = print_record,
+};
+
+static void add_test_record(char *str, unsigned long val)
+{
+   struct norm_zone_test_record *record;
+   record = ramoops_get_new_record(test_zone.prz);
+   if (record) {
+   record->val = val;
+   strcpy(record->str, str);
+   }
+}
+
+static int test_cpufreq_transition(struct notifier_block *nb,
+   unsigned long event, void *data)
+{
+   add_test_record("cpufreq transition", event);
+   return 0;
+}
+
+static struct notifier_block freq_transition = {
+   .notifier_call = test_cpufreq_transition,
+};
+
+static int __init norm_zone_test_init(void)
+{
+   cpufreq_register_notifier(&freq_transition,
+   CPUFREQ_TRANSITION_NOTIFIER);
+   return 0;
+}
+module_init(norm_zone_test_init);
+
+Record tracing uses the memory reserved by ramoops. For best compatibility,
+users can use the methods in Chapter 2 to define the ramoops parameters, then
+enlarge the defined mem_size with pstore_norm_zones_size(), e.g.:
+
+#include 
+#include 
+
+memblock_reserve(ramoops_data.mem_address, ramoops_data.mem_size +
+   pstore_norm_zones_size(NULL));
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 49edf2d..2422622 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -304,6 +304,15 @@ SECTIONS
NOSAVE_DATA
}
 #endif
+#ifdef CONFIG_PSTORE
+   /* ramoops zone */
+   . = ALIGN(8);
+   .ram_zone : AT(ADDR(.ram_zone) - LOAD_OFFSET) {
+   __ramoops_zone_start = .;
+   *(.ram_zone)
+   __ramoops_zone_end = .;
+   }
+#endif
 
/* BSS */
. = ALIGN(PAGE_SIZE);
diff --git a/drivers/platform/chrome/chromeos_pstore.c 
b/drivers/platform/chro

[PATCH v3 0/3] Add a method to expand tracers for pstore easily

2014-04-03 Thread Liu ShuoX
Hi,
Here is v3 of this series.
Changelog v3:
  1) Fix compiling errors when CONFIG_PSTORE_RAM=m.

Changelog v2:
  1) Fix compiling errors when CONFIG_PSTORE_RAM is disabled.
  2) Add some protection in the code in case we disable CONFIG_PSTORE_RAM.

---
Liu ShuoX (2):
  pstore: add seq_ops for norm zone
  pstore: support current records dump in ramoops

Zhang Yanmin (1):
  pstore: restructure ramoops to support more trace

 Documentation/ramoops.txt |  70 +++-
 arch/x86/kernel/vmlinux.lds.S |   9 ++
 drivers/platform/chrome/chromeos_pstore.c |   2 +-
 fs/pstore/inode.c | 126 ++--
 fs/pstore/internal.h  |   4 +-
 fs/pstore/platform.c  |  77 +++--
 fs/pstore/ram.c   | 132 +++---
 fs/pstore/ram_core.c  |  30 +++
 include/linux/pstore.h|   4 +
 include/linux/pstore_ram.h|  23 ++
 include/linux/pstore_ramoops.h|  78 ++
 11 files changed, 490 insertions(+), 65 deletions(-)
 create mode 100644 include/linux/pstore_ramoops.h

-- 
1.8.3.2



Re: Xen 32-bit PV regression

2014-04-03 Thread Boris Ostrovsky

On 04/03/2014 11:23 PM, Boris Ostrovsky wrote:

Steven,

Looks like commit 198d208df (x86: Keep thread_info on thread stack in 
x86_32) broke Xen's 32-bit PV guests.


I poked a little at it and it seems that at least the ifdef in 
xen_cpu_up() needs to be adjusted to set up kernel_stack --- that 
allows CPUs to get going. This is not enough though (not particularly 
surprisingly) and we die a little later with #GPF in xen_iret.



I should have probably included some output.


...
[1.277533] Freeing unused kernel memory: 780K (c18fe000 - c19c1000)
[1.280041] Write protecting the kernel text: 6120k
[1.281477] Write protecting the kernel read-only data: 2472k
[1.282177] NX-protecting the kernel data: 4120k
[1.304957] general protection fault:  [#1] SMP
[1.305866] Modules linked in:
[1.305866] CPU: 0 PID: 1 Comm: init Not tainted 3.14.0-32b #29
[1.305866] task: eb88 ti: eb84c000 task.ti: eb84c000
[1.305866] EIP: e019:[] EFLAGS: 00010046 CPU: 0
[1.305866] EIP is at xen_iret+0xb/0x29
[1.305866] EAX:  EBX:  ECX:  EDX: 
[1.305866] ESI:  EDI:  EBP:  ESP: eb84dfe0
[1.305866]  DS: 007b ES: 007b FS:  GS:  SS: e021
[1.305866] CR0: 8005003b CR2: bf8e0fe0 CR3: 2ac52000 CR4: 00040660
[1.305866] Stack:
[1.305866]   b760d020 0073 0200 bf8e0fe0 007b 
c2c2c2c2 c2c2c2c2

[1.305866] Call Trace:
[1.305866] Code: a1 d8 9f 9b c1 2d ec 1f 00 00 36 8b 40 10 36 8b 04 
85 e0 74 8f c1 36 8b 80 c0 90 9b c1 eb 2a 90 f7 44 24 08 00 00 02 80 75 
3d 50 <64> a1 d8 9f 9b c1 2d ec 1f 00 00 36 8b 40 10 36 8b 04 85 e0 74

[1.305866] EIP: [] xen_iret+0xb/0x29 SS:ESP e021:eb84dfe0
[1.305866] ---[ end trace e4f42ac1798ac467 ]---
[1.369739] Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x000b





Re: [GIT PULL] ext4 changes for 3.15

2014-04-03 Thread Theodore Ts'o
On Thu, Apr 03, 2014 at 12:39:42PM -0700, Linus Torvalds wrote:
> Btw, since I'm planning on getting to the filesystem pulls later today
> (or perhaps tomorrow), I wanted to check: are you ok with the ext4
> parts of the cross-rename patches from Miklos?
> 
> They are currently at
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git cross-rename
> 
> in case you want to refresh your memory.

I've pulled in the cross-rename branch to my test branch and run a set
of tests, and it passes.  Unfortunately I don't believe Miklos
contributed tests for renameat(2) to xfstests, so we don't have any
on-point testing of renameat() and cross-rename, but it's at least not
triggering any failures on the existing tests.

I've also reviewed the patches again, so:

Acked-by: "Theodore Ts'o" 

- Ted


Re: [PATCH v2 1/2] SPI: Add driver for Cadence SPI controller

2014-04-03 Thread Harini Katakam
Hi Mark

On Fri, Apr 4, 2014 at 3:13 AM, Mark Brown  wrote:
> On Thu, Apr 03, 2014 at 04:40:30PM +0530, Harini Katakam wrote:
>> Add driver for Cadence SPI controller. This is used in Xilinx Zynq.
>
> I just reviewed a driver for "Zynq Quad SPI controller" from Punnaiah
> Choudary Kalluri (CCed) which seems *very* similar to this one.  Are
> there opportunities for code sharing here (I'm not entirely sure the
> hardware blocks are different, though I didn't check in detail).
>

Thanks for the review.

QSPI is a Xilinx IP built on top of cadence SPI  with
considerable functional changes.
As explained in the QSPI patch, there are three configurations
QSPI supports :

- A single flash device connected with 1 CS and 4 IO lines
- Two flash devices connected over two separate sets of 4 IO lines
  and two CS lines which are driven together.
- Two flash devices connected with two separate CS line and one
  common set of 4 IO lines.

This first set of QSPI patches is only for the single flash configuration.
As the next two configurations follow, QSPI driver will differ from SPI
even more. That's why it might be better to have two separate drivers.
It will avoid a lot of "if spi/ if qspi" checks.

I will send an RFC with proposed changes for all QSPI configurations.

Also, I've replied to your comments on the QSPI driver.
(The QSPI driver already addresses the comments for SPI v1)
Except in two places where the comment was only applicable to QSPI driver,
the replies hold good for both SPI and QSPI drivers.
If you would like to continue the discussion on that thread, I'm ok with it.
FYI, I'll be sending the next versions for both drivers after further
discussion concludes.

Regards,
Harini


Re: [PATCH] power, sched: stop updating inside arch_update_cpu_topology() when nothing to be update

2014-04-03 Thread Michael wang
Hi, Srivatsa

Thanks for your reply :)

On 04/03/2014 04:50 PM, Srivatsa S. Bhat wrote:
[snip]
> 
> Now, the interesting thing to note here is that, if CPU0's node was already
> set as node0, *nothing* should go wrong, since its just a redundant update.
> However, if CPU0's original node mapping was something different, or if
> node0 doesn't even exist in the machine, then the system can crash.

By adding printk I confirmed that all CPUs belonged to node 1 at the very
beginning, and that things started to go wrong after the incorrect update...

> 
> Have you verified that CPU0's node mapping is different from node 0?
> That is, boot the kernel with "numa=debug" in the kernel command line and
> it will print out the cpu-to-node associativity during boot. That way you
> can figure out what was the original associativity that was set. This will
> confirm the theory that the hypervisor sent a redundant update, but because
> of the weird pre-allocation using kzalloc that we do inside
> arch_update_cpu_topology(), we wrongly updated CPU0's mapping as CPU0 <-> 
> Node0.

The associativity must have changed, otherwise we wouldn't continue with
the update, and an empty updates[] was confirmed to show up inside
arch_update_cpu_topology().

What I can't be sure of is whether this is legal; being notified of changes
when nothing actually changed sounds weird... However, even if it is legal,
a check here still makes sense IMHO.

Regards,
Michael Wang

> 
> 
> Regards,
> Srivatsa S. Bhat
> 
>> Thus we should stop the updating in such cases, this patch will achieve
>> this and fix the issue.
>>
>> CC: Benjamin Herrenschmidt 
>> CC: Paul Mackerras 
>> CC: Nathan Fontenot 
>> CC: Stephen Rothwell 
>> CC: Andrew Morton 
>> CC: Robert Jennings 
>> CC: Jesse Larrew 
>> CC: "Srivatsa S. Bhat" 
>> CC: Alistair Popple 
>> Signed-off-by: Michael Wang 
>> ---
>>  arch/powerpc/mm/numa.c |9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index 30a42e2..6757690 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -1591,6 +1591,14 @@ int arch_update_cpu_topology(void)
>>  cpu = cpu_last_thread_sibling(cpu);
>>  }
>>
>> +/*
>> + * The 'cpu_associativity_changes_mask' could be cleared if
>> + * all the cpus it indicates won't change their node, in
>> + * which case the 'updated_cpus' will be empty.
>> + */
>> +if (!cpumask_weight(&updated_cpus))
>> +goto out;
>> +
>>  stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
>>
>>  /*
>> @@ -1612,6 +1620,7 @@ int arch_update_cpu_topology(void)
>>  changed = 1;
>>  }
>>
>> +out:
>>  kfree(updates);
>>  return changed;
>>  }
>>
> 



[PATCH] KVM: x86: Fix page-tables reserved bits

2014-04-03 Thread Nadav Amit
KVM does not handle the reserved bits of x86 page tables correctly:
In PAE, bits 5:8 are reserved in the PDPTE.
In IA-32e, bit 8 is not reserved.

Signed-off-by: Nadav Amit 
---
 arch/x86/kvm/mmu.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f5704d9..3993976 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3538,7 +3538,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
case PT32E_ROOT_LEVEL:
context->rsvd_bits_mask[0][2] =
rsvd_bits(maxphyaddr, 63) |
-   rsvd_bits(7, 8) | rsvd_bits(1, 2);  /* PDPTE */
+   rsvd_bits(5, 8) | rsvd_bits(1, 2);  /* PDPTE */
context->rsvd_bits_mask[0][1] = exb_bit_rsvd |
rsvd_bits(maxphyaddr, 62);  /* PDE */
context->rsvd_bits_mask[0][0] = exb_bit_rsvd |
@@ -3550,9 +3550,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
break;
case PT64_ROOT_LEVEL:
context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
-   rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
+   rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
-   rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
+   rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
context->rsvd_bits_mask[0][1] = exb_bit_rsvd |
rsvd_bits(maxphyaddr, 51);
context->rsvd_bits_mask[0][0] = exb_bit_rsvd |
-- 
1.7.10.4
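
To make the mask changes concrete, here is a small userspace sketch. It
assumes rsvd_bits(s, e) builds a mask with bits s..e (inclusive) set, which
is how the helper is used in the patch; pae_pdpte_has_rsvd() is a
hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with bits s..e (inclusive) set, in the style of rsvd_bits(). */
static uint64_t rsvd_bits(int s, int e)
{
	/* Keep the shift count below 64 so the expression is well defined. */
	assert(s >= 0 && s <= e && e <= 63 && (e - s) < 63);
	return ((1ULL << (e - s + 1)) - 1) << s;
}

/*
 * Hypothetical check: does a PAE PDPTE set any bit that is reserved under
 * the mask used after the patch (bits maxphyaddr..63, 5..8 and 1..2)?
 */
static int pae_pdpte_has_rsvd(uint64_t pdpte, int maxphyaddr)
{
	uint64_t mask = rsvd_bits(maxphyaddr, 63) |
			rsvd_bits(5, 8) | rsvd_bits(1, 2);

	return (pdpte & mask) != 0;
}
```

Under this definition rsvd_bits(7, 8) is 0x180 and rsvd_bits(5, 8) is 0x1e0,
so the PAE change additionally flags bits 5 and 6; rsvd_bits(7, 7) is 0x80,
so the IA-32e change stops flagging bit 8.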



Re: [PATCH v9 1/1] Tracepoint: register/unregister struct tracepoint

2014-04-03 Thread Mathieu Desnoyers
- Original Message -
> From: "Steven Rostedt" 
> To: "Mathieu Desnoyers" 
> Cc: linux-kernel@vger.kernel.org, "Ingo Molnar", "Frederic Weisbecker",
> "Andrew Morton", "Frank Ch. Eigler", "Johannes Berg"
> Sent: Thursday, April 3, 2014 2:54:41 PM
> Subject: Re: [PATCH v9 1/1] Tracepoint: register/unregister struct tracepoint
> 
> On Thu, 3 Apr 2014 17:49:54 + (UTC)
> Mathieu Desnoyers  wrote:
> 
> 
> > So my current thinking is that the pre-existing code was erroneously
> > enabling tracepoints with the name of every event enabled (including
> > e.g. function tracer, kprobes, etc). It was not failing because
> > tracepoint.c silently accepted to enable tracepoints were not loaded
> > yet.
> > 
> 
> If that was true, than wouldn't the error code I added have returned an
> error?

Good point.

I found the culprit:

[0.560002] event_trace_enable: �GT� call 81613930 (core)

  
[0.564001] event_trace_enable: �GTȁ call 816139c0 (core) 

compudj@ok:~/git/rostedt/linux-trace$ objdump -t vmlinux |grep 81613930
81613930 l O .data  0090 event_sys_exit
compudj@ok:~/git/rostedt/linux-trace$ objdump -t vmlinux |grep 816139c0
816139c0 l O .data  0090 event_sys_enter

I'll look into those two sites tomorrow morning.

Thanks,

Mathieu

> 
> -- Steve
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [PATCH 2/2] SPI: Add support for Zynq Quad SPI controller

2014-04-03 Thread Harini Katakam
Hi Mark,

On Fri, Apr 4, 2014 at 2:59 AM, Mark Brown  wrote:
> On Thu, Apr 03, 2014 at 10:33:07PM +0530, Punnaiah Choudary Kalluri wrote:
>
> Overall this looks fairly good, there are a few issues that need to be
> looked at but they're not too invasive.  Please also check for coding
> style issues, quite a few spaces before commas for example.
>

Thanks. I'll check that.



>> +/**
>> + * zynq_qspi_copy_read_data - Copy data to RX buffer
>> + * @xqspi:   Pointer to the zynq_qspi structure
>> + * @data:The 32 bit variable where data is stored
>> + * @size:Number of bytes to be copied from data to RX buffer
>> + */
>> +static void zynq_qspi_copy_read_data(struct zynq_qspi *xqspi, u32 data, u8 
>> size)
>> +{
>> + if (xqspi->rxbuf) {
>> + memcpy(xqspi->rxbuf, ((u8 *) &data) + 4 - size, size);
>> + xqspi->rxbuf += size;
>> + }
>> + xqspi->bytes_to_receive -= size;
>> +}
>
> Does this and the write function really need to be a separate function -
> it's trivial and used once?  It's probably more beneficial to split out
> some of the more complex logic later on that's causing the indentation
> to get too deep.
>

I'm aware it's used in only one place, but it does make receive data handling
easier in the future. As you may have noticed, there are 4 different ways to
write into the transmit FIFO, and the data read also differs accordingly.
I'll try to reduce the indentation in other places.



>> +static int zynq_qspi_setup_transfer(struct spi_device *qspi,
>> + struct spi_transfer *transfer)
>> +{
>> + struct zynq_qspi *xqspi = spi_master_get_devdata(qspi->master);
>> + u32 config_reg, req_hz, baud_rate_val = 0;
>> +
>> + if (transfer)
>> + req_hz = transfer->speed_hz;
>> + else
>> + req_hz = qspi->max_speed_hz;
>
> Why would a transfer be being set up without a transfer being provided?
>

The setup function calls this function before a transfer is initiated.
In this case NULL is passed to setup_transfer (see below) and
SPI is initialized with the default clock configuration.
This initialization is necessary because otherwise this clock config
would be done only after SPI is enabled in prepare_hardware, which is wrong.
(I'm checking for master->busy in setup to address your previous
comment on SPI.)

I explained the same in the SPI v2 changes, and it is valid there too.

>> +/**
>> + * zynq_qspi_setup - Configure the QSPI controller
>> + * @qspi:Pointer to the spi_device structure
>> + *
>> + * Sets the operational mode of QSPI controller for the next QSPI transfer, 
>> baud
>> + * rate and divisor value to setup the requested qspi clock.
>> + *
>> + * Return:   0 on success and error value on failure
>> + */
>> +static int zynq_qspi_setup(struct spi_device *qspi)
>> +{
>> + if (qspi->master->busy)
>> + return -EBUSY;
>> +
>> + return zynq_qspi_setup_transfer(qspi, NULL);
>> +}
>
> No, this is broken - you have to support setup() while the hardware is
> active.  Just remove this if there's nothing to do and set up on the
> transfer.

But where do you suggest this clock configuration be done?
I've looked at the option of doing it in prepare_hardware but
spi_device structure is not passed to it.



>
>> + if (xqspi->rxbuf) {
>> + (*(u32 *)xqspi->rxbuf) =
>> + zynq_qspi_read(xqspi,
>> +ZYNQ_QSPI_RXD_OFFSET);
>> + xqspi->rxbuf += 4;
>
> This only works in 4 byte words?  That seems a bit limited.
> Alternatively, if it works with smaller sizes (as it appears to) then
> isn't this at risk of overflowing buffers?
>

There is an
if (xqspi->bytes_to_receive < 4) {
above, and this statement is in the else branch.
When fewer than 4 bytes are being read/received, the handling is different.
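
The split described here can be sketched in user space as follows. This is only an illustration under stated assumptions: struct rx_state and rx_consume_word are hypothetical names, the FIFO read is replaced by a plain word argument, and the short-read path mirrors the pointer arithmetic of the quoted zynq_qspi_copy_read_data().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical receive state; only the two fields used here are modeled. */
struct rx_state {
	uint8_t *rxbuf;            /* next free byte in the RX buffer, or NULL */
	size_t bytes_to_receive;   /* bytes still expected */
};

/*
 * Consume one 32-bit FIFO word. A full word is copied when at least four
 * bytes are outstanding; otherwise only the trailing bytes of the word
 * are copied, so the buffer is never written past its end.
 */
static void rx_consume_word(struct rx_state *st, uint32_t word)
{
	size_t n = st->bytes_to_receive < 4 ? st->bytes_to_receive : 4;

	assert(n > 0);
	if (st->rxbuf) {
		memcpy(st->rxbuf, (const uint8_t *)&word + 4 - n, n);
		st->rxbuf += n;
	}
	st->bytes_to_receive -= n;
}
```

Which bytes of the word count as "trailing" depends on the byte order of the machine, so the sketch says nothing about how the controller packs short reads.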

>> +static int __maybe_unused zynq_qspi_suspend(struct device *_dev)
>> +{
>> + struct platform_device *pdev = container_of(_dev,
>> + struct platform_device, dev);
>> + struct spi_master *master = platform_get_drvdata(pdev);
>> +
>> + spi_master_suspend(master);
>> +
>> + zynq_unprepare_transfer_hardware(master);
>
> Why are you unpreparing the hardware - the framework should be doing
> that for you if the device is active, if it's not you've got an extra
> clock disable here?
>

I called unprepare_hardware because it does the things necessary
after master suspend - disable clock and controller.
(I thought this was your suggestion for SPI?)

>> +static int __maybe_unused zynq_qspi_resume(struct device *dev)
>
> This doesn't appear to be calling init_hw() - is it guaranteed that all
> the register settings written there are OK after power on?
>
>> + ret = clk_prepare_enable(xqspi->aperclk);
>> + if (ret) {
>> + dev_err(&pdev->dev, "Unable to enable APER clock.\n");
>> + go

[PATCH v2 0/8] DMA: Freescale: driver cleanups and enhancements

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

Hi Vinod Koul,
Please have a look at the v2 patch set.

v1 -> v2 change:
The only change is a new patch [1/8] that removes the unnecessary macro
FSL_DMA_LD_DEBUG, so the series now has 8 patches (was 7).

Hongbo Zhang (8):
  DMA: Freescale: remove the unnecessary FSL_DMA_LD_DEBUG
  DMA: Freescale: unify register access methods
  DMA: Freescale: remove attribute DMA_INTERRUPT of dmaengine
  DMA: Freescale: add fsl_dma_free_descriptor() to reduce code
duplication
  DMA: Freescale: move functions to avoid forward declarations
  DMA: Freescale: change descriptor release process for supporting
async_tx
  DMA: Freescale: use spin_lock_bh instead of spin_lock_irqsave
  DMA: Freescale: add suspend resume functions for DMA driver

 drivers/dma/fsldma.c |  590 --
 drivers/dma/fsldma.h |   33 ++-
 2 files changed, 408 insertions(+), 215 deletions(-)

-- 
1.7.9.5





[PATCH v2 4/8] DMA: Freescale: add fsl_dma_free_descriptor() to reduce code duplication

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

There are several places where descriptors are freed using identical code.
This patch puts this code into a function to reduce code duplication.

Signed-off-by: Hongbo Zhang 
Signed-off-by: Qiang Liu 
---
 drivers/dma/fsldma.c |   30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index b71cc04..b5a0ffa 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -418,6 +418,19 @@ static dma_cookie_t fsl_dma_tx_submit(struct 
dma_async_tx_descriptor *tx)
 }
 
 /**
+ * fsl_dma_free_descriptor - Free descriptor from channel's DMA pool.
+ * @chan : Freescale DMA channel
+ * @desc: descriptor to be freed
+ */
+static void fsl_dma_free_descriptor(struct fsldma_chan *chan,
+   struct fsl_desc_sw *desc)
+{
+   list_del(&desc->node);
+   chan_dbg(chan, "LD %p free\n", desc);
+   dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
+}
+
+/**
  * fsl_dma_alloc_descriptor - Allocate descriptor from channel's DMA pool.
  * @chan : Freescale DMA channel
  *
@@ -489,11 +502,8 @@ static void fsldma_free_desc_list(struct fsldma_chan *chan,
 {
struct fsl_desc_sw *desc, *_desc;
 
-   list_for_each_entry_safe(desc, _desc, list, node) {
-   list_del(&desc->node);
-   chan_dbg(chan, "LD %p free\n", desc);
-   dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
-   }
+   list_for_each_entry_safe(desc, _desc, list, node)
+   fsl_dma_free_descriptor(chan, desc);
 }
 
 static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
@@ -501,11 +511,8 @@ static void fsldma_free_desc_list_reverse(struct 
fsldma_chan *chan,
 {
struct fsl_desc_sw *desc, *_desc;
 
-   list_for_each_entry_safe_reverse(desc, _desc, list, node) {
-   list_del(&desc->node);
-   chan_dbg(chan, "LD %p free\n", desc);
-   dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
-   }
+   list_for_each_entry_safe_reverse(desc, _desc, list, node)
+   fsl_dma_free_descriptor(chan, desc);
 }
 
 /**
@@ -819,8 +826,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan 
*chan,
dma_run_dependencies(txd);
 
dma_descriptor_unmap(txd);
-   chan_dbg(chan, "LD %p free\n", desc);
-   dma_pool_free(chan->desc_pool, desc, txd->phys);
+   fsl_dma_free_descriptor(chan, desc);
 }
 
 /**
-- 
1.7.9.5





[PATCH v2 7/8] DMA: Freescale: use spin_lock_bh instead of spin_lock_irqsave

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

The usage of spin_lock_irqsave() is a stronger locking mechanism than is
required throughout the driver. The minimum locking required should be used
instead. With irqsave, interrupts are turned off and their state is saved,
which is unnecessary here.

This patch changes all instances of spin_lock_irqsave() to spin_lock_bh(). All
manipulation of protected fields is done using tasklet context or weaker, which
makes spin_lock_bh() the correct choice.

Signed-off-by: Hongbo Zhang 
Signed-off-by: Qiang Liu 
---
 drivers/dma/fsldma.c |   25 ++---
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index f8eee60..c9bf54a 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -396,10 +396,9 @@ static dma_cookie_t fsl_dma_tx_submit(struct 
dma_async_tx_descriptor *tx)
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
struct fsl_desc_sw *desc = tx_to_fsl_desc(tx);
struct fsl_desc_sw *child;
-   unsigned long flags;
dma_cookie_t cookie = -EINVAL;
 
-   spin_lock_irqsave(&chan->desc_lock, flags);
+   spin_lock_bh(&chan->desc_lock);
 
/*
 * assign cookies to all of the software descriptors
@@ -412,7 +411,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct 
dma_async_tx_descriptor *tx)
/* put this transaction onto the tail of the pending queue */
append_ld_queue(chan, desc);
 
-   spin_unlock_irqrestore(&chan->desc_lock, flags);
+   spin_unlock_bh(&chan->desc_lock);
 
return cookie;
 }
@@ -725,15 +724,14 @@ static void fsldma_free_desc_list_reverse(struct 
fsldma_chan *chan,
 static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
 {
struct fsldma_chan *chan = to_fsl_chan(dchan);
-   unsigned long flags;
 
chan_dbg(chan, "free all channel resources\n");
-   spin_lock_irqsave(&chan->desc_lock, flags);
+   spin_lock_bh(&chan->desc_lock);
fsldma_cleanup_descriptors(chan);
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
fsldma_free_desc_list(chan, &chan->ld_completed);
-   spin_unlock_irqrestore(&chan->desc_lock, flags);
+   spin_unlock_bh(&chan->desc_lock);
 
dma_pool_destroy(chan->desc_pool);
chan->desc_pool = NULL;
@@ -952,7 +950,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
 {
struct dma_slave_config *config;
struct fsldma_chan *chan;
-   unsigned long flags;
int size;
 
if (!dchan)
@@ -962,7 +959,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
 
switch (cmd) {
case DMA_TERMINATE_ALL:
-   spin_lock_irqsave(&chan->desc_lock, flags);
+   spin_lock_bh(&chan->desc_lock);
 
/* Halt the DMA engine */
dma_halt(chan);
@@ -973,7 +970,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
fsldma_free_desc_list(chan, &chan->ld_completed);
chan->idle = true;
 
-   spin_unlock_irqrestore(&chan->desc_lock, flags);
+   spin_unlock_bh(&chan->desc_lock);
return 0;
 
case DMA_SLAVE_CONFIG:
@@ -1015,11 +1012,10 @@ static int fsl_dma_device_control(struct dma_chan 
*dchan,
 static void fsl_dma_memcpy_issue_pending(struct dma_chan *dchan)
 {
struct fsldma_chan *chan = to_fsl_chan(dchan);
-   unsigned long flags;
 
-   spin_lock_irqsave(&chan->desc_lock, flags);
+   spin_lock_bh(&chan->desc_lock);
fsl_chan_xfer_ld_queue(chan);
-   spin_unlock_irqrestore(&chan->desc_lock, flags);
+   spin_unlock_bh(&chan->desc_lock);
 }
 
 /**
@@ -1118,11 +1114,10 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
 static void dma_do_tasklet(unsigned long data)
 {
struct fsldma_chan *chan = (struct fsldma_chan *)data;
-   unsigned long flags;
 
chan_dbg(chan, "tasklet entry\n");
 
-   spin_lock_irqsave(&chan->desc_lock, flags);
+   spin_lock_bh(&chan->desc_lock);
 
/* the hardware is now idle and ready for more */
chan->idle = true;
@@ -1130,7 +1125,7 @@ static void dma_do_tasklet(unsigned long data)
/* Run all cleanup for descriptors which have been completed */
fsldma_cleanup_descriptors(chan);
 
-   spin_unlock_irqrestore(&chan->desc_lock, flags);
+   spin_unlock_bh(&chan->desc_lock);
 
chan_dbg(chan, "tasklet exit\n");
 }
-- 
1.7.9.5





[PATCH v2 6/8] DMA: Freescale: change descriptor release process for supporting async_tx

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

Fix a potential risk when NET_DMA and ASYNC_TX are both enabled. The current
descriptor release process lacks async_tx support: all descriptors are
released whether or not they have been acked by async_tx, so there is a
potential race condition when the DMA engine is used by other clients (e.g.
when NET_DMA is enabled to offload TCP).

In our case, a race condition arises when both talitos and dmaengine are used
to offload xor, because the napi scheduler syncs all pending requests in the
DMA channels. This affects raid operations because ack_tx is not checked in
fsl dma. A descriptor that was just submitted and has not been acked is freed;
when it is then used as a dependent tx, the freed descriptor triggers
BUG_ON(async_tx_test_ack(depend_tx)) in async_tx_submit().

TASK = ee1a94a0[1390] 'md0_raid5' THREAD: ecf4 CPU: 0
GPR00: 0001 ecf41ca0 ee1a94a0 003f 0001 c00593e4  0001
GPR08:  a7a7a7a7 0001 0002 42028042 100a38d4 ed576d98
GPR16: ed5a11b0  2b162000 0200 0000 2d555000 ed3015e8 c15a7aa0
GPR24:  c155fc40  ecb63220 ecf41d28 ef640bb0 ef640c30 ecf41ca0
NIP [c02b048c] async_tx_submit+0x6c/0x2b4
LR [c02b068c] async_tx_submit+0x26c/0x2b4
Call Trace:
[ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b4 (unreliable)
[ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c
[ecf41d20] [c0421064] async_copy_data+0xa0/0x17c
[ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10
[ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8
[ecf41e90] [c0429080] raid5d+0x3d4/0x5b4
[ecf41f40] [c04329b8] md_thread+0x138/0x16c
[ecf41f90] [c008277c] kthread+0x8c/0x90
[ecf41ff0] [c0011630] kernel_thread+0x4c/0x68

Another modification in this patch concerns how completed descriptors are
identified. There is a potential risk caused by exception interrupts: all
descriptors in the ld_running list are treated as completed when an interrupt
is raised. This works fine under normal conditions, but if an exception
occurs it does not work as expected. Hardware progress should not be inferred
from the s/w list; the right way is to read the current descriptor address
register to find the last completed descriptor. If an interrupt is raised by
an error, the descriptors in ld_running should not all be treated as finished,
or the unfinished descriptors in ld_running will be released wrongly.

A simple way to reproduce:
Enable dmatest first, then insert some bad descriptors which can trigger
Programming Error interrupts before the good descriptors. The good
descriptors will then be freed before they are processed because of the
exception interrupt.

Note: the bad descriptors are only for simulating an exception interrupt. This
case illustrates the potential risk in the current fsl-dma very well.
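
The release rule this message argues for (free a completed descriptor only once the client has acked it) can be sketched in user space as below. This is a simplified illustration, not the driver code: struct desc, desc_push and clean_completed are hypothetical names, the acked flag stands in for async_tx_test_ack(), and a singly linked list replaces the kernel's list_head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified descriptor: the real one lives on a list_head. */
struct desc {
	struct desc *next;
	bool acked;     /* stands in for async_tx_test_ack() */
};

/* Push a new descriptor at the head of the list; returns the new head. */
static struct desc *desc_push(struct desc *head, bool acked)
{
	struct desc *d = malloc(sizeof(*d));

	assert(d != NULL);
	d->next = head;
	d->acked = acked;
	return d;
}

/*
 * Free every acked descriptor on the completed list and keep the rest.
 * Returns the number of descriptors freed.
 */
static int clean_completed(struct desc **head)
{
	struct desc **link = head;
	int freed = 0;

	while (*link) {
		struct desc *d = *link;

		if (d->acked) {
			*link = d->next;   /* unlink before freeing */
			free(d);
			freed++;
		} else {
			link = &d->next;   /* still owned by a client: keep it */
		}
	}
	return freed;
}
```

A descriptor that is completed but not yet acked stays on the list, so a client that still refers to it as a dependency never sees freed memory.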

Signed-off-by: Hongbo Zhang 
Signed-off-by: Qiang Liu 
Signed-off-by: Ira W. Snyder 
---
 drivers/dma/fsldma.c |  195 --
 drivers/dma/fsldma.h |   17 -
 2 files changed, 158 insertions(+), 54 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 968877f..f8eee60 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -459,6 +459,87 @@ static struct fsl_desc_sw *fsl_dma_alloc_descriptor(struct 
fsldma_chan *chan)
 }
 
 /**
+ * fsldma_clean_completed_descriptor - free all descriptors which
+ * has been completed and acked
+ * @chan: Freescale DMA channel
+ *
+ * This function is used on all completed and acked descriptors.
+ * All descriptors should only be freed in this function.
+ */
+static void fsldma_clean_completed_descriptor(struct fsldma_chan *chan)
+{
+   struct fsl_desc_sw *desc, *_desc;
+
+   /* Run the callback for each descriptor, in order */
+   list_for_each_entry_safe(desc, _desc, &chan->ld_completed, node)
+   if (async_tx_test_ack(&desc->async_tx))
+   fsl_dma_free_descriptor(chan, desc);
+}
+
+/**
+ * fsldma_run_tx_complete_actions - cleanup a single link descriptor
+ * @chan: Freescale DMA channel
+ * @desc: descriptor to cleanup and free
+ * @cookie: Freescale DMA transaction identifier
+ *
+ * This function is used on a descriptor which has been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies.
+ */
+static dma_cookie_t fsldma_run_tx_complete_actions(struct fsldma_chan *chan,
+   struct fsl_desc_sw *desc, dma_cookie_t cookie)
+{
+   struct dma_async_tx_descriptor *txd = &desc->async_tx;
+
+   BUG_ON(txd->cookie < 0);
+
+   if (txd->cookie > 0) {
+   cookie = txd->cookie;
+
+   /* Run the link descriptor callback function */
+   if (txd->callback) {
+   chan_dbg(chan, "LD %p callback\n", desc);
+   txd->callback(txd->callback_param);
+   }
+   }
+
+   /* Run any dependencies */
+   dma_run_dependencies(txd);
+
+   return cookie;
+}
+
+/**
+ * fsldma_clean_running_descriptor - move

[PATCH v2 5/8] DMA: Freescale: move functions to avoid forward declarations

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

These functions will be modified in the next patch in the series. Moving the
functions in a patch separate from the changes makes review easier.

Signed-off-by: Hongbo Zhang 
Signed-off-by: Qiang Liu 
---
 drivers/dma/fsldma.c |  188 +-
 1 file changed, 94 insertions(+), 94 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index b5a0ffa..968877f 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -459,6 +459,100 @@ static struct fsl_desc_sw 
*fsl_dma_alloc_descriptor(struct fsldma_chan *chan)
 }
 
 /**
+ * fsl_chan_xfer_ld_queue - transfer any pending transactions
+ * @chan : Freescale DMA channel
+ *
+ * HARDWARE STATE: idle
+ * LOCKING: must hold chan->desc_lock
+ */
+static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
+{
+   struct fsl_desc_sw *desc;
+
+   /*
+* If the list of pending descriptors is empty, then we
+* don't need to do any work at all
+*/
+   if (list_empty(&chan->ld_pending)) {
+   chan_dbg(chan, "no pending LDs\n");
+   return;
+   }
+
+   /*
+* The DMA controller is not idle, which means that the interrupt
+* handler will start any queued transactions when it runs after
+* this transaction finishes
+*/
+   if (!chan->idle) {
+   chan_dbg(chan, "DMA controller still busy\n");
+   return;
+   }
+
+   /*
+* If there are some link descriptors which have not been
+* transferred, we need to start the controller
+*/
+
+   /*
+* Move all elements from the queue of pending transactions
+* onto the list of running transactions
+*/
+   chan_dbg(chan, "idle, starting controller\n");
+   desc = list_first_entry(&chan->ld_pending, struct fsl_desc_sw, node);
+   list_splice_tail_init(&chan->ld_pending, &chan->ld_running);
+
+   /*
+* The 85xx DMA controller doesn't clear the channel start bit
+* automatically at the end of a transfer. Therefore we must clear
+* it in software before starting the transfer.
+*/
+   if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
+   u32 mode;
+
+   mode = get_mr(chan);
+   mode &= ~FSL_DMA_MR_CS;
+   set_mr(chan, mode);
+   }
+
+   /*
+* Program the descriptor's address into the DMA controller,
+* then start the DMA transaction
+*/
+   set_cdar(chan, desc->async_tx.phys);
+   get_cdar(chan);
+
+   dma_start(chan);
+   chan->idle = false;
+}
+
+/**
+ * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
+ * @chan: Freescale DMA channel
+ * @desc: descriptor to cleanup and free
+ *
+ * This function is used on a descriptor which has been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies, and then
+ * free the descriptor.
+ */
+static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc)
+{
+   struct dma_async_tx_descriptor *txd = &desc->async_tx;
+
+   /* Run the link descriptor callback function */
+   if (txd->callback) {
+   chan_dbg(chan, "LD %p callback\n", desc);
+   txd->callback(txd->callback_param);
+   }
+
+   /* Run any dependencies */
+   dma_run_dependencies(txd);
+
+   dma_descriptor_unmap(txd);
+   fsl_dma_free_descriptor(chan, desc);
+}
+
+/**
  * fsl_dma_alloc_chan_resources - Allocate resources for DMA channel.
  * @chan : Freescale DMA channel
  *
@@ -803,100 +897,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
 }
 
 /**
- * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
- * @chan: Freescale DMA channel
- * @desc: descriptor to cleanup and free
- *
- * This function is used on a descriptor which has been executed by the DMA
- * controller. It will run any callbacks, submit any dependencies, and then
- * free the descriptor.
- */
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
- struct fsl_desc_sw *desc)
-{
-   struct dma_async_tx_descriptor *txd = &desc->async_tx;
-
-   /* Run the link descriptor callback function */
-   if (txd->callback) {
-   chan_dbg(chan, "LD %p callback\n", desc);
-   txd->callback(txd->callback_param);
-   }
-
-   /* Run any dependencies */
-   dma_run_dependencies(txd);
-
-   dma_descriptor_unmap(txd);
-   fsl_dma_free_descriptor(chan, desc);
-}
-
-/**
- * fsl_chan_xfer_ld_queue - transfer any pending transactions
- * @chan : Freescale DMA channel
- *
- * HARDWARE STATE: idle
- * LOCKING: must hold chan->desc_lock
- */
-static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
-{
-   struct fsl_desc_sw *desc;
-
-   /*
-* If the l

[PATCH v2 8/8] DMA: Freescale: add suspend resume functions for DMA driver

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

This patch adds suspend and resume functions for the Freescale DMA driver.
The .prepare callback stops further descriptors from being added to the
pending queue, and also issues any pending descriptors for execution.
The .suspend callback makes sure all the pending jobs are cleaned up and all
the channels are idle, and saves the mode registers.
The .resume callback re-initializes the channels by restoring the mode
registers.
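
The three callbacks amount to a small per-channel state machine. The sketch below models it in user space under stated assumptions: the state names RUNNING, SUSPENDING and SUSPENDED come from the patch, while struct chan_model and the model_* functions are hypothetical, and register save/restore and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* State names follow the patch; everything else is a simplified model. */
enum pm_state { RUNNING, SUSPENDING, SUSPENDED };

struct chan_model {
	enum pm_state pm_state;
	bool idle;        /* true when the hardware has nothing in flight */
	int pending;      /* descriptors queued but not yet started */
};

/* Submission is refused unless the channel is RUNNING. */
static int model_submit(struct chan_model *c)
{
	assert(c != NULL);
	if (c->pm_state != RUNNING)
		return -1;
	c->pending++;
	return 0;
}

/* .prepare: block new submissions and start whatever is pending. */
static void model_prepare(struct chan_model *c)
{
	c->pm_state = SUSPENDING;
	if (c->pending > 0 && c->idle) {
		c->pending = 0;
		c->idle = false;
	}
}

/* .suspend: only succeeds once the channel is idle. */
static int model_suspend(struct chan_model *c)
{
	if (!c->idle)
		return -1;   /* the patch returns -EBUSY here */
	c->pm_state = SUSPENDED;
	return 0;
}

/* .resume: accept submissions again. */
static void model_resume(struct chan_model *c)
{
	c->pm_state = RUNNING;
}
```

In this model a submit attempted between prepare and resume fails, and suspend keeps failing until the work started by prepare has finished and the channel is idle again.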

Signed-off-by: Hongbo Zhang 
---
 drivers/dma/fsldma.c |   99 ++
 drivers/dma/fsldma.h |   16 
 2 files changed, 115 insertions(+)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index c9bf54a..91482d2 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -400,6 +400,14 @@ static dma_cookie_t fsl_dma_tx_submit(struct 
dma_async_tx_descriptor *tx)
 
spin_lock_bh(&chan->desc_lock);
 
+#ifdef CONFIG_PM
+   if (unlikely(chan->pm_state != RUNNING)) {
+   chan_dbg(chan, "cannot submit due to suspend\n");
+   spin_unlock_bh(&chan->desc_lock);
+   return -1;
+   }
+#endif
+
/*
 * assign cookies to all of the software descriptors
 * that make up this transaction
@@ -1311,6 +1319,9 @@ static int fsl_dma_chan_probe(struct fsldma_device *fdev,
INIT_LIST_HEAD(&chan->ld_running);
INIT_LIST_HEAD(&chan->ld_completed);
chan->idle = true;
+#ifdef CONFIG_PM
+   chan->pm_state = RUNNING;
+#endif
 
chan->common.device = &fdev->common;
dma_cookie_init(&chan->common);
@@ -1450,6 +1461,91 @@ static int fsldma_of_remove(struct platform_device *op)
return 0;
 }
 
+#ifdef CONFIG_PM
+static int fsldma_prepare(struct device *dev)
+{
+   struct platform_device *pdev = to_platform_device(dev);
+   struct fsldma_device *fdev = platform_get_drvdata(pdev);
+   struct fsldma_chan *chan;
+   int i;
+
+   for (i = 0; i < FSL_DMA_MAX_CHANS_PER_DEVICE; i++) {
+   chan = fdev->chan[i];
+   if (!chan)
+   continue;
+
+   spin_lock_bh(&chan->desc_lock);
+   chan->pm_state = SUSPENDING;
+   if (!list_empty(&chan->ld_pending))
+   fsl_chan_xfer_ld_queue(chan);
+   spin_unlock_bh(&chan->desc_lock);
+   }
+
+   return 0;
+}
+
+static int fsldma_suspend(struct device *dev)
+{
+   struct platform_device *pdev = to_platform_device(dev);
+   struct fsldma_device *fdev = platform_get_drvdata(pdev);
+   struct fsldma_chan *chan;
+   int i;
+
+   for (i = 0; i < FSL_DMA_MAX_CHANS_PER_DEVICE; i++) {
+   chan = fdev->chan[i];
+   if (!chan)
+   continue;
+
+   spin_lock_bh(&chan->desc_lock);
+   if (!chan->idle)
+   goto out;
+   chan->regs_save.mr = DMA_IN(chan, &chan->regs->mr, 32);
+   chan->pm_state = SUSPENDED;
+   spin_unlock_bh(&chan->desc_lock);
+   }
+   return 0;
+
+out:
+   for (; i >= 0; i--) {
+   chan = fdev->chan[i];
+   if (!chan)
+   continue;
+   spin_unlock_bh(&chan->desc_lock);
+   }
+   return -EBUSY;
+}
+
+static int fsldma_resume(struct device *dev)
+{
+   struct platform_device *pdev = to_platform_device(dev);
+   struct fsldma_device *fdev = platform_get_drvdata(pdev);
+   struct fsldma_chan *chan;
+   u32 mode;
+   int i;
+
+   for (i = 0; i < FSL_DMA_MAX_CHANS_PER_DEVICE; i++) {
+   chan = fdev->chan[i];
+   if (!chan)
+   continue;
+
+   spin_lock_bh(&chan->desc_lock);
+   mode = chan->regs_save.mr
+   & ~FSL_DMA_MR_CS & ~FSL_DMA_MR_CC & ~FSL_DMA_MR_CA;
+   DMA_OUT(chan, &chan->regs->mr, mode, 32);
+   chan->pm_state = RUNNING;
+   spin_unlock_bh(&chan->desc_lock);
+   }
+
+   return 0;
+}
+
+static const struct dev_pm_ops fsldma_pm_ops = {
+   .prepare= fsldma_prepare,
+   .suspend= fsldma_suspend,
+   .resume = fsldma_resume,
+};
+#endif
+
 static const struct of_device_id fsldma_of_ids[] = {
{ .compatible = "fsl,elo3-dma", },
{ .compatible = "fsl,eloplus-dma", },
@@ -1462,6 +1558,9 @@ static struct platform_driver fsldma_of_driver = {
.name = "fsl-elo-dma",
.owner = THIS_MODULE,
.of_match_table = fsldma_of_ids,
+#ifdef CONFIG_PM
+   .pm = &fsldma_pm_ops,
+#endif
},
.probe = fsldma_of_probe,
.remove = fsldma_of_remove,
diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index ec19517..eecaf9e 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -134,6 +134,18 @@ struct fsldma_device {
 #define FSL_DMA_CHAN_PAUSE_EXT 0x1000
 #define F

[PATCH v2 3/8] DMA: Freescale: remove attribute DMA_INTERRUPT of dmaengine

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

Delete the DMA_INTERRUPT attribute because fsldma doesn't support this
function; an exception is thrown if talitos is used to offload xor at the same
time.

Signed-off-by: Hongbo Zhang 
Signed-off-by: Qiang Liu 
---
 drivers/dma/fsldma.c |   31 ---
 1 file changed, 31 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 5f32cb8..b71cc04 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -528,35 +528,6 @@ static void fsl_dma_free_chan_resources(struct dma_chan 
*dchan)
 }
 
 static struct dma_async_tx_descriptor *
-fsl_dma_prep_interrupt(struct dma_chan *dchan, unsigned long flags)
-{
-   struct fsldma_chan *chan;
-   struct fsl_desc_sw *new;
-
-   if (!dchan)
-   return NULL;
-
-   chan = to_fsl_chan(dchan);
-
-   new = fsl_dma_alloc_descriptor(chan);
-   if (!new) {
-   chan_err(chan, "%s\n", msg_ld_oom);
-   return NULL;
-   }
-
-   new->async_tx.cookie = -EBUSY;
-   new->async_tx.flags = flags;
-
-   /* Insert the link descriptor to the LD ring */
-   list_add_tail(&new->node, &new->tx_list);
-
-   /* Set End-of-link to the last link descriptor of new list */
-   set_ld_eol(chan, new);
-
-   return &new->async_tx;
-}
-
-static struct dma_async_tx_descriptor *
 fsl_dma_prep_memcpy(struct dma_chan *dchan,
dma_addr_t dma_dst, dma_addr_t dma_src,
size_t len, unsigned long flags)
@@ -1308,12 +1279,10 @@ static int fsldma_of_probe(struct platform_device *op)
fdev->irq = irq_of_parse_and_map(op->dev.of_node, 0);
 
dma_cap_set(DMA_MEMCPY, fdev->common.cap_mask);
-   dma_cap_set(DMA_INTERRUPT, fdev->common.cap_mask);
dma_cap_set(DMA_SG, fdev->common.cap_mask);
dma_cap_set(DMA_SLAVE, fdev->common.cap_mask);
fdev->common.device_alloc_chan_resources = fsl_dma_alloc_chan_resources;
fdev->common.device_free_chan_resources = fsl_dma_free_chan_resources;
-   fdev->common.device_prep_dma_interrupt = fsl_dma_prep_interrupt;
fdev->common.device_prep_dma_memcpy = fsl_dma_prep_memcpy;
fdev->common.device_prep_dma_sg = fsl_dma_prep_sg;
fdev->common.device_tx_status = fsl_tx_status;
-- 
1.7.9.5





[PATCH v2 2/8] DMA: Freescale: unify register access methods

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

Methods of accessing DMA controller registers are inconsistent: some registers
are accessed by DMA_IN/OUT directly, while others are accessed by the get/set_*
functions, which are wrappers of DMA_IN/OUT. The BCR register is even read by
get_bcr but written by DMA_OUT.
This patch unifies the inconsistent methods; all registers are accessed by
get/set_* now.
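
As a rough illustration of the accessor style the message describes, the sketch below routes every register read and write through a get/set pair. It is a user-space model only: struct regs_model lives in ordinary memory instead of MMIO, the MODEL_MR_CS bit value is a placeholder, and model_start/model_stop are hypothetical helpers loosely modeled on the read-modify-write pattern in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block held in ordinary memory rather than MMIO. */
struct regs_model {
	uint32_t mr;    /* mode register */
	uint32_t bcr;   /* byte count register */
};

/* Placeholder bit value, chosen for the example only. */
#define MODEL_MR_CS 0x1u

static uint32_t get_mr(const struct regs_model *r) { return r->mr; }
static void set_mr(struct regs_model *r, uint32_t val) { r->mr = val; }
static uint32_t get_bcr(const struct regs_model *r) { return r->bcr; }
static void set_bcr(struct regs_model *r, uint32_t val) { r->bcr = val; }

/* Read-modify-write through the accessors only: set the start bit. */
static void model_start(struct regs_model *r)
{
	assert(r != NULL);
	set_bcr(r, 0);
	set_mr(r, get_mr(r) | MODEL_MR_CS);
}

/* Read-modify-write through the accessors only: clear the start bit. */
static void model_stop(struct regs_model *r)
{
	assert(r != NULL);
	set_mr(r, get_mr(r) & ~MODEL_MR_CS);
}
```

Keeping the raw access in one pair of functions per register means a later change to width or byte order touches only the accessors, not every caller.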

Signed-off-by: Hongbo Zhang 
---
 drivers/dma/fsldma.c |   52 --
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index ec50420..5f32cb8 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -61,6 +61,16 @@ static u32 get_sr(struct fsldma_chan *chan)
return DMA_IN(chan, &chan->regs->sr, 32);
 }
 
+static void set_mr(struct fsldma_chan *chan, u32 val)
+{
+   DMA_OUT(chan, &chan->regs->mr, val, 32);
+}
+
+static u32 get_mr(struct fsldma_chan *chan)
+{
+   return DMA_IN(chan, &chan->regs->mr, 32);
+}
+
 static void set_cdar(struct fsldma_chan *chan, dma_addr_t addr)
 {
DMA_OUT(chan, &chan->regs->cdar, addr | FSL_DMA_SNEN, 64);
@@ -71,6 +81,11 @@ static dma_addr_t get_cdar(struct fsldma_chan *chan)
return DMA_IN(chan, &chan->regs->cdar, 64) & ~FSL_DMA_SNEN;
 }
 
+static void set_bcr(struct fsldma_chan *chan, u32 val)
+{
+   DMA_OUT(chan, &chan->regs->bcr, val, 32);
+}
+
 static u32 get_bcr(struct fsldma_chan *chan)
 {
return DMA_IN(chan, &chan->regs->bcr, 32);
@@ -135,7 +150,7 @@ static void set_ld_eol(struct fsldma_chan *chan, struct 
fsl_desc_sw *desc)
 static void dma_init(struct fsldma_chan *chan)
 {
/* Reset the channel */
-   DMA_OUT(chan, &chan->regs->mr, 0, 32);
+   set_mr(chan, 0);
 
switch (chan->feature & FSL_DMA_IP_MASK) {
case FSL_DMA_IP_85XX:
@@ -144,16 +159,15 @@ static void dma_init(struct fsldma_chan *chan)
 * EOLNIE - End of links interrupt enable
 * BWC - Bandwidth sharing among channels
 */
-   DMA_OUT(chan, &chan->regs->mr, FSL_DMA_MR_BWC
-   | FSL_DMA_MR_EIE | FSL_DMA_MR_EOLNIE, 32);
+   set_mr(chan, FSL_DMA_MR_BWC | FSL_DMA_MR_EIE
+   | FSL_DMA_MR_EOLNIE);
break;
case FSL_DMA_IP_83XX:
/* Set the channel to below modes:
 * EOTIE - End-of-transfer interrupt enable
 * PRC_RM - PCI read multiple
 */
-   DMA_OUT(chan, &chan->regs->mr, FSL_DMA_MR_EOTIE
-   | FSL_DMA_MR_PRC_RM, 32);
+   set_mr(chan, FSL_DMA_MR_EOTIE | FSL_DMA_MR_PRC_RM);
break;
}
 }
@@ -175,10 +189,10 @@ static void dma_start(struct fsldma_chan *chan)
 {
u32 mode;
 
-   mode = DMA_IN(chan, &chan->regs->mr, 32);
+   mode = get_mr(chan);
 
if (chan->feature & FSL_DMA_CHAN_PAUSE_EXT) {
-   DMA_OUT(chan, &chan->regs->bcr, 0, 32);
+   set_bcr(chan, 0);
mode |= FSL_DMA_MR_EMP_EN;
} else {
mode &= ~FSL_DMA_MR_EMP_EN;
@@ -191,7 +205,7 @@ static void dma_start(struct fsldma_chan *chan)
mode |= FSL_DMA_MR_CS;
}
 
-   DMA_OUT(chan, &chan->regs->mr, mode, 32);
+   set_mr(chan, mode);
 }
 
 static void dma_halt(struct fsldma_chan *chan)
@@ -200,7 +214,7 @@ static void dma_halt(struct fsldma_chan *chan)
int i;
 
/* read the mode register */
-   mode = DMA_IN(chan, &chan->regs->mr, 32);
+   mode = get_mr(chan);
 
/*
 * The 85xx controller supports channel abort, which will stop
@@ -209,14 +223,14 @@ static void dma_halt(struct fsldma_chan *chan)
 */
if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
mode |= FSL_DMA_MR_CA;
-   DMA_OUT(chan, &chan->regs->mr, mode, 32);
+   set_mr(chan, mode);
 
mode &= ~FSL_DMA_MR_CA;
}
 
/* stop the DMA controller */
mode &= ~(FSL_DMA_MR_CS | FSL_DMA_MR_EMS_EN);
-   DMA_OUT(chan, &chan->regs->mr, mode, 32);
+   set_mr(chan, mode);
 
/* wait for the DMA controller to become idle */
for (i = 0; i < 100; i++) {
@@ -245,7 +259,7 @@ static void fsl_chan_set_src_loop_size(struct fsldma_chan 
*chan, int size)
 {
u32 mode;
 
-   mode = DMA_IN(chan, &chan->regs->mr, 32);
+   mode = get_mr(chan);
 
switch (size) {
case 0:
@@ -259,7 +273,7 @@ static void fsl_chan_set_src_loop_size(struct fsldma_chan 
*chan, int size)
break;
}
 
-   DMA_OUT(chan, &chan->regs->mr, mode, 32);
+   set_mr(chan, mode);
 }
 
 /**
@@ -277,7 +291,7 @@ static void fsl_chan_set_dst_loop_size(struct fsldma_chan 
*chan, int size)
 {
u32 mode;
 
-   mode = DMA_IN(chan, &chan->regs->mr, 32);
+   mode = get_mr(chan);
 
switch (size) {

[PATCH v2 1/8] DMA: Freescale: remove the unnecessary FSL_DMA_LD_DEBUG

2014-04-03 Thread hongbo.zhang
From: Hongbo Zhang 

Some code wraps its chan_dbg calls in #ifdef FSL_DMA_LD_DEBUG. The macro is
unnecessary: chan_dbg is a wrapper around dev_dbg, which is already switched
on and off by the corresponding DEBUG macro, and most other call sites use
chan_dbg directly without FSL_DMA_LD_DEBUG.

Signed-off-by: Hongbo Zhang 
---
 drivers/dma/fsldma.c |   10 --
 1 file changed, 10 deletions(-)
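
For readers unfamiliar with the pattern, here is a minimal userspace sketch of
why the extra #ifdef is redundant. It is not the kernel code: demo_dev_dbg,
demo_chan_dbg, struct demo_chan and demo_free_desc_list are hypothetical
stand-ins, and the only assumption carried over from the patch is that the
debug print is already compiled out unless DEBUG is defined.

```c
#include <stdio.h>

/*
 * Hypothetical userspace stand-ins, not the kernel macros: demo_dev_dbg()
 * plays the role of dev_dbg() and only prints when DEBUG is defined.
 */
#ifdef DEBUG
#define demo_dev_dbg(...) fprintf(stderr, __VA_ARGS__)
#else
/* if (0) keeps the arguments type-checked but lets the compiler drop the call */
#define demo_dev_dbg(...) do { if (0) fprintf(stderr, __VA_ARGS__); } while (0)
#endif

struct demo_chan {
	const char *name;
};

/* Wrapper in the spirit of chan_dbg(): prefix the message with the channel name. */
#define demo_chan_dbg(chan, ...) \
	do { \
		demo_dev_dbg("%s: ", (chan)->name); \
		demo_dev_dbg(__VA_ARGS__); \
	} while (0)

/* "Free" n descriptors; the debug call needs no extra #ifdef around it. */
static int demo_free_desc_list(const struct demo_chan *chan, int n)
{
	int i, freed = 0;

	for (i = 0; i < n; i++) {
		demo_chan_dbg(chan, "LD %d free\n", i);
		freed++;
	}
	return freed;
}

#ifdef DEMO_MAIN
int main(void)
{
	struct demo_chan chan = { "chan0" };

	printf("freed %d descriptors\n", demo_free_desc_list(&chan, 3));
	return 0;
}
#endif
```

Built with -DDEMO_MAIN the sketch runs silently apart from the summary line;
adding -DDEBUG turns the per-descriptor messages on without touching the
call sites.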

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index f157c6f..ec50420 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -426,9 +426,7 @@ static struct fsl_desc_sw *fsl_dma_alloc_descriptor(struct 
fsldma_chan *chan)
desc->async_tx.tx_submit = fsl_dma_tx_submit;
desc->async_tx.phys = pdesc;
 
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p allocated\n", desc);
-#endif
 
return desc;
 }
@@ -479,9 +477,7 @@ static void fsldma_free_desc_list(struct fsldma_chan *chan,
 
list_for_each_entry_safe(desc, _desc, list, node) {
list_del(&desc->node);
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc);
-#endif
dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
}
 }
@@ -493,9 +489,7 @@ static void fsldma_free_desc_list_reverse(struct 
fsldma_chan *chan,
 
list_for_each_entry_safe_reverse(desc, _desc, list, node) {
list_del(&desc->node);
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc);
-#endif
dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
}
 }
@@ -832,9 +826,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan 
*chan,
 
/* Run the link descriptor callback function */
if (txd->callback) {
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p callback\n", desc);
-#endif
txd->callback(txd->callback_param);
}
 
@@ -842,9 +834,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan 
*chan,
dma_run_dependencies(txd);
 
dma_descriptor_unmap(txd);
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc);
-#endif
dma_pool_free(chan->desc_pool, desc, txd->phys);
 }
 
-- 
1.7.9.5





[PATCH v4 1/2] i2c: add DMA support for freescale i2c driver

2014-04-03 Thread Yuan Yao
Add DMA support for i2c. This feature depends on the DMA driver.
Enable it by adding both the dmas and dma-names properties to the dts node.

Signed-off-by: Yuan Yao 
---
 drivers/i2c/busses/i2c-imx.c | 372 +--
 1 file changed, 319 insertions(+), 53 deletions(-)
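
As a reading aid, the decision between PIO and DMA can be sketched as below.
This is a hedged illustration only: struct demo_i2c and demo_use_dma are
hypothetical names, and the strict "bigger than" comparison is an assumption
taken from the comment on IMX_I2C_DMA_THRESHOLD, since the transfer path
itself is not shown in this excerpt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same value as IMX_I2C_DMA_THRESHOLD in the patch. */
#define DEMO_DMA_THRESHOLD 16

/* Hypothetical, trimmed-down adapter state: dma stays NULL when no channels were obtained. */
struct demo_i2c {
	void *dma;
};

/* Use DMA only when channels are available and the message is large enough. */
static bool demo_use_dma(const struct demo_i2c *i2c, size_t len)
{
	return i2c->dma != NULL && len > DEMO_DMA_THRESHOLD;
}

#ifdef DEMO_MAIN
int main(void)
{
	int fake_channels;
	struct demo_i2c i2c = { &fake_channels };

	printf("len 8 uses DMA: %d\n", demo_use_dma(&i2c, 8));
	printf("len 64 uses DMA: %d\n", demo_use_dma(&i2c, 64));
	return 0;
}
#endif
```

Keeping the check in one helper means short register-style transfers stay on
the PIO path, where DMA setup cost would dominate.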

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index db895fb..3d63b35 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -37,22 +37,27 @@
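 /** Includes *******************************************************************
 *******************************************************************************/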
 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+#include 
 
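 /** Defines ********************************************************************
 *******************************************************************************/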
@@ -63,6 +68,10 @@
 /* Default value */
 #define IMX_I2C_BIT_RATE   100000  /* 100kHz */
 
+/* enable DMA if transfer byte size is bigger than this threshold */
+#define IMX_I2C_DMA_THRESHOLD  16
+#define IMX_I2C_DMA_TIMEOUT    1000
+
 /* IMX I2C registers:
  * the I2C register offset is different between SoCs,
  * to provid support for all these chips, split the
@@ -88,6 +97,7 @@
 #define I2SR_IBB   0x20
 #define I2SR_IAAS  0x40
 #define I2SR_ICF   0x80
+#define I2CR_DMAEN 0x02
 #define I2CR_RSTA  0x04
 #define I2CR_TXAK  0x08
 #define I2CR_MTX   0x10
@@ -174,6 +184,17 @@ struct imx_i2c_hwdata {
unsigned            i2cr_ien_opcode;
 };
 
+struct imx_i2c_dma {
+   struct dma_chan     *chan_tx;
+   struct dma_chan     *chan_rx;
+   struct dma_chan     *chan_using;
+   struct completion   cmd_complete;
+   dma_addr_t          dma_buf;
+   unsigned int        dma_len;
+   unsigned int        dma_transfer_dir;
+   unsigned int        dma_data_dir;
+};
+
 struct imx_i2c_struct {
struct i2c_adapter  adapter;
struct clk  *clk;
@@ -184,6 +205,8 @@ struct imx_i2c_struct {
int stopped;
unsigned int        ifdr; /* IMX_I2C_IFDR */
const struct imx_i2c_hwdata *hwdata;
+
+   struct imx_i2c_dma  *dma;
 };
 
 static const struct imx_i2c_hwdata imx1_i2c_hwdata  = {
@@ -254,6 +277,132 @@ static inline unsigned char imx_i2c_read_reg(struct 
imx_i2c_struct *i2c_imx,
return readb(i2c_imx->base + (reg << i2c_imx->hwdata->regshift));
 }
 
+/* Functions for DMA support */
+static int i2c_imx_dma_request(struct imx_i2c_struct *i2c_imx,
+   dma_addr_t phy_addr)
+{
+   struct imx_i2c_dma *dma;
+   struct dma_slave_config dma_sconfig;
+   struct device *dev = &i2c_imx->adapter.dev;
+   int ret;
+
+   dma = devm_kzalloc(dev, sizeof(struct imx_i2c_dma), GFP_KERNEL);
+   if (!dma) {
+   dev_info(dev, "can't allocate DMA struct\n");
+   return -ENOMEM;
+   }
+
+   dma->chan_tx = dma_request_slave_channel(dev, "tx");
+   if (!dma->chan_tx) {
+   dev_info(dev, "DMA tx channel request failed\n");
+   ret = -ENODEV;
+   goto fail_al;
+   }
+
+   dma_sconfig.dst_addr = phy_addr +
+   (IMX_I2C_I2DR << i2c_imx->hwdata->regshift);
+   dma_sconfig.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+   dma_sconfig.dst_maxburst = 1;
+   dma_sconfig.direction = DMA_MEM_TO_DEV;
+   ret = dmaengine_slave_config(dma->chan_tx, &dma_sconfig);
+   if (ret < 0) {
+   dev_info(dev, "DMA slave config failed, err = %d\n", ret);
+   goto fail_tx;
+   }
+
+   dma->chan_rx = dma_request_slave_channel(dev, "rx");
+   if (!dma->chan_rx) {
+   dev_info(dev, "DMA rx channel request failed\n");
+   ret = -ENODEV;
+   goto fail_tx;
+   }
+
+   dma_sconfig.src_addr = phy_addr +
+   (IMX_I2C_I2DR << i2c_imx->hwdata->regshift);
+   dma_sconfig.src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+   dma_sconfig.src_maxburst = 1;
+   dma_sconfig.direction = DMA_DEV_TO_MEM;
+   ret = dmaengine_slave_config(dma->chan_rx, &dma_sconfig);
+   if (ret < 0) {
+   dev_info(dev, "DMA slave config failed, err = %d\n", ret);
+   goto fail_rx;
+   }
+
+   i2c_imx->dma = dma;
+
+   init_completion(&dma->cmd_complete);
+
+   return 0;
+
+fail_rx:
+   dma_release_channel(dma->chan_rx);
+fail_tx:
+   dma_release_channel(dma->chan_tx);
+fail_al:
+   devm_kfree(dev, dma);
+
+   ret

[PATCH v4 2/2] Documentation:add DMA support for freescale i2c driver

2014-04-03 Thread Yuan Yao
Add i2c dts node properties for eDMA support; they depend on the eDMA driver.

Signed-off-by: Yuan Yao 
---
 Documentation/devicetree/bindings/i2c/i2c-imx.txt | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx.txt 
b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
index 4a8513e..52d37fd 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
+++ b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
@@ -11,6 +11,8 @@ Required properties:
 Optional properties:
 - clock-frequency : Constains desired I2C/HS-I2C bus clock frequency in Hz.
   The absence of the propoerty indicates the default frequency 100 kHz.
+- dmas: A list of two dma specifiers, one for each entry in dma-names.
+- dma-names: should contain "tx" and "rx".
 
 Examples:
 
@@ -26,3 +28,12 @@ i2c@70038000 { /* HS-I2C on i.MX51 */
interrupts = <64>;
clock-frequency = <400000>;
 };
+
+i2c0: i2c@40066000 { /* i2c0 on vf610 */
+   compatible = "fsl,vf610-i2c";
+   reg = <0x40066000 0x1000>;
+   interrupts =<0 71 0x04>;
+   dmas = <&edma0 0 50>,
+   <&edma0 0 51>;
+   dma-names = "rx","tx";
+};
-- 
1.8.4




[PATCH v4 0/2] i2c: add DMA support for freescale i2c driver

2014-04-03 Thread Yuan Yao

Changed in v4:
- removed "i2c_imx->use_dma".
- changed "Dma" to "DMA".
- added timeout handling for DMA transfer completion.

Changed in v3:
- fixed a bug when requesting the DMA channels failed.
- some minor fixes for coding style.
- other minor fixes.

Changed in v2:
- remove has_dma_support property
- unify i2c_imx_dma_rx and i2c_imx_dma_tx
- unify i2c_imx_dma_read and i2c_imx_pio_read
- unify i2c_imx_dma_write and i2c_imx_pio_write

Added in v1:
- Enable DMA if the device supports it and the transfer size is bigger than
  the threshold.
- Add device tree bindings for i2c eDMA support.
- Add eDMA support for i2c driver.



Xen 32-bit PV regression

2014-04-03 Thread Boris Ostrovsky

Steven,

Looks like commit 198d208df (x86: Keep thread_info on thread stack in 
x86_32) broke Xen's 32-bit PV guests.


I poked a little at it and it seems that at least the ifdef in 
xen_cpu_up() needs to be adjusted to set up kernel_stack --- that allows 
CPUs to get going. This is not enough though (not particularly 
surprisingly) and we die a little later with #GPF in xen_iret.


-boris


[PATCH 2/3] timekeeping: move clocksource init to the early place

2014-04-03 Thread Lei Wen
Move the clocksource setup earlier, so that timekeeping code can be called
very early during boot without causing a system panic because the clock is
not initialized yet.

The default system clock is always jiffies, so it is safe to do so.

Signed-off-by: Lei Wen 
---
 include/linux/time.h  |  1 +
 init/main.c   |  1 +
 kernel/time/timekeeping.c | 22 +++---
 3 files changed, 17 insertions(+), 7 deletions(-)
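
The split into an early and a late init step can be pictured with the
following userspace sketch. All names here (demo_clock, demo_jiffies,
demo_timekeeping_init_early, demo_clock_ready) are hypothetical; the sketch
only assumes what the commit message states, namely that a default clock is
always available and can be installed before anything else needs it.

```c
#include <stddef.h>
#include <stdio.h>

struct demo_clock {
	const char *name;
	int (*enable)(struct demo_clock *clock);	/* optional, may be NULL */
};

/* Always-available default clock, playing the role of jiffies. */
static struct demo_clock demo_jiffies = { "jiffies", NULL };
static struct demo_clock *demo_current_clock;	/* NULL until early init ran */

/* Phase 1: install the default clock; safe to call very early. */
static void demo_timekeeping_init_early(void)
{
	struct demo_clock *clock = &demo_jiffies;

	if (clock->enable)
		clock->enable(clock);
	demo_current_clock = clock;
}

/* Readers can only work once a clock is installed: 0 if ready, -1 if not. */
static int demo_clock_ready(void)
{
	return demo_current_clock ? 0 : -1;
}

#ifdef DEMO_MAIN
int main(void)
{
	printf("before early init: %d\n", demo_clock_ready());
	demo_timekeeping_init_early();
	printf("after early init: %d (%s)\n", demo_clock_ready(),
	       demo_current_clock->name);
	return 0;
}
#endif
```

The later init step can then fill in wall-clock values without having to
install the clock itself.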

diff --git a/include/linux/time.h b/include/linux/time.h
index a2f5079..e2d4899 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -127,6 +127,7 @@ extern void read_boot_clock(struct timespec *ts);
 extern int persistent_clock_is_local;
 extern int update_persistent_clock(struct timespec now);
 void timekeeping_init(void);
+void timekeeping_init_early(void);
 extern int timekeeping_suspended;
 
 unsigned long get_seconds(void);
diff --git a/init/main.c b/init/main.c
index 9c7fd4c..5723933 100644
--- a/init/main.c
+++ b/init/main.c
@@ -494,6 +494,7 @@ asmlinkage void __init start_kernel(void)
 */
boot_init_stack_canary();
 
+   timekeeping_init_early();
cgroup_init_early();
 
local_irq_disable();
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c196111..b8f850b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -782,12 +782,25 @@ void __attribute__((weak)) read_boot_clock(struct 
timespec *ts)
 }
 
 /*
- * timekeeping_init - Initializes the clocksource and common timekeeping values
+ * timekeeping_init_early - setup clocksource early
  */
-void __init timekeeping_init(void)
+void __init timekeeping_init_early(void)
 {
struct timekeeper *tk = &timekeeper;
struct clocksource *clock;
+
+   clock = clocksource_default_clock();
+   if (clock->enable)
+   clock->enable(clock);
+   tk_setup_internals(tk, clock);
+}
+
+/*
+ * timekeeping_init - Initializes common timekeeping values
+ */
+void __init timekeeping_init(void)
+{
+   struct timekeeper *tk = &timekeeper;
unsigned long flags;
struct timespec now, boot, tmp;
 
@@ -813,11 +826,6 @@ void __init timekeeping_init(void)
write_seqcount_begin(&timekeeper_seq);
ntp_init();
 
-   clock = clocksource_default_clock();
-   if (clock->enable)
-   clock->enable(clock);
-   tk_setup_internals(tk, clock);
-
tk_set_xtime(tk, &now);
tk->raw_time.tv_sec = 0;
tk->raw_time.tv_nsec = 0;
-- 
1.8.3.2



[PATCH 3/3] printk: using booting time as the timestamp

2014-04-03 Thread Lei Wen
People may want to align the kernel log with the log of another processor
that runs on the same machine but not the same copy of Linux. Keep the
timestamps of both logs aligned, so that debugging does not become hard
and confusing.

Signed-off-by: Lei Wen 
---
 kernel/printk/printk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
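
The difference between a clock that stops during suspend and one that keeps
counting can be observed from userspace as well. The sketch below is only an
analogy to the kernel change: it uses the POSIX clock_gettime() call with the
Linux clock IDs CLOCK_MONOTONIC (does not count suspend) and CLOCK_BOOTTIME
(does), and the demo_* helpers are hypothetical names.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t demo_timespec_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Read a clock in nanoseconds; returns 0 on success, -1 on failure. */
static int demo_clock_ns(clockid_t id, int64_t *ns)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	*ns = demo_timespec_to_ns(&ts);
	return 0;
}

/*
 * Estimate the time spent suspended since boot as CLOCK_BOOTTIME minus
 * CLOCK_MONOTONIC. Returns 0 on success, -1 on failure.
 */
static int demo_suspended_ns(int64_t *ns)
{
	int64_t mono, boot;

	if (demo_clock_ns(CLOCK_MONOTONIC, &mono) ||
	    demo_clock_ns(CLOCK_BOOTTIME, &boot))
		return -1;
	*ns = boot - mono;
	return 0;
}

#ifdef DEMO_MAIN
int main(void)
{
	int64_t ns;

	if (demo_suspended_ns(&ns))
		return 1;
	printf("suspended for roughly %lld ns since boot\n", (long long)ns);
	return 0;
}
#endif
```

A log stamped with the boot-time clock stays comparable with an external
processor's log across suspend, which is the property this patch is after.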

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index a45b509..af96fbd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -349,7 +349,7 @@ static void log_store(int facility, int level,
if (ts_nsec > 0)
msg->ts_nsec = ts_nsec;
else
-   msg->ts_nsec = local_clock();
+   msg->ts_nsec = ktime_to_ns(ktime_get_boottime());
memset(log_dict(msg) + dict_len, 0, pad_len);
msg->len = size;
 
@@ -1440,7 +1440,7 @@ static bool cont_add(int facility, int level, const char 
*text, size_t len)
cont.facility = facility;
cont.level = level;
cont.owner = current;
-   cont.ts_nsec = local_clock();
+   cont.ts_nsec = ktime_to_ns(ktime_get_boottime());
cont.flags = 0;
cont.cons = 0;
cont.flushed = false;
-- 
1.8.3.2



[PATCH 0/3] switch printk timestamp to use booting time

2014-04-03 Thread Lei Wen
It is very common for several processors on the same machine to run
different operating systems, so aligned timestamps are key to debugging
when something goes wrong.

Linux now stops the scheduler clock during suspend, so printk timestamps no
longer advance over the suspend period, which breaks the alignment that held
in the old days. This patch set is meant to restore the old behavior.

BTW, I am not sure whether we could add an additional member to the printk
log structure, so that each log entry could carry two timestamps, one
including suspend time and one without?

Lei Wen (3):
  time: create __get_monotonic_boottime for WARNless calls
  timekeeping: move clocksource init to the early place
  printk: using booting time as the timestamp

 include/linux/time.h  |  2 ++
 init/main.c   |  1 +
 kernel/printk/printk.c|  4 ++--
 kernel/time/timekeeping.c | 55 ---
 4 files changed, 48 insertions(+), 14 deletions(-)

-- 
1.8.3.2



[PATCH 1/3] time: create __get_monotonic_boottime for WARNless calls

2014-04-03 Thread Lei Wen
Since sched_clock always stops during suspend, it is hard to compare the
kernel log with a log generated by another processor running on the same
machine. [That processor is definitely not running Linux.]

So we need a way to get printk timestamps that include suspend time, as in
the old days. get_monotonic_boottime is a good candidate, but it triggers a
WARN_ON when called while timekeeping is suspended, which prevents printk
from being used in every corner.

Export a warn-less __get_monotonic_boottime to solve this issue.

Signed-off-by: Lei Wen 
---
 include/linux/time.h  |  1 +
 kernel/time/timekeeping.c | 33 -
 2 files changed, 29 insertions(+), 5 deletions(-)
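
The shape of the change, a quiet inner function plus a wrapper that keeps the
old warning, can be sketched in userspace as follows. The names and the
simplified long timestamp are hypothetical; only the -EAGAIN return and the
"fill the value even when suspended" behaviour are taken from the patch.

```c
#include <errno.h>
#include <stdio.h>

static int demo_suspended;	/* stand-in for timekeeping_suspended */
static long demo_now = 42;	/* stand-in for the value computed from the timekeeper */
static int demo_warnings;	/* counts how often the loud wrapper complained */

/* Quiet variant: always fills *ts and reports suspension via the return value. */
static int demo_get_boottime_quiet(long *ts)
{
	*ts = demo_now;
	if (demo_suspended)
		return -EAGAIN;
	return 0;
}

/* Loud variant: keeps the old void interface and warns when suspended. */
static void demo_get_boottime(long *ts)
{
	if (demo_get_boottime_quiet(ts)) {
		demo_warnings++;
		fprintf(stderr, "warning: clock read while suspended\n");
	}
}

#ifdef DEMO_MAIN
int main(void)
{
	long ts;

	demo_suspended = 1;
	demo_get_boottime(&ts);
	printf("ts=%ld warnings=%d\n", ts, demo_warnings);
	return 0;
}
#endif
```

Callers such as a logging path can use the quiet variant and decide for
themselves what to do with a value taken during suspend.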

diff --git a/include/linux/time.h b/include/linux/time.h
index d5d229b..a2f5079 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -171,6 +171,7 @@ extern void getnstime_raw_and_real(struct timespec *ts_raw,
struct timespec *ts_real);
 extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
+extern int __get_monotonic_boottime(struct timespec *ts);
 extern void get_monotonic_boottime(struct timespec *ts);
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5b40279..c196111 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1465,23 +1465,22 @@ void getboottime(struct timespec *ts)
 EXPORT_SYMBOL_GPL(getboottime);
 
 /**
- * get_monotonic_boottime - Returns monotonic time since boot
+ * __get_monotonic_boottime - Returns monotonic time since boot
  * @ts:pointer to the timespec to be set
  *
- * Returns the monotonic time since boot in a timespec.
+ * Update the monotonic time since boot in a timespec.
+ * Returns 0 on success, or -ve when suspended (timespec will be undefined).
  *
  * This is similar to CLOCK_MONTONIC/ktime_get_ts, but also
  * includes the time spent in suspend.
  */
-void get_monotonic_boottime(struct timespec *ts)
+int __get_monotonic_boottime(struct timespec *ts)
 {
struct timekeeper *tk = &timekeeper;
struct timespec tomono, sleep;
s64 nsec;
unsigned int seq;
 
-   WARN_ON(timekeeping_suspended);
-
do {
seq = read_seqcount_begin(&timekeeper_seq);
ts->tv_sec = tk->xtime_sec;
@@ -1494,6 +1493,30 @@ void get_monotonic_boottime(struct timespec *ts)
ts->tv_sec += tomono.tv_sec + sleep.tv_sec;
ts->tv_nsec = 0;
timespec_add_ns(ts, nsec + tomono.tv_nsec + sleep.tv_nsec);
+
+   /*
+* Do not bail out early, in case there were callers still using
+* the value, even in the face of the WARN_ON.
+*/
+   if (unlikely(timekeeping_suspended))
+   return -EAGAIN;
+   return 0;
+}
+EXPORT_SYMBOL_GPL(__get_monotonic_boottime);
+
+/**
+ * get_monotonic_boottime - Returns monotonic time since boot
+ * @ts:pointer to the timespec to be set
+ *
+ * Returns the monotonic time since boot in a timespec.
+ * (WARN if suspended)
+ *
+ * This is similar to CLOCK_MONTONIC/ktime_get_ts, but also
+ * includes the time spent in suspend.
+ */
+void get_monotonic_boottime(struct timespec *ts)
+{
+   WARN_ON(__get_monotonic_boottime(ts));
 }
 EXPORT_SYMBOL_GPL(get_monotonic_boottime);
 
-- 
1.8.3.2



Re: [PATCH 4/4] hugetlb: add support for gigantic page allocation at runtime

2014-04-03 Thread Yasuaki Ishimatsu
(2014/04/03 3:08), Luiz Capitulino wrote:
> HugeTLB is limited to allocating hugepages whose size are less than
> MAX_ORDER order. This is so because HugeTLB allocates hugepages via
> the buddy allocator. Gigantic pages (that is, pages whose size is
> greater than MAX_ORDER order) have to be allocated at boottime.
> 
> However, boottime allocation has at least two serious problems. First,
> it doesn't support NUMA and second, gigantic pages allocated at
> boottime can't be freed.
> 
> This commit solves both issues by adding support for allocating gigantic
> pages during runtime. It works just like regular sized hugepages,
> meaning that the interface in sysfs is the same, it supports NUMA,
> and gigantic pages can be freed.
> 
> For example, on x86_64 gigantic pages are 1GB big. To allocate two 1G
> gigantic pages on node 1, one can do:
> 
>   # echo 2 > \
> /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
> 
> And to free them later:
> 
>   # echo 0 > \
> /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
> 
> The one problem with gigantic page allocation at runtime is that it
> can't be serviced by the buddy allocator. To overcome that problem, this
> series scans all zones from a node looking for a large enough contiguous
> region. When one is found, it's allocated by using CMA, that is, we call
> alloc_contig_range() to do the actual allocation. For example, on x86_64
> we scan all zones looking for a 1GB contiguous region. When one is found
> it's allocated by alloc_contig_range().
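
To make the scan described above concrete, here is a small userspace sketch.
It is not the code from this patch: the frame map is a plain bool array, the
names demo_align_up and demo_find_free_window are hypothetical, and stepping
through size-aligned windows is an assumption suggested by the
pfn_aligned_gigantic helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Round x up to the next multiple of align (align must be non-zero). */
static size_t demo_align_up(size_t x, size_t align)
{
	return (x + align - 1) / align * align;
}

/*
 * Scan frames [start, end) for the first window of count frames that is
 * aligned to count and entirely free. Returns the first frame of the
 * window, or (size_t)-1 if none exists.
 */
static size_t demo_find_free_window(const bool *frame_free, size_t start,
				    size_t end, size_t count)
{
	size_t pfn, i;

	if (count == 0)
		return (size_t)-1;

	for (pfn = demo_align_up(start, count); pfn + count <= end; pfn += count) {
		for (i = 0; i < count && frame_free[pfn + i]; i++)
			;
		if (i == count)
			return pfn;	/* every frame in the window is free */
	}
	return (size_t)-1;
}

#ifdef DEMO_MAIN
int main(void)
{
	/* frame 5 is busy, everything else is free */
	bool map[12] = { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1 };

	printf("window starts at frame %zu\n",
	       demo_find_free_window(map, 1, 12, 4));
	return 0;
}
#endif
```

In the real series the window found this way would then be handed to
alloc_contig_range(), which may still fail and force the scan to continue.
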
> 
> One expected issue with that approach is that such gigantic contiguous
> regions tend to vanish as time goes by. The best way to avoid this for
> now is to make gigantic page allocations very early during boot, say
> from a init script. Other possible optimization include using compaction,
> which is supported by CMA but is not explicitly used by this commit.
> 
> It's also important to note the following:
> 
>   1. My target systems are x86_64 machines, so I have only tested 1GB
>  pages allocation/release. I did try to make this arch indepedent
>  and expect it to work on other archs but didn't try it myself
> 
>   2. I didn't add support for hugepage overcommit, that is allocating
>  a gigantic page on demand when
> /proc/sys/vm/nr_overcommit_hugepages > 0. The reason is that I don't
> think it's reasonable to do the hard and long work required for
> allocating a gigantic page at fault time. But it should be simple
> to add this if wanted
> 
> Signed-off-by: Luiz Capitulino 
> ---
>   arch/x86/include/asm/hugetlb.h |  10 +++
>   mm/hugetlb.c   | 177 
> ++---
>   2 files changed, 176 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
> index a809121..2b262f7 100644
> --- a/arch/x86/include/asm/hugetlb.h
> +++ b/arch/x86/include/asm/hugetlb.h
> @@ -91,6 +91,16 @@ static inline void arch_release_hugepage(struct page *page)
>   {
>   }
>   
> +static inline int arch_prepare_gigantic_page(struct page *page)
> +{
> + return 0;
> +}
> +
> +static inline void arch_release_gigantic_page(struct page *page)
> +{
> +}
> +
> +
>   static inline void arch_clear_hugepage_flags(struct page *page)
>   {
>   }
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 2c7a44a..c68515e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -643,11 +643,159 @@ static int hstate_next_node_to_free(struct hstate *h, 
> nodemask_t *nodes_allowed)
>   ((node = hstate_next_node_to_free(hs, mask)) || 1); \
>   nr_nodes--)
>   
> +#ifdef CONFIG_CMA
> +static void destroy_compound_gigantic_page(struct page *page,
> + unsigned long order)
> +{
> + int i;
> + int nr_pages = 1 << order;
> + struct page *p = page + 1;
> +
> + for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
> + __ClearPageTail(p);
> + set_page_refcounted(p);
> + p->first_page = NULL;
> + }
> +
> + set_compound_order(page, 0);
> + __ClearPageHead(page);
> +}
> +
> +static void free_gigantic_page(struct page *page, unsigned order)
> +{
> + free_contig_range(page_to_pfn(page), 1 << order);
> +}
> +
> +static int __alloc_gigantic_page(unsigned long start_pfn, unsigned long 
> count)
> +{
> + unsigned long end_pfn = start_pfn + count;
> + return alloc_contig_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
> +}
> +
> +static bool pfn_valid_gigantic(unsigned long pfn)
> +{
> + struct page *page;
> +
> + if (!pfn_valid(pfn))
> + return false;
> +
> + page = pfn_to_page(pfn);
> +
> + if (PageReserved(page))
> + return false;
> +
> + if (page_count(page) > 0)
> + return false;
> +
> + return true;
> +}
> +
> +static inline bool pfn_aligned_gigantic(unsigned long pfn, unsigned order)
> +{
> +

Re: [PATCH 1/2] devicetree: Add devicetree bindings documentation for Zynq Quad SPI

2014-04-03 Thread Harini Katakam
Hi Mark,

On Fri, Apr 4, 2014 at 2:31 AM, Mark Brown  wrote:
> On Thu, Apr 03, 2014 at 10:33:06PM +0530, Punnaiah Choudary Kalluri wrote:
>
>> +Optional properties:
>> +- num-cs : Number of chip selects used.
>
> What does this translate into?
>
>> + num-cs = /bits/ 16 <1>;
>
> Why the odd specification in the example - why not just specify it as a
> number?

Same as discussed on SPI cadence thread.

Regards,
Harini


Re: [PATCH v2 2/2] devicetree: Add devicetree bindings documentation for Cadence SPI

2014-04-03 Thread Harini Katakam
Hi Mark,

On Fri, Apr 4, 2014 at 3:04 AM, Mark Brown  wrote:
> On Thu, Apr 03, 2014 at 04:40:31PM +0530, Harini Katakam wrote:
>
>> +Optional properties:
>> +- num-cs : Number of chip selects used.
>
> How does this translate to the hardware?

This IP can drive 4 slaves.
The CS line to be driven is selected in spi device structure and
that is driven by the software.

>
>> + num-cs = /bits/ 16 <4>;
>
> What's going on with the /bits/ - is this something that's required for
> the property?

The master->num_chipselect field is 16 bits wide, but writing <4> here directly
leads to 0 being read by the 16-bit of_property_read (because the cell is
stored big endian). Instead of reading the property as u32 and then copying
it, we decided to do this.
This was discussed on v2 between Michal and Rob:
 +   num-chip-select = /bits/ 16 <4>;
>>
>> I was expecting you will comment this a little bit. :-)
>> Because all just reading this num-cs as 32bit and then
>> assigning this value to master->num_chipselect which is 16bit.
>
> Well, everyone else has that problem then. Obviously it takes a bit
> more care than just reading into a u32, but that is a kernel problem
> and not a problem of the binding.
They are not reading it directly with read_u32 but they are using
intermediate u32 value which is assigned to u16 which is fine.
This pattern is in most drivers(maybe all).
The point is if binding should or can't simplify driver code.
And from your reaction above I expect that it is up to driver
owner and binding doc how you want to do it.
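
The endianness point can be checked with a few lines of plain C. This is a
sketch, not driver code: the byte arrays model how the two spellings of the
property are laid out in the device tree blob, and demo_read_be16 and
demo_read_be32 are hypothetical helpers that decode the first bytes of a
property as big endian.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode a big-endian 16-bit value from the first two bytes of a property. */
static uint16_t demo_read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode a big-endian 32-bit value from the first four bytes of a property. */
static uint32_t demo_read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* num-cs = <4>;            one 32-bit big-endian cell */
static const uint8_t demo_prop_cell32[] = { 0x00, 0x00, 0x00, 0x04 };
/* num-cs = /bits/ 16 <4>;  one 16-bit big-endian value */
static const uint8_t demo_prop_bits16[] = { 0x00, 0x04 };

#ifdef DEMO_MAIN
int main(void)
{
	printf("<4> read as u16:            %u\n",
	       (unsigned)demo_read_be16(demo_prop_cell32));
	printf("<4> read as u32:            %u\n",
	       (unsigned)demo_read_be32(demo_prop_cell32));
	printf("/bits/ 16 <4> read as u16:  %u\n",
	       (unsigned)demo_read_be16(demo_prop_bits16));
	return 0;
}
#endif
```

A 16-bit read of the 32-bit cell sees only the two leading zero bytes, which
is the 0 mentioned above; either the /bits/ 16 spelling or a u32 read followed
by an assignment avoids it.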

Regards,
Harini


Re: [PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014-04-03 Thread Waiman Long

On 04/03/2014 01:23 PM, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 02, 2014 at 10:10:17PM -0400, Waiman Long wrote:
>> On 04/02/2014 04:35 PM, Waiman Long wrote:
>>> On 04/02/2014 10:32 AM, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Apr 02, 2014 at 09:27:29AM -0400, Waiman Long wrote:
>>>>> N.B. Sorry for the duplicate. This patch series were resent as the
>>>>>   original one was rejected by the vger.kernel.org list server
>>>>>   due to long header. There is no change in content.
>>>>>
>>>>> v7->v8:
>>>>>    - Remove one unneeded atomic operation from the slowpath, thus
>>>>>      improving performance.
>>>>>    - Simplify some of the codes and add more comments.
>>>>>    - Test for X86_FEATURE_HYPERVISOR CPU feature bit to enable/disable
>>>>>      unfair lock.
>>>>>    - Reduce unfair lock slowpath lock stealing frequency depending
>>>>>      on its distance from the queue head.
>>>>>    - Add performance data for IvyBridge-EX CPU.
>>>>
>>>> FYI, your v7 patch with 32 VCPUs (on a 32 cpu socket machine) on an
>>>> HVM guest under Xen after a while stops working. The workload
>>>> is doing 'make -j32' on the Linux kernel.
>>>>
>>>> Completely unresponsive. Thoughts?
>>>>
>>>
>>> Thank for reporting that. I haven't done that much testing on Xen.
>>> My focus was in KVM. I will perform more test on Xen to see if I
>>> can reproduce the problem.
>>>
>>
>> BTW, does the halting and sending IPI mechanism work in HVM? I saw
>
> Yes.
>
>> that in RHEL7, PV spinlock was explicitly disabled when in HVM mode.
>> However, this piece of code isn't in upstream code. So I wonder if
>> there is problem with that.
>
> The PV ticketlock fixed it for HVM. It was disabled before because
> the PV guests were using bytelocks while the HVM were using ticketlocks
> and you couldnt' swap in PV bytelocks for ticketlocks during startup.


The RHEL7 code has used PV ticketlock already. RHEL7 uses a single 
kernel for all configurations. So PV ticketlock as well as Xen and KVM 
support was compiled in. I think booting the kernel on bare metal will 
cause the Xen code to work in HVM mode thus activating the PV spinlock 
code which has a negative impact on performance. That may be why it was 
disabled so that the bare metal performance will not be impacted.


BTW, could you send me more information about the configuration of the 
machine, like the .config file that you used?


-Longman


Re: [PATCH 1/2] nohz: use seqlock to avoid race on idle time stats v2

2014-04-03 Thread Hidetoshi Seto
(2014/04/03 18:51), Denys Vlasenko wrote:
> On Thu, Apr 3, 2014 at 9:02 AM, Hidetoshi Seto
>  wrote:
>>>> [PROBLEM 2]: broken iowait accounting.
>>>>
>>>> As historical nature, cpu's idle time was accounted as either
>>>> idle or iowait depending on the presence of tasks blocked by
>>>> I/O. No one complain about it for a long time. However:
>>>>
>>>>   > Still trying to wrap my head around it, but conceptually
>>>>   > get_cpu_iowait_time_us() doesn't make any kind of sense.
>>>>   > iowait isn't per cpu since effectively tasks that aren't
>>>>   > running aren't assigned a cpu (as Oleg already pointed out).
>>>>   -- Peter Zijlstra
>>>>
>>>> Now some kernel folks realized that accounting iowait as per-cpu
>>>> does not make sense in SMP world. When we were in traditional
>>>> UP era, cpu is considered to be waiting I/O if it is idle while
>>>> nr_iowait > 0. But in these days with SMP systems, tasks easily
>>>> migrate from a cpu where they issued an I/O to another cpu where
>>>> they are queued after I/O completion.
>>>
>>> However, if we would put ourselves into admin's seat, iowait
>>> immediately starts to make sense: for admin, the system state
>>> where a lot of CPU time is genuinely idle is qualitatively different
>>> form the state where a lot of CPU time is "idle" because
>>> we are I/O bound.
>>>
>>> Admins probably wouldn't insist that iowait accounting must be
>>> very accurate. I would hazard to guess that admins would settle
>>> for the following rules:
>>>
>>> * (idle + iowait) should accurately represent amount of time
>>> CPUs were idle.
>>> * both idle and iowait should never go backwards
>>> * when system is truly idle, only idle should increase
>>> * when system is truly I/O bound on all CPUs, only iowait should increase
>>> * when the situation is in between of the above two cases,
>>> both iowait and idle counters should grow. It's ok if they
>>> represent idle/IO-bound ratio only approximately
>>
>> Yep. Admins are at the mercy of iowait value, though they know it
>> is not accurate.
>>
>> Assume there are task X,Y,Z (X issues io, Y sleeps moderately,
>> and Z has low priority):
>>
>> Case 1:
>>   cpu A: <--run X--><--iowait--><--run X--><--iowait--><--run X ...
>>   cpu B: <---run Y--><--run Z--><--run Y--><--run Z--><--run Y ...
>>   io:   <-- io X -->   <-- io X -->
>>
>> Case 2:
>>   cpu A: <--run X--><--run Z---><--run X--><--run Z---><--run X ...
>>   cpu B: <---run Y---><--idle--><---run Y---><--idle--><--run Y ...
>>   io:   <-- io X -->   <-- io X -->
>>
>> So case 1 tends to be accounted as iowait while case 2 is accounted
>> as idle, despite almost the same workloads. Then what should admins do...?
> 
> This happens with current code too, right?
> No regression then.

Yes, problem 2 is not a regression. As I stated in the first place,
it is a fundamental problem of the current iowait accounting, and my
patch set does not aim to solve problem 2.
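
To make the case 1 / case 2 contrast above concrete, here is a toy
userspace model of the per-cpu rule (an idle slot is charged to iowait
if that cpu's nr_iowait > 0, else to idle). It is only a sketch: the
struct and function names are hypothetical, time is cut into four
equal slots, and nothing here is taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>

/* One time slot of one cpu in this toy model (hypothetical names). */
struct slot {
	int running;   /* 1 if some task runs on this cpu during the slot */
	int nr_iowait; /* tasks blocked on I/O that last ran on this cpu */
};

struct idle_stats {
	unsigned idle;
	unsigned iowait;
};

/* Charge every non-running slot to iowait or idle, per-cpu style. */
static struct idle_stats account(const struct slot *s, size_t n)
{
	struct idle_stats st = { 0, 0 };

	assert(s != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (s[i].running)
			continue;
		if (s[i].nr_iowait > 0)
			st.iowait++;
		else
			st.idle++;
	}
	return st;
}

/* Case 1: X blocks on cpu A while A has nothing else to run. */
static const struct slot case1_cpu_a[] = { {1, 0}, {0, 1}, {1, 0}, {0, 1} };
static const struct slot case1_cpu_b[] = { {1, 0}, {1, 0}, {1, 0}, {1, 0} };

/* Case 2: Z fills cpu A while X blocks, so cpu B is the one that idles. */
static const struct slot case2_cpu_a[] = { {1, 0}, {1, 1}, {1, 0}, {1, 1} };
static const struct slot case2_cpu_b[] = { {1, 0}, {0, 0}, {1, 0}, {0, 0} };
```

In this model case 1 yields two slots of iowait and none of idle,
while case 2 yields two slots of idle and none of iowait, although the
tasks and the I/O pattern are the same.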

>>>> Back to NO_HZ mechanism. Totally terrible thing here is that
>>>> observer need to handle duration "delta" without knowing that
>>>> nr_iowait of sleeping cpu can be changed easily by migration
>>>> even if cpu is sleeping.
>>>
>>> How about the following: when CPU enters idle, it remembers
>>> in struct tick_sched->idle_active whether it was "truly" idle
>>> or I/O bound: something like
>>>
>>> ts->idle_active = nr_iowait_cpu(smp_processor_id()) ? 2 : 1;
>>>
>>> Then, when we exit idle, we account entire idle period as
>>> "true" idle or as iowait depending on ts->idle_active value,
>>> regardless of what happened to I/O bound task (whether
>>> it migrated or not).
>>
>> That will not be acceptable. A CPU can sleep for a significantly long
>> time after all I/O bound tasks have migrated away, e.g.:
>>
>> cpu A: <-run X-><-- iowait ---... (few days) ...--><-run Z ..
>> cpu B: <-run X->..
>> io: <-io X->
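
The idle_active latch proposed above, and the objection to it, can be
sketched in userspace as follows. This is a minimal model under stated
assumptions: struct tick_model and the model_*() helpers are
hypothetical stand-ins, not the kernel's tick_sched code, and time is
passed in explicitly as microseconds.

```c
#include <assert.h>
#include <stdint.h>

enum idle_kind { NOT_IDLE = 0, IDLE_TRUE = 1, IDLE_IOWAIT = 2 };

/* Hypothetical stand-in for the per-cpu idle bookkeeping. */
struct tick_model {
	enum idle_kind idle_active; /* state latched when idle was entered */
	uint64_t entry_us;          /* timestamp of idle entry */
	uint64_t idle_us;           /* accumulated "true" idle time */
	uint64_t iowait_us;         /* accumulated iowait time */
};

/* Latch the idle flavour once, based on nr_iowait at idle entry. */
static void model_start_idle(struct tick_model *ts, uint64_t now_us,
			     unsigned nr_iowait)
{
	ts->idle_active = nr_iowait ? IDLE_IOWAIT : IDLE_TRUE;
	ts->entry_us = now_us;
}

/* Charge the whole idle period to whatever was latched at entry. */
static void model_stop_idle(struct tick_model *ts, uint64_t now_us)
{
	uint64_t delta;

	assert(now_us >= ts->entry_us);
	delta = now_us - ts->entry_us;
	if (ts->idle_active == IDLE_IOWAIT)
		ts->iowait_us += delta;
	else if (ts->idle_active == IDLE_TRUE)
		ts->idle_us += delta;
	ts->idle_active = NOT_IDLE;
}
```

If a cpu enters idle with nr_iowait == 1 and wakes up three days
later, model_stop_idle() charges all three days to iowait_us, even if
the blocked task woke up on another cpu after a few milliseconds.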
> 
> Does task migrate from an *idle* CPU? If yes, why?
> Since its CPU is idle (i.e. immediately available
> for it to be scheduled on),
> I would imagine normally IO-blocked task stays
> on its CPU's rq if it is idle.

I found an answer from Peter Zijlstra in the following thread:
[PATCH RESEND 0/4] nohz: Fix racy sleeptime stats
https://lkml.org/lkml/2013/8/16/274

(Sorry, I could not reach lkml.org today due to a network error, so
 I could not get a direct link to the reply in question. I hope you
 can find it from the parent post at the link above. I quote the
 important part instead.)

 
> Option B:
> 
>> Or we can live with that and still account the whole idle time slept until
>> tick_nohz_stop_idle() to iowait if we called tick_nohz_start_idle() with
>> nr_iowait > 0.
>> All we need is just a new field in ts-> that records on which state we
>> entered idle.
>>
>> What do you think?
> 
> I think option B is unworkable. Afaict it could basically cause
> unlimited iowait time. Suppose we have a load-balancer that tries it
> bestestest to sort-left (ie. run a task on the lowest 'free' cpu
> possible) -- the

Re: [BUG] x86: reboot doesn't reboot

2014-04-03 Thread Li, Aubrey
On 2014/4/4 10:16, Steven Rostedt wrote:
> On Fri, 04 Apr 2014 07:52:53 +0800
> "Li, Aubrey"  wrote:
> 
>> On 2014/4/4 7:40, Steven Rostedt wrote:
>>> On Fri, 04 Apr 2014 07:23:32 +0800
>>> "Li, Aubrey"  wrote:
>>>
>>>> Can you please send the dmi table out?
>>>
>>> I already did as a gz attachment to H. Peter. You were on the Cc, did
>>> you not receive it?
>>>
>> Oh, I got it. This is a preproduction machine.
>> When reboot fails via a method (=e or =p), there are two cases:
>>
>> Case 1: the method does nothing and passes the attempt to the next method
>> Case 2: the method hangs the system
>>
>> I want to know if CF9 is case 1 or case 2. Could you please try the following
>> patch *without* any reboot parameters?
>>
>> (1) If we don't see any string, then EFI hangs your box.
>> (2) If we see the first string but not the second one, CF9 hangs your box.
>> (3) Seeing both should not happen, because BIOS works on your box.
> 
> Here's the output:
> 
> [  114.445327] reboot: Restarting system
> [  114.449002] reboot: machine restart
> [  114.453495] reboot: reboot via CF9...
> 

Thanks Steven, I got what I want.
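
The debugging approach used above (print a marker before each reboot
method is tried, then read which marker was printed last) can be
sketched as a small userspace model. Everything here is hypothetical:
the method names, their order and their simulated results are
assumptions for illustration, not the actual x86 reboot code or the
test patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum attempt_result { METHOD_NOOP, METHOD_HANGS, METHOD_REBOOTS };

struct reboot_method {
	const char *name;
	enum attempt_result (*attempt)(void);
};

/* Simulated methods; on real hardware a hang would simply never return. */
static enum attempt_result sim_noop(void)    { return METHOD_NOOP; }
static enum attempt_result sim_hangs(void)   { return METHOD_HANGS; }
static enum attempt_result sim_reboots(void) { return METHOD_REBOOTS; }

/* Assumed order for illustration only. */
static const struct reboot_method example_chain[] = {
	{ "EFI",  sim_noop },
	{ "CF9",  sim_hangs },
	{ "BIOS", sim_reboots },
};

/*
 * Log each method before trying it. Returns the index of the method at
 * which the walk stopped (hang or reboot), or -1 if every method was a
 * no-op. The last line logged names the method that did not fall through.
 */
static int walk_chain(const struct reboot_method *m, size_t n)
{
	assert(m != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		fprintf(stderr, "reboot: trying %s...\n", m[i].name);
		if (m[i].attempt() != METHOD_NOOP)
			return (int)i;
	}
	return -1;
}
```

With the simulated chain above, the walk stops at index 1, and the
last marker printed names the method that hung, which is the same
reasoning as reading the last line of the console log.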



Re: [PATCH 2/5] ARM: add SMP support for Broadcom mobile SoCs

2014-04-03 Thread Alex Elder
On 04/03/2014 09:18 PM, Alex Elder wrote:
> This patch adds SMP support for BCM281XX and BCM21664 family SoCs.
> 
> This feature is controlled with a distinct config option such that a
> SMP-enabled multi-v7 binary can be configured to run these SoCs in
> uniprocessor mode.  Since this SMP functionality is used for
> multiple Broadcom mobile chip families the config option is called
> ARCH_BCM_MOBILE_SMP (for lack of a better name).
> 
> On SoCs of this type, the secondary core is not held in reset on
> power-on.  Instead it loops in a ROM-based holding pen.  To release
> it, one must write into a special register a jump address whose
> low-order bits have been replaced with a secondary core's id, then
> trigger an event with SEV.  On receipt of an event, the ROM code
> will examine the register's contents, and if the low-order bits
> match its cpu id, it will clear them and write the value back to the
> register just prior to jumping to the address specified.
> 
> The location of the special register is defined in the device tree
> using a "secondary-boot-reg" property in a node whose "enable-method"
> matches.
> 
> Derived from code originally provided by Ray Jui 
> 
> Signed-off-by: Alex Elder 
> ---
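
The holding-pen handshake described in the commit message can be
modelled in plain C as below. This is a sketch under assumptions: the
message does not say how many low-order bits carry the core id, so the
two-bit CPUID_MASK is an assumption, and boot_reg_encode() and
rom_on_event() are hypothetical names, not functions from the patch or
the ROM.

```c
#include <assert.h>
#include <stdint.h>

#define CPUID_MASK 0x3u /* assumption: two low-order bits hold the core id */

/* Kernel side: build the value to write into the secondary boot register. */
static uint32_t boot_reg_encode(uint32_t jump_addr, uint32_t cpu_id)
{
	assert((jump_addr & CPUID_MASK) == 0); /* address must leave id bits free */
	assert(cpu_id <= CPUID_MASK);
	return jump_addr | cpu_id;
}

/*
 * ROM side, run by a core in the holding pen when an event arrives.
 * Returns the address to jump to, or 0 if the request is for another
 * core (this model assumes 0 is never a valid jump address).
 */
static uint32_t rom_on_event(uint32_t *boot_reg, uint32_t my_cpu_id)
{
	uint32_t val = *boot_reg;

	if ((val & CPUID_MASK) != my_cpu_id)
		return 0;           /* not ours, keep waiting */
	val &= ~CPUID_MASK;         /* clear the id bits ... */
	*boot_reg = val;            /* ... and write the value back */
	return val;                 /* then jump to this address */
}
```

The write-back of the cleared value is what lets the releasing side
observe that the secondary core has picked up the request.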

. . .

> diff --git a/arch/arm/mach-bcm/Makefile b/arch/arm/mach-bcm/Makefile
> index b2279e3..929579f 100644
> --- a/arch/arm/mach-bcm/Makefile
> +++ b/arch/arm/mach-bcm/Makefile
> @@ -15,7 +15,10 @@ obj-$(CONFIG_ARCH_BCM_281XX)   += board_bcm281xx.o
>  plus_sec := $(call as-instr,.arch_extension sec,+sec)
>  
>  # BCM21664
> -obj-$(CONFIG_ARCH_BCM_21664) += board_bcm21664.o
> +obj-$(CONFIG_ARCH_BCM_21664) := board_bcm21664.o

The above was a mistake, it should still be +=.

(I'll fix it.)

. . .



[PATCH 1/5] ARM: introduce CPU_METHOD_OF_DECLARE_SETUP()

2014-04-03 Thread Alex Elder
The CPU_METHOD_OF_DECLARE() macro allows methods for assigning
SMP/hotplug operations to CPUs to be defined using device tree,
without the need for machine-dependent code.

And although it allows the *method* to be specified, it does *not*
allow any parameterization of that method.  For example, there is no
efficient way to define a machine-specific address or other
property one might want to define for secondary CPUs.

Define a new of_cpu_method->setup() function, which (if defined) is
called for nodes found having a matching "enable-method" property.
The matching node is supplied as the function's argument, allowing
additional required information to be extracted from that node.
A new macro CPU_METHOD_OF_DECLARE_SETUP() allows a setup method
to be supplied when a method is declared.

Extend the interface for set_smp_ops_by_method() so that it can
return a negative error code to allow DT parsing errors to be
reported by the setup function.

(Note that only the first "cpu" (or "cpus") node having a matching
method is used by set_smp_ops_by_method(); this logic is not
changed.)

Signed-off-by: Alex Elder 
---
 arch/arm/include/asm/smp.h |   10 --
 arch/arm/kernel/devtree.c  |   31 +--
 2 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 2ec765c..ab4a5a9 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -115,15 +115,21 @@ struct smp_operations {
 #endif
 };
 
+struct device_node;
 struct of_cpu_method {
 	const char *method;
+	int (*setup)(struct device_node *node);
 	struct smp_operations *ops;
 };
 
-#define CPU_METHOD_OF_DECLARE(name, _method, _ops)			\
+#define CPU_METHOD_OF_DECLARE_SETUP(name, _method, _setup, _ops)	\
 	static const struct of_cpu_method __cpu_method_of_table_##name	\
 		__used __section(__cpu_method_of_table)			\
-		= { .method = _method, .ops = _ops }
+		= { .method = _method, .setup = _setup, .ops = _ops }
+
+#define CPU_METHOD_OF_DECLARE(name, _method, _ops)			\
+	CPU_METHOD_OF_DECLARE_SETUP(name, _method, NULL, _ops)
+
 /*
  * set platform specific SMP operations
  */
diff --git a/arch/arm/kernel/devtree.c b/arch/arm/kernel/devtree.c
index c7419a5..1a0cca3 100644
--- a/arch/arm/kernel/devtree.c
+++ b/arch/arm/kernel/devtree.c
@@ -76,11 +76,18 @@ static int __init set_smp_ops_by_method(struct device_node *node)
 	if (of_property_read_string(node, "enable-method", &method))
 		return 0;
 
-	for (; m < __cpu_method_of_table_end; m++)
+	for (; m < __cpu_method_of_table_end; m++) {
 		if (!strcmp(m->method, method)) {
-			smp_set_ops(m->ops);
-			return 1;
+			int ret = 0;
+
+			if (m->setup)
+				ret = m->setup(node);
+			if (!ret)
+				smp_set_ops(m->ops);
+
+			return ret ? ret : 1;
 		}
+	}
 
 	return 0;
 }
@@ -181,16 +188,28 @@ void __init arm_dt_init_cpu_maps(void)
 
 		tmp_map[i] = hwid;
 
-		if (!found_method)
+		if (!found_method) {
 			found_method = set_smp_ops_by_method(cpu);
+			if (WARN(found_method < 0,
+					"error %d getting enable-method for "
+					"DT /cpu %u\n", found_method, cpuidx)) {
+				return;
+			}
+		}
 	}
 
 	/*
 	 * Fallback to an enable-method in the cpus node if nothing found in
 	 * a cpu node.
 	 */
-	if (!found_method)
-		set_smp_ops_by_method(cpus);
+	if (!found_method) {
+		found_method = set_smp_ops_by_method(cpus);
+		if (WARN(found_method < 0,
+				"error %d getting enable-method for "
+				"DT /cpus node\n", found_method)) {
+			return;
+		}
+	}
 
 	if (!bootcpu_valid) {
 		pr_warn("DT missing boot CPU MPIDR[23:0], fall back to default cpu_logical_map\n");
-- 
1.7.9.5
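
For readers who want to see the return convention of the patched
set_smp_ops_by_method() in isolation (negative on setup error, 1 when
a method matched and was installed, 0 when nothing matched), here is a
userspace model. It is only a sketch: struct device_node and struct
smp_operations are reduced to stubs, the "vendor,..." method strings
and demo_setup() are hypothetical, and the table is a plain array
rather than a linker section.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stubs standing in for the kernel types (hypothetical fields). */
struct device_node { const char *enable_method; int has_boot_reg; };
struct smp_operations { const char *name; };

struct of_cpu_method {
	const char *method;
	int (*setup)(struct device_node *node); /* optional, may be NULL */
	struct smp_operations *ops;
};

static struct smp_operations *current_ops; /* what smp_set_ops() would install */

/* Example setup hook: fails if the node lacks the property it needs. */
static int demo_setup(struct device_node *node)
{
	return node->has_boot_reg ? 0 : -EINVAL;
}

static struct smp_operations demo_ops = { "demo" };
static struct smp_operations plain_ops = { "plain" };

static const struct of_cpu_method method_table[] = {
	{ "vendor,demo-cpu-method",  demo_setup, &demo_ops },
	{ "vendor,plain-cpu-method", NULL,       &plain_ops },
};

/* Mirrors the patched logic: run setup first, install ops only on success. */
static int set_ops_by_method(struct device_node *node)
{
	assert(node != NULL);
	if (!node->enable_method)
		return 0;

	for (size_t i = 0; i < sizeof(method_table) / sizeof(method_table[0]); i++) {
		const struct of_cpu_method *m = &method_table[i];
		int ret = 0;

		if (strcmp(m->method, node->enable_method))
			continue;
		if (m->setup)
			ret = m->setup(node);
		if (!ret)
			current_ops = m->ops;
		return ret ? ret : 1;
	}
	return 0;
}
```

A caller can then tell "no method found" (0) apart from "method found
but its setup failed" (negative), which is what the WARN() calls added
in arm_dt_init_cpu_maps() rely on.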



[PATCH 5/5] ARM: dts: enable SMP support for bcm21664

2014-04-03 Thread Alex Elder
Define nodes representing the two Cortex A9 CPUs in a bcm21664 SoC.

Signed-off-by: Alex Elder 
---
 arch/arm/boot/dts/bcm21664.dtsi |   19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/bcm21664.dtsi b/arch/arm/boot/dts/bcm21664.dtsi
index 08a44d4..a37ded1 100644
--- a/arch/arm/boot/dts/bcm21664.dtsi
+++ b/arch/arm/boot/dts/bcm21664.dtsi
@@ -25,6 +25,25 @@
 		bootargs = "console=ttyS0,115200n8";
 	};
 
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		enable-method = "brcm,bcm11351-cpu-method";
+		secondary-boot-reg = <0x35004178>;
+
+		cpu0: cpu@0 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a9";
+			reg = <0>;
+		};
+
+		cpu1: cpu@1 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a9";
+			reg = <1>;
+		};
+	};
+
 	gic: interrupt-controller@3ff00100 {
 		compatible = "arm,cortex-a9-gic";
 		#interrupt-cells = <3>;
-- 
1.7.9.5


