Re: [PATCH RFC] Coccinelle: Check for return not matching function signature

2015-05-07 Thread Nicholas Mc Guire
On Tue, 05 May 2015, Julia Lawall wrote:

> 
> 
> On Tue, 5 May 2015, Nicholas Mc Guire wrote:
> 
> > On Tue, 05 May 2015, Julia Lawall wrote:
> >
> > > > +@match@
> > > > +identifier f,ret;
> > > > +position p;
> > > > +type T1,T2;
> > > > +@@
> > > > +
> > > > +T1 f(...) {
> > > > + T2 ret;
> > > > +<+...
> > > > +* return@p ret
> > > > +;
> > > > +...+>
> > > > +}
> > >
> > > Given the number of results, it may seem surprising, but I think that you
> > > > are actually missing a lot of results.  Because you require that ret be the
> > > first variable that is declared in the function.  Also, you require that
> > > ret be an identifier.  If you want to keep the restriction about being an
> > > identifier, you could put:
> > >
> > > @match exists@
> > > type T1,T2;
> > > idexpression T2 ret;
> 
> I was thinking that you don't want expression in general, because for all
> constants that will give you int.
> 
> You can of course put return C; for constant metavariable C in the
> disjunction to avoid that possibility.
>
looks a lot better and removed a lot of false positives - the main problem
now is managing classification of the kernel's type "system" - seems like there
are at least 5 ways to describe every type (except for enum) and infinitely
many possible assignments for ssize_t ...

here is a little summary of the outputs - might be motivation to put some
quite simple scanners into mainline to catch such issues.

comparison of return types in functions to the function's signature for
kernel 4.1-rc2, glibc-2.9 and busybox 1.2.2.1 - no particular reason for
those glibc/busybox versions, they just happened to be on my hard drive.

This is using the version that was fixed by Julia Lawall

@match exists@ 
type T1,T2;
idexpression T1 ok;
idexpression T2 ret;
identifier f;
constant C;
position p;
@@

T1 f(...) {
<+...
(
return ok;
|
return C;
|
return@p ret;
)
...+>
}


component   Nr funcs   != type      %
kernel    :   374600     10727   2.85
glibc     :     9184       268   2.92
busybox   :     3645        43   1.18

                        kernel   glibc   busybox   criticality
wrong ?               :      8       4         0   not sure
sign mismatch         :   2279      30         9   critical
down sized            :    435      49         4   critical
up sized              :    910      20         3   ugly
declaration mismatch  :   7095     165        27   wishlist

wrong: seems plain wrong like float assigned to int (did not check details yet)
sign mismatch: assigning signed types to unsigned or vice versa
down sized: some form of possible truncation like u64 being assigned to u32
up sized: non-critical as it was the correct type and it fit
declaration mismatch: means that they were named differently, e.g. s32/int
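
To make the categories above concrete, here is a minimal user-space sketch with one function per category. It is not taken from the scanned code bases: the function names are hypothetical, and the s32 typedef only stands in for the kernel type of the same name.

```c
#include <limits.h>
#include <stdint.h>

/* sign mismatch: a signed local is returned through an unsigned signature */
unsigned int sign_mismatch(void)
{
	int ret = -1;		/* e.g. a negative error code */

	return ret;		/* silently converted to UINT_MAX */
}

/* down sized: a 64-bit local is returned through a 32-bit signature */
uint32_t down_sized(void)
{
	uint64_t ret = 0x100000001ULL;

	return ret;		/* the upper 32 bits are dropped */
}

/* up sized: a 32-bit local is returned through a 64-bit signature */
uint64_t up_sized(void)
{
	uint32_t ret = 42;

	return ret;		/* the value is preserved */
}

/* declaration mismatch: same underlying type, different name */
typedef int32_t s32;		/* stand-in for the kernel's s32 */

int declaration_mismatch(void)
{
	s32 ret = 7;

	return ret;		/* harmless where int is 32 bits wide */
}
```

All four compile without complaint under default warning levels, which is why a scanner is useful here: only the first two change the value the caller sees.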

Some limitations:
The glibc runs produced some error cases (spatch level) that were ignored
for now e.g.:

EXN:Failure("match: node 194: return ...[1,2,90] in rec_dirsearch reachable
by inconsistent control-flow paths")

The kernel numbers are a bit inaccurate because not all types can be
checked reliably - e.g. when they are config dependent. Also, due to the
enormous type-"system" in the kernel, not all assignments are certain,
but that does not change the overall result.
I did not yet manage to automate the classification - just too many types
where it's hard to say due to config dependencies - probably need to put
those into a "don't know" category. Also, all assignments of pointers of
any type on one side to void * on the other side were counted as legitimate.
Some results were mangled, probably because of inaccurate filtering resulting
in things like "_EXTERN_INLINE != mp_limb_t" - just dropped those for now.

Conclusion:
* at least the sign mismatch cases (2279) and potentially truncating
  assignments (435) are problematic.
* the scripts need a lot more cleanup in the classification of the reported
  types to be useful
* probably not realistic to clean up all currently present type mismatches
  but scanning continuously and reporting before it goes into mainline or
  integrating such a check in the routine submission process seems
  worthwhile

 Once the classifier is working properly I'll post the next version.

thx!
hofrat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why isn't IRQ shared for i2c-ocore

2015-05-07 Thread Lee Jones
On Fri, 08 May 2015, Geert Uytterhoeven wrote:

> On Thu, May 7, 2015 at 9:01 AM, Lee Jones  wrote:
> >> I have a follow up question regarding interrupt. I see many I2C bus drivers
> >> request interrupt with flag = 0. Why not using IRQF_SHARED?
> >
> > Probably because that particular IRQ is only used by the I2C
> > Controller.  I'm not exactly sure what you're getting at?  Why do you
> > think it should be shared?  You should only flag it as shared if it
> > is.
> 
> However, that's something the driver can't know.
> Sharing interrupts is an integration property. The same IP core may share its
> interrupt on one SoC, and not on another.

I guess that would depend on the IP.  If this is part of an MFD, you'd
know if you only have a single interrupt line coming into the chip or
not.  If the IP can be moved around (copy & pasted) into different
chips, then yes, that might change.

How does one share an interrupt with other drivers if all of them don't
know the IRQ is shared though?
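
For reference, the usual contract on a shared line is that every driver requests it with IRQF_SHARED and a unique dev_id, and each handler returns IRQ_NONE when its own device did not raise the interrupt. The sketch below models only that dispatch logic in user space. It is not kernel code: fake_dev, irq_action, fake_handler and dispatch_shared_irq are hypothetical names made up for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

/* Hypothetical device model: 'pending' stands in for a status register. */
struct fake_dev {
	bool pending;
	int handled;		/* interrupts this device has serviced */
};

typedef enum irqreturn (*handler_fn)(void *dev_id);

/* One registered handler on the shared line, with its cookie. */
struct irq_action {
	handler_fn handler;
	void *dev_id;
};

/* A handler on a shared line first checks whether its device raised it. */
enum irqreturn fake_handler(void *dev_id)
{
	struct fake_dev *dev = dev_id;

	if (!dev->pending)
		return IRQ_NONE;	/* not ours, let the next handler look */

	dev->pending = false;		/* acknowledge the interrupt */
	dev->handled++;
	return IRQ_HANDLED;
}

/* Call every handler on the line; return how many claimed the interrupt. */
int dispatch_shared_irq(struct irq_action *actions, size_t n)
{
	int claimed = 0;

	for (size_t i = 0; i < n; i++)
		if (actions[i].handler(actions[i].dev_id) == IRQ_HANDLED)
			claimed++;
	return claimed;
}
```

In this model a driver that registered without agreeing to share would simply not be in the actions array, which is the situation the question above is about.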

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [Intel-gfx] [WARNING 4.1-rc2] i915: Unclaimed register detected before writing to register 0xc4040

2015-05-07 Thread Daniel Vetter
On Thu, May 7, 2015 at 9:40 PM, Steven Rostedt  wrote:
>  [ cut here ]
>  WARNING: CPU: 2 PID: 0 at 
> /work/autotest/nobackup/linux-test.git/drivers/gpu/drm/i915/intel_uncore.c:566
>  hsw_unclaimed_reg_debug.isra.10+0x6c/0x84()
>  Unclaimed register detected before writing to register 0xc4040
>  Modules linked in: microcode r8169
>  CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.1.0-rc2-test+ #4
>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>   0236 880215203c68 81ba8161 
>   880215203cb8 880215203ca8 81080dbb 
>   81626bc6 88020de60068 000c4040 0001
>  Call Trace:
> [] dump_stack+0x4c/0x65
>   [] warn_slowpath_common+0xa1/0xbb
>   [] ? hsw_unclaimed_reg_debug.isra.10+0x6c/0x84
>   [] warn_slowpath_fmt+0x46/0x48
>   [] hsw_unclaimed_reg_debug.isra.10+0x6c/0x84
>   [] hsw_write32+0x86/0xcf
>   [] cpt_irq_handler+0x1e8/0x1f5
>   [] ivb_display_irq_handler+0xf4/0x11b
>   [] ironlake_irq_handler+0x187/0x24d
>   [] handle_irq_event_percpu+0xf7/0x2a3
>   [] handle_irq_event+0x41/0x64
>   [] handle_edge_irq+0xa0/0xb9
>   [] handle_irq+0x11d/0x128
>   [] ? atomic_notifier_call_chain+0x14/0x16
>   [] do_IRQ+0x4e/0xc4
>   [] common_interrupt+0x70/0x70
> [] ? cpuidle_enter_state+0xd8/0x135
>   [] ? cpuidle_enter_state+0xd4/0x135
>   [] cpuidle_enter+0x17/0x19
>   [] cpuidle_idle_call+0xf2/0x180
>   [] cpu_idle_loop+0x12b/0x164
>   [] cpu_startup_entry+0x13/0x14
>   [] start_secondary+0x102/0x106
>   [] ? set_cpu_sibling_map+0x35e/0x35e
>  ---[ end trace 77c6a96cf41e96d1 ]---
>
> I'm still triggering warnings in the i915 code. :-(

Please retry with snd-hda-intel blacklisted. At least last time I
checked that was the only culprit left, i915 is just the messenger
here. The other one was stupid things done by the BIOS, but we should have
been clearing that up correctly for a long time now.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [PATCH v3 0/2] clk: improve handling of orphan clocks

2015-05-07 Thread Stephen Boyd
On 05/07, Kevin Hilman wrote:
> On Thu, May 7, 2015 at 2:03 PM, Stephen Boyd  wrote:
> > On 05/07/15 08:17, Kevin Hilman wrote:
> >> On Fri, May 1, 2015 at 4:40 PM, Stephen Boyd  wrote:
> >>> On 05/01/15 15:07, Heiko Stübner wrote:
> >>>> Am Freitag, 1. Mai 2015, 13:52:47 schrieb Stephen Boyd:
> >>>>
> >>>>>> Instead I guess we could hook it less deep into clk_get_sys, like in the
> >>>>>> following patch?
> >>>>> It looks like it will work at least, but still I'd prefer to keep the
> >>>>> orphan check contained to clk.c. How about this compile tested only patch?
> >>>> I gave this a spin on my rk3288-firefly board. It still boots, the clock tree
> >>>> looks the same and it also still defers nicely in the scenario I needed it
> >>>> for. The implementation also looks nice - and of course much more compact than
> >>>> my check in two places :-) . I don't know if you want to put this as follow-up
> >>>> on top or fold it into the original orphan-check, so in any case
> >>>>
> >>>> Tested-by: Heiko Stuebner 
> >>>> Reviewed-by: Heiko Stuebner 
> >>> Thanks. I'm leaning towards tossing your patch 2/2 and replacing it with
> >>> my patch and a note that it's based on an earlier patch from you.
> >> It appears this has landed in linux-next in the form of 882667c1fcf1
> >> clk: prevent orphan clocks from being used.  A bunch of boot failures
> >> for sunxi in today's linux-next[1] were bisected down to that patch.
> >>
> >> I confirmed that reverting that commit on top of next/master gets
> >> sunxi booting again.
> >>
> >>
> >
> > Thanks for the report. I've removed the two clk orphan patches from
> > clk-next. Would it be possible to try with next-20150507 and
> > clk_ignore_unused on the command line?
> 
> That doesn't help.  I tried on cubieboard2 and bananapi.

Thanks for trying.

> 
> > Also we can try to see if
> > critical clocks aren't being forced on by applying this patch and
> > looking for clk_get() failures
> 
> From cubieboard2, there's a few that look rather important:
> 
> [0.00] Additional per-CPU info printed with stalls.
> [0.00] Build-time adjustment of leaf fanout to 32.
> [0.00] RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=2.
> [0.00] RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=2
> [0.00] NR_IRQS:16 nr_irqs:16 16
> [0.00] clk: couldn't get parent clock 0 for /clocks/ahb@01c20054
> [0.00] Failed to enable critical clock cpu
> [0.00] Failed to enable critical clock pll5_ddr
> [0.00] Failed to enable critical clock ahb_sdram
> [0.00] Architected cp15 timer(s) running at 24.00MHz (virt).

Ok. So it seems we need to come up with some solution to the
"critical clocks" problem that doesn't require the individual
clock drivers to call clk_prepare_enable().

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH] clk: bindings: Fix assigned-clock-rates description

2015-05-07 Thread Stephen Boyd
The binding uses assigned-clock-parents when it should use
assigned-clock-rates. Furthermore, the part that describes how
they relate to the assigned-clocks property is not clear about
what is related. Correct and clarify this part of the binding.

Reported-by: Krzysztof Kozlowski 
Signed-off-by: Stephen Boyd 
---

On 05/06, Krzysztof Kozlowski wrote:
> Looks much better. So actually this should be yours patch now, you
> can add my Reported-by :).

Ok. I'll queue up this patch unless there are any other objections.

 Documentation/devicetree/bindings/clock/clock-bindings.txt | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/clock-bindings.txt b/Documentation/devicetree/bindings/clock/clock-bindings.txt
index 06fc6d541c89..2ec489eebe72 100644
--- a/Documentation/devicetree/bindings/clock/clock-bindings.txt
+++ b/Documentation/devicetree/bindings/clock/clock-bindings.txt
@@ -138,9 +138,10 @@ Some platforms may require initial configuration of default parent clocks
 and clock frequencies. Such a configuration can be specified in a device tree
 node through assigned-clocks, assigned-clock-parents and assigned-clock-rates
 properties. The assigned-clock-parents property should contain a list of parent
-clocks in form of phandle and clock specifier pairs, the assigned-clock-parents
-property the list of assigned clock frequency values - corresponding to clocks
-listed in the assigned-clocks property.
+clocks in the form of a phandle and clock specifier pair and the
+assigned-clock-rates property should contain a list of frequencies in Hz. Both
+these properties should correspond to the clocks listed in the assigned-clocks
+property.
 
 To skip setting parent or rate of a clock its corresponding entry should be
 set to 0, or can be omitted if it is not followed by any non-zero entry.
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [STLinux Kernel] [PATCH 06/12] watchdog: bindings: Supply knowledge of a third supported device - clocksource

2015-05-07 Thread Lee Jones
On Thu, 07 May 2015, Peter Griffin wrote:

> Hi Lee,
> 
> On Thu, 07 May 2015, Lee Jones wrote:
> 
> 
> 
> > > >  Required properties
> > > >  
> > > > -- compatible   : Must be one of: "st,stih407-lpc" "st,stih416-lpc"
> > > > - "st,stih415-lpc" "st,stid127-lpc"
> > > > +- compatible   : Must be one of: "st,stih407-lpc"
> > > 
> > > The same comment as the RTC DT patch, you are removing the compatibles
> > > documentation for the other supported platforms like stih416-lpc.
> > > AFAIK they are required in the driver to get the correct sysconfig 
> > > register.
> > 
> > That's intentional.  I haven't yet tested any of this IP on
> > STiH41{5,6} & STiH127.  Due to lack of documentation, I'm not even
> > sure if this IP even exists on some of the other platforms.  I will
> > add them back when support is added to both driver and DTB and I've
> > been able to test them.
> 
> That was kind of my point, the driver code AFAIK already contains support
> for these SoC's.
> 
> I would either expect the patch to remove support from the DT docs AND the
> driver, or leave it as is.
> 
> It seems odd to only change the DT docs, and become unaligned to the
> code (this assumes I'm looking at the latest patchset here 
> https://lkml.org/lkml/2015/3/4/1088 which includes support for these SoCs).

The decision to remove these 'supported' platforms was made on the RTC
side, where there is only support for "st,stih407-lpc" in the driver.
I thought it best to mirror that thought over to the Watchdog LPC
bindings, but thought it not really worth ripping out existing support
from the driver.  If others wish to test/use the Watchdog on other
platforms and can read C code, they'll know what to do.

Hopefully all of this will be a non-issue anyway, as I plan to test
this on the other boards I have in my farm and make the necessary
changes in the upcoming weeks.  All should be squared away by the next
merge-window.

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: Why isn't IRQ shared for i2c-ocore

2015-05-07 Thread Geert Uytterhoeven
On Thu, May 7, 2015 at 9:01 AM, Lee Jones  wrote:
>> I have a follow up question regarding interrupt. I see many I2C bus drivers
>> request interrupt with flag = 0. Why not using IRQF_SHARED?
>
> Probably because that particular IRQ is only used by the I2C
> Controller.  I'm not exactly sure what you're getting at?  Why do you
> think it should be shared?  You should only flag it as shared if it
> is.

However, that's something the driver can't know.
Sharing interrupts is an integration property. The same IP core may share its
interrupt on one SoC, and not on another.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH v5 2/5] crypto: drbg - prepare for async seeding

2015-05-07 Thread Stephan Mueller
In order to prepare for the addition of the asynchronous seeding call,
the invocation of seeding the DRBG is moved out into a helper function.

In addition, a block of memory is allocated during initialization time
that will be used as a scratchpad for obtaining entropy. That scratchpad
is used for the initial seeding operation as well as by the
asynchronous seeding call. The memory must be zeroized every time the
DRBG seeding call succeeds to avoid entropy data lingering in memory.
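
The sizing of that scratchpad can be checked in isolation. The patch sizes the buffer as ((len + 1) / 2) * 3 for the initial seed (entropy plus a nonce of half the strength) and later shrinks it with len / 3 * 2 for reseeding. The sketch below repeats only that arithmetic in user space; the helper names are hypothetical, and the strengths of 16, 24 and 32 bytes used in the usage note are assumed example values.

```c
#include <stddef.h>

/* Initial seeding: entropy plus a nonce of half the strength, 3/2 in total. */
size_t initial_seed_len(size_t strength)
{
	return ((strength + 1) / 2) * 3;
}

/* Reseeding: undo the calculation above to get back to the strength. */
size_t reseed_len(size_t initial_len)
{
	return initial_len / 3 * 2;
}
```

For even strengths the two steps are exact inverses: 16 becomes 24 and shrinks back to 16, and 32 becomes 48 and shrinks back to 32. Because of the rounding in the first step, an odd value such as 15 would come back as 16, so the round trip relies on the strength being even.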

CC: Andreas Steffen 
CC: Theodore Ts'o 
CC: Sandy Harris 
Signed-off-by: Stephan Mueller 
---
 crypto/drbg.c | 81 ++-
 include/crypto/drbg.h |  2 ++
 2 files changed, 56 insertions(+), 27 deletions(-)

diff --git a/crypto/drbg.c b/crypto/drbg.c
index 23d444e..36dfece 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -1041,6 +1041,21 @@ static struct drbg_state_ops drbg_hash_ops = {
  * Functions common for DRBG implementations
  **/
 
+static inline int __drbg_seed(struct drbg_state *drbg, struct list_head *seed,
+ int reseed)
+{
+   int ret = drbg->d_ops->update(drbg, seed, reseed);
+
+   if (ret)
+   return ret;
+
+   drbg->seeded = true;
+   /* 10.1.1.2 / 10.1.1.3 step 5 */
+   drbg->reseed_ctr = 1;
+
+   return ret;
+}
+
 /*
  * Seeding or reseeding of the DRBG
  *
@@ -1056,8 +1071,6 @@ static int drbg_seed(struct drbg_state *drbg, struct drbg_string *pers,
 bool reseed)
 {
int ret = 0;
-   unsigned char *entropy = NULL;
-   size_t entropylen = 0;
struct drbg_string data1;
LIST_HEAD(seedlist);
 
@@ -1073,26 +1086,10 @@ static int drbg_seed(struct drbg_state *drbg, struct drbg_string *pers,
 drbg->test_data.len);
pr_devel("DRBG: using test entropy\n");
} else {
-   /*
-* Gather entropy equal to the security strength of the DRBG.
-* With a derivation function, a nonce is required in addition
-* to the entropy. A nonce must be at least 1/2 of the security
-* strength of the DRBG in size. Thus, entropy * nonce is 3/2
-* of the strength. The consideration of a nonce is only
-* applicable during initial seeding.
-*/
-   entropylen = drbg_sec_strength(drbg->core->flags);
-   if (!entropylen)
-   return -EFAULT;
-   if (!reseed)
-   entropylen = ((entropylen + 1) / 2) * 3;
pr_devel("DRBG: (re)seeding with %zu bytes of entropy\n",
-entropylen);
-   entropy = kzalloc(entropylen, GFP_KERNEL);
-   if (!entropy)
-   return -ENOMEM;
-   get_random_bytes(entropy, entropylen);
-   drbg_string_fill(&data1, entropy, entropylen);
+drbg->seed_buf_len);
+   get_random_bytes(drbg->seed_buf, drbg->seed_buf_len);
+   drbg_string_fill(&data1, drbg->seed_buf, drbg->seed_buf_len);
}
list_add_tail(&data1.list, &seedlist);
 
@@ -,16 +1108,24 @@ static int drbg_seed(struct drbg_state *drbg, struct 
drbg_string *pers,
memset(drbg->C, 0, drbg_statelen(drbg));
}
 
-   ret = drbg->d_ops->update(drbg, &seedlist, reseed);
+   ret = __drbg_seed(drbg, &seedlist, reseed);
+
+   /*
+* Clear the initial entropy buffer as the async call may not overwrite
+* that buffer for quite some time.
+*/
+   memzero_explicit(drbg->seed_buf, drbg->seed_buf_len);
if (ret)
goto out;
-
-   drbg->seeded = true;
-   /* 10.1.1.2 / 10.1.1.3 step 5 */
-   drbg->reseed_ctr = 1;
+   /*
+* For all subsequent seeding calls, we only need the seed buffer
+* equal to the security strength of the DRBG. We undo the calculation
+* in drbg_alloc_state.
+*/
+   if (!reseed)
+   drbg->seed_buf_len = drbg->seed_buf_len / 3 * 2;
 
 out:
-   kzfree(entropy);
return ret;
 }
 
@@ -1143,6 +1148,8 @@ static inline void drbg_dealloc_state(struct drbg_state *drbg)
drbg->prev = NULL;
drbg->fips_primed = false;
 #endif
+   kzfree(drbg->seed_buf);
+   drbg->seed_buf = NULL;
 }
 
 /*
@@ -1204,6 +1211,26 @@ static inline int drbg_alloc_state(struct drbg_state *drbg)
if (!drbg->scratchpad)
goto err;
}
+
+   /*
+* Gather entropy equal to the security strength of the DRBG.
+* With a derivation function, a nonce is required in addition
+* to the entropy. A nonce must be at least 1/2 of the security
+* strength of the DRBG in size. Thus, entropy * nonce is 3/2
+* of the strength. T

[PATCH v5 1/5] random: Async and sync API for accessing nonblocking_pool

2015-05-07 Thread Stephan Mueller
The added API calls provide a synchronous function call
get_blocking_random_bytes where the caller is blocked until
the nonblocking_pool is initialized.

In addition, an asynchronous API call of get_blocking_random_bytes_cb
is provided which returns immediately to the caller after submitting
the request for random data. The caller-provided buffer that shall be
filled with random data is filled up as available entropy permits. The
caller may provide a callback function that is invoked once the
request is completed.

A third API call, get_blocking_random_bytes_cancel, is provided to
cancel the random number gathering operation.

CC: Andreas Steffen 
CC: Theodore Ts'o 
CC: Sandy Harris 
Signed-off-by: Stephan Mueller 
---
 drivers/char/random.c  | 104 +
 include/linux/random.h |  20 ++
 2 files changed, 124 insertions(+)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 9cd6968..05faef2 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -407,6 +407,7 @@ static struct poolinfo {
 static DECLARE_WAIT_QUEUE_HEAD(random_read_wait);
 static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
 static DECLARE_WAIT_QUEUE_HEAD(urandom_init_wait);
+static DECLARE_WAIT_QUEUE_HEAD(urandom_kernel_wait);
 static struct fasync_struct *fasync;
 
 /**
@@ -661,6 +662,7 @@ retry:
if (r == &nonblocking_pool) {
prandom_reseed_late();
wake_up_interruptible(&urandom_init_wait);
+   wake_up_interruptible(&urandom_kernel_wait);
pr_notice("random: %s pool is initialized\n", r->name);
}
}
@@ -1778,3 +1781,106 @@ void add_hwgenerator_randomness(const char *buffer, size_t count,
credit_entropy_bits(poolp, entropy);
 }
 EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+
+static bool get_blocking_random_bytes_term(bool *cancel)
+{
+   if (nonblocking_pool.initialized)
+   return true;
+   return *cancel;
+}
+
+/*
+ * Equivalent function to get_random_bytes with the difference that this
+ * function blocks the request until the nonblocking_pool is initialized.
+ *
+ * This function may sleep.
+ *
+ * @buf caller allocated buffer filled with random data
+ * @nbytes requested number of bytes -- buffer should be at least as big
+ * @cancel pointer to variable that can be used to cancel the collection
+ *operation. If this boolean is set to true, the collection operation
+ *is terminated immediately. When it is set to true during the
+ *collection loop, the collection is terminated immediately.
+ */
+void get_blocking_random_bytes(void *buf, int nbytes, bool *cancel)
+{
+   if (unlikely(nonblocking_pool.initialized == 0))
+   wait_event_interruptible(urandom_kernel_wait,
+   get_blocking_random_bytes_term(cancel));
+
+   if (!*cancel)
+   extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
+}
+EXPORT_SYMBOL(get_blocking_random_bytes);
+
+/*
+ * Immediate canceling the collection operation for the random_work
+ */
+void get_blocking_random_bytes_cancel(struct random_work *rw)
+{
+   rw->cancel = true;
+   wake_up_interruptible(&urandom_kernel_wait);
+
+}
+EXPORT_SYMBOL(get_blocking_random_bytes_cancel);
+
+static void get_blocking_random_bytes_work(struct work_struct *work)
+{
+   struct random_work *rw = container_of(work, struct random_work,
+ rw_work);
+
+   get_blocking_random_bytes(rw->rw_buf, rw->rw_len, &rw->cancel);
+   if (!rw->cancel)
+   rw->rw_cb(rw->rw_buf, rw->rw_len, rw->private);
+}
+
+/*
+ * Asynchronous invocation of the blocking interface. The function
+ * queues the request in either the private work queue supplied with the
+ * wq argument or in the general work queue framework if wq is NULL.
+ * Once the request is completed or upon receiving an error, the callback
+ * function of cb is called, if not NULL, to inform the caller about the
+ * completion of its operation.
+ *
+ * Only if no work is pending, the pointers for the callback function
+ * or the output data can be changed.
+ *
+ * If a caller wants to cancel the work (e.g. in the module_exit function),
+ * simply call
+ * get_blocking_random_bytes_cancel(&my_random_work);
+ * cancel_work_sync(&my_random_work.rw_work);
+ *
+ * @wq pointer to private work queue or NULL - input
+ * @rw handle to the work queue frame - output
+ * @buf allocated buffer where random numbers are to be stored
+ * @nbytes size of buf and implicitly number of bytes requested
+ * @private pointer to data that is not processed by here, but handed to the
+ * callback function to allow the caller to maintain a state
+ * @cb callback function where
+ * * buf holds the pointer to buf will be supplied
+ * * buflen ho

[PATCH v5 4/5] crypto: drbg - use Jitter RNG to obtain seed

2015-05-07 Thread Stephan Mueller
During initialization, the DRBG now tries to allocate a handle of the
Jitter RNG. If such a Jitter RNG is available during seeding, the DRBG
pulls the required entropy/nonce string from get_random_bytes and
concatenates it with a string of equal size from the Jitter RNG. That
combined string is now the seed for the DRBG.

Written differently, the initial seed of the DRBG is now:

get_random_bytes(entropy/nonce) || jitterentropy (entropy/nonce)

If the Jitter RNG is not available, the DRBG only seeds from
get_random_bytes.
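
The fallback logic can be modelled in user space as follows. This is only a sketch of the concatenation described above, not the DRBG code: rng_fn, build_seed and the fake sources are hypothetical names, and the fake sources return fixed byte patterns instead of random data so that the layout is visible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a random source; returns 0 on success. */
typedef int (*rng_fn)(unsigned char *out, size_t len);

/*
 * Fill buf, which must hold 2 * len bytes, with primary || secondary.
 * Returns the number of valid seed bytes: 2 * len if the secondary source
 * delivered data, len if it is absent or failed.
 */
size_t build_seed(unsigned char *buf, size_t len,
		  rng_fn primary, rng_fn secondary)
{
	primary(buf, len);

	if (!secondary || secondary(buf + len, len))
		return len;	/* fall back to the primary source only */

	return 2 * len;
}

/* Deterministic fakes so the result can be inspected; not random at all. */
int fake_primary(unsigned char *out, size_t len)
{
	memset(out, 0xAA, len);
	return 0;
}

int fake_secondary(unsigned char *out, size_t len)
{
	memset(out, 0xBB, len);
	return 0;
}

int failing_secondary(unsigned char *out, size_t len)
{
	(void)out;
	(void)len;
	return -1;
}
```

Passing NULL or failing_secondary as the second source models the case where the Jitter RNG is not available, and the caller then seeds from the first len bytes only.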

CC: Andreas Steffen 
CC: Theodore Ts'o 
CC: Sandy Harris 
Signed-off-by: Stephan Mueller 
---
 crypto/drbg.c | 46 --
 include/crypto/drbg.h |  1 +
 2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/crypto/drbg.c b/crypto/drbg.c
index 693dac4..6e2b272 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -1120,10 +1120,25 @@ static int drbg_seed(struct drbg_state *drbg, struct drbg_string *pers,
 drbg->test_data.len);
pr_devel("DRBG: using test entropy\n");
} else {
-   pr_devel("DRBG: (re)seeding with %zu bytes of entropy\n",
-drbg->seed_buf_len);
+   /* Get seed from in-kernel /dev/urandom */
get_random_bytes(drbg->seed_buf, drbg->seed_buf_len);
-   drbg_string_fill(&data1, drbg->seed_buf, drbg->seed_buf_len);
+
+   /* Get seed from Jitter RNG */
+   if (!drbg->jent ||
+   crypto_rng_get_bytes(drbg->jent,
+drbg->seed_buf + drbg->seed_buf_len,
+drbg->seed_buf_len)) {
+   pr_info("DRBG: could not obtain random data from Jitter RNG\n");
+   drbg_string_fill(&data1, drbg->seed_buf,
+drbg->seed_buf_len);
+   pr_devel("DRBG: (re)seeding with %zu bytes of entropy\n",
+drbg->seed_buf_len);
+   } else {
+   drbg_string_fill(&data1, drbg->seed_buf,
+drbg->seed_buf_len * 2);
+   pr_devel("DRBG: (re)seeding with %zu bytes of entropy\n",
+drbg->seed_buf_len * 2);
+   }
}
list_add_tail(&data1.list, &seedlist);
 
@@ -1148,7 +1163,7 @@ static int drbg_seed(struct drbg_state *drbg, struct drbg_string *pers,
 * Clear the initial entropy buffer as the async call may not overwrite
 * that buffer for quite some time.
 */
-   memzero_explicit(drbg->seed_buf, drbg->seed_buf_len);
+   memzero_explicit(drbg->seed_buf, drbg->seed_buf_len * 2);
if (ret)
goto out;
/*
@@ -1190,6 +1205,10 @@ static inline void drbg_dealloc_state(struct drbg_state *drbg)
 #endif
kzfree(drbg->seed_buf);
drbg->seed_buf = NULL;
+   if (drbg->jent) {
+   crypto_free_rng(drbg->jent);
+   drbg->jent = NULL;
+   }
 }
 
 /*
@@ -1265,12 +1284,27 @@ static inline int drbg_alloc_state(struct drbg_state *drbg)
ret = -EFAULT;
goto err;
}
-   /* ensure we have sufficient buffer space for initial seed */
+   /*
+* Ensure we have sufficient buffer space for initial seed which
+* consists of the seed from get_random_bytes and the Jitter RNG.
+*/
drbg->seed_buf_len = ((drbg->seed_buf_len + 1) / 2) * 3;
-   drbg->seed_buf = kzalloc(drbg->seed_buf_len, GFP_KERNEL);
+   drbg->seed_buf = kzalloc(drbg->seed_buf_len * 2, GFP_KERNEL);
if (!drbg->seed_buf)
goto err;
 
+   drbg->jent = crypto_alloc_rng("jitterentropy_rng", 0, 0);
+   if(IS_ERR(drbg->jent))
+   {
+   pr_info("DRBG: could not allocate Jitter RNG handle for seeding\n");
+   /*
+* As the Jitter RNG is a module that may not be present, we
+* continue with the operation and do not fully tie the DRBG
+* to the Jitter RNG.
+*/
+   drbg->jent = NULL;
+   }
+
return 0;
 
 err:
diff --git a/include/crypto/drbg.h b/include/crypto/drbg.h
index e4980a1..fabf102 100644
--- a/include/crypto/drbg.h
+++ b/include/crypto/drbg.h
@@ -122,6 +122,7 @@ struct drbg_state {
struct random_work seed_work;   /* asynchronous seeding support */
u8 *seed_buf;   /* buffer holding the seed */
size_t seed_buf_len;
+   struct crypto_rng *jent;
const struct drbg_state_ops *d_ops;
const struct drbg_core *core;
struct drbg_string test_data;
-- 
2.1.0



[PATCH v5 5/5] crypto: add jitterentropy RNG

2015-05-07 Thread Stephan Mueller
The CPU Jitter RNG provides a source of good entropy by
collecting CPU execution time jitter. The entropy in the CPU
execution time jitter is magnified by the CPU Jitter Random
Number Generator. The CPU Jitter Random Number Generator uses
the CPU execution timing jitter to generate a bit stream
which complies with different statistical measurements that
determine the bit stream is random.

The CPU Jitter Random Number Generator delivers entropy which
follows information theoretical requirements. Based on these
studies and the implementation, the caller can assume that
one bit of data extracted from the CPU Jitter Random Number
Generator holds one bit of entropy.

The CPU Jitter Random Number Generator provides a decentralized
source of entropy, i.e. every caller can operate on a private
state of the entropy pool.

The RNG does not have any dependencies on any other service
in the kernel. The RNG only needs a high-resolution time
stamp.
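
As a rough illustration of the raw signal only, the user-space sketch below times a fixed piece of work repeatedly with a high-resolution clock and records the deltas. It is not the Jitter RNG algorithm and performs none of its processing or health checks: now_ns and collect_deltas are hypothetical helpers, and CLOCK_MONOTONIC is assumed to be a suitable time stamp source on the test machine.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#endif
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read a monotonic high-resolution time stamp in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Time the same small loop n times and store each duration in deltas[].
 * The work is identical on every round, so any variation between the
 * recorded values comes from the machine, not from the code.
 * Returns the number of samples written.
 */
size_t collect_deltas(uint64_t *deltas, size_t n)
{
	volatile unsigned int sink = 0;	/* keeps the loop from being elided */

	for (size_t i = 0; i < n; i++) {
		uint64_t start = now_ns();

		for (unsigned int j = 0; j < 1000; j++)
			sink += j;

		deltas[i] = now_ns() - start;
	}
	return n;
}
```

Printing the collected deltas on a typical machine usually shows values that differ from one round to the next. How much entropy that variation carries is the subject of the assessment linked in the commit message, and this sketch makes no claim about it.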

Further design details, the cryptographic assessment and
a large array of test results are documented at
http://www.chronox.de/jent.html.

CC: Andreas Steffen 
CC: Theodore Ts'o 
CC: Sandy Harris 
Signed-off-by: Stephan Mueller 
---
 crypto/Kconfig |  10 +
 crypto/Makefile|   2 +
 crypto/jitterentropy.c | 909 +
 crypto/testmgr.c   |   4 +
 4 files changed, 925 insertions(+)
 create mode 100644 crypto/jitterentropy.c

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 8aaf298..5cf9174 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1479,9 +1479,19 @@ config CRYPTO_DRBG
tristate
default CRYPTO_DRBG_MENU if (CRYPTO_DRBG_HMAC || CRYPTO_DRBG_HASH || CRYPTO_DRBG_CTR)
select CRYPTO_RNG
+   select CRYPTO_JITTERENTROPY
 
 endif  # if CRYPTO_DRBG_MENU
 
+config CRYPTO_JITTERENTROPY
+   tristate "Jitterentropy Non-Deterministic Random Number Generator"
+   help
+ The Jitterentropy RNG is a noise that is intended
+ to provide seed to another RNG. The RNG does not
+ perform any cryptographic whitening of the generated
+ random numbers. This Jitterentropy RNG registers with
+ the kernel crypto API and can be used by any caller.
+
 config CRYPTO_USER_API
tristate
 
diff --git a/crypto/Makefile b/crypto/Makefile
index 97b7d3a..2f450ef 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -94,6 +94,8 @@ obj-$(CONFIG_CRYPTO_RNG2) += rng.o
 obj-$(CONFIG_CRYPTO_RNG2) += krng.o
 obj-$(CONFIG_CRYPTO_ANSI_CPRNG) += ansi_cprng.o
 obj-$(CONFIG_CRYPTO_DRBG) += drbg.o
+CFLAGS_jitterentropy.o = -O0
+obj-$(CONFIG_CRYPTO_JITTERENTROPY) += jitterentropy.o
 obj-$(CONFIG_CRYPTO_TEST) += tcrypt.o
 obj-$(CONFIG_CRYPTO_GHASH) += ghash-generic.o
 obj-$(CONFIG_CRYPTO_USER_API) += af_alg.o
diff --git a/crypto/jitterentropy.c b/crypto/jitterentropy.c
new file mode 100644
index 000..1ebe58a
--- /dev/null
+++ b/crypto/jitterentropy.c
@@ -0,0 +1,909 @@
+/*
+ * Non-physical true random number generator based on timing jitter.
+ *
+ * Copyright Stephan Mueller , 2014
+ *
+ * Design
+ * ==
+ *
+ * See http://www.chronox.de/jent.html
+ *
+ * License
+ * ===
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, and the entire permission notice in its entirety,
+ *including the disclaimer of warranties.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. The name of the author may not be used to endorse or promote
+ *products derived from this software without specific prior
+ *written permission.
+ *
+ * ALTERNATIVELY, this product may be distributed under the terms of
+ * the GNU General Public License, in which case the provisions of the GPL2 are
+ * required INSTEAD OF the above restrictions.  (This clause is
+ * necessary due to a potential bad interaction between the GPL and
+ * the restrictions contained in a BSD-style copyright.)
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ALL OF
+ * WHICH ARE HEREBY DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
+ * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF NOT ADVISED OF THE POSSIBILITY OF SUCH
+ * 

[PATCH v5 0/5] Seeding DRBG with more entropy

2015-05-07 Thread Stephan Mueller
Hi,

as of now, the DRBG is only seeded from get_random_bytes. In various
circumstances, the nonblocking_pool behind get_random_bytes may not be fully
seeded from hardware events at the time the DRBG needs to be seeded.
Based on the discussion in [1], the DRBG seeding is updated such that it
does not completely rely on get_random_bytes any more.

The seeding approach can be characterized as follows:

1. pull buffer of size entropy + nonce from get_random_bytes

2. pull another buffer of size entropy + nonce from my Jitter RNG

3. concatenate both buffers

4. seed the DRBG with the concatenated buffer

5. trigger the async invocation of the blocking API for accessing
   the nonblocking pool with a buffer of size entropy

6. return the DRBG instance to the caller without waiting for the completion
   of step 5

7. at some point in time, the blocking API returns with a full buffer
   which is then used to re-seed the DRBG

This way, we will get entropy during the first initialization without
blocking.
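
Steps 1 to 3 above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: build_initial_seed(), stub_random_bytes() and stub_jitter_bytes() are hypothetical names that do not appear in the patch set, and the two stubs merely fill their buffers with fixed patterns in place of get_random_bytes and the Jitter RNG.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type standing in for an entropy source. */
typedef void (*entropy_source_fn)(unsigned char *buf, size_t len);

/* Stub for get_random_bytes: fixed pattern, NOT random. */
static void stub_random_bytes(unsigned char *buf, size_t len)
{
	memset(buf, 0xAA, len);
}

/* Stub for the Jitter RNG: fixed pattern, NOT random. */
static void stub_jitter_bytes(unsigned char *buf, size_t len)
{
	memset(buf, 0x55, len);
}

/*
 * Steps 1-3: pull (entropy + nonce) bytes from each source and place
 * them back to back in 'out'. Returns the number of seed bytes written,
 * or 0 if 'out' is too small to hold both chunks.
 */
static size_t build_initial_seed(unsigned char *out, size_t out_len,
				 size_t entropy_len, size_t nonce_len,
				 entropy_source_fn primary,
				 entropy_source_fn jitter)
{
	size_t chunk = entropy_len + nonce_len;

	if (out_len < 2 * chunk)
		return 0;

	primary(out, chunk);		/* step 1 */
	jitter(out + chunk, chunk);	/* step 2, concatenated (step 3) */
	return 2 * chunk;
}
```

The returned buffer is what step 4 would feed to the DRBG; steps 5 to 7 (the asynchronous re-seed) are not modelled here.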

The patch set adds an asynchronous API to access the nonblocking pool to wait
until the nonblocking pool is initialized. A test module for testing the
asynchronous operation is given with the code below.

Note: the DRBG and Jitter RNG patches are against the current cryptodev-2.6
tree.

The new Jitter RNG is an RNG that has a large set of tests and was presented on
LKML some time back. After speaking with mathematicians at NIST, I understand
that the Jitter RNG approach would be acceptable from their side as a noise
source. Note, I
personally think that the Jitter RNG has sufficient entropy in almost all
circumstances (see the massive testing I conducted on all more widely used
CPUs as shown in [2]).

Changes v5:
* drop patch 01 and therefore drop the creation of a kernel pool
* change patch 02 to use the nonblocking pool and block until the nonblocking
  pool is initialized or until the cancel operation is triggered.

Changes v4:
* Patch 02: Change get_blocking_random_bytes_cb to allow callers to call it
  multiple times without re-initializing the work data structure. Furthermore,
  only change the pointers to the output buffer and callback if work is not
  pending to avoid race conditions.
* Patch 04: No canceling of seeding during drbg_seed as the invocation of
  get_blocking_random_bytes_cb can now be done repeatedly without
  re-initializing the work data structure.

Changes v3:
* Patch 01: Correct calculation of entropy count as pointed out by Herbert Xu
* Patch 06: Correct a trivial coding issue in jent_entropy_init for
  checking JENT_EMINVARVAR reported by cppcheck

Changes v2:
* Use Dual BSD/GPL license in MODULE_LICENSE as suggested by
  Paul Bolle 
* Patch 05, drbg_dealloc_state: only deallocate Jitter RNG if one was
  instantiated in the first place. There are two main reasons why the Jitter RNG
  may not be allocated: either it is not available as kernel module/in vmlinuz
  or during init time of the Jitter RNG, the performed testing shows that the
  underlying hardware is not suitable for the Jitter RNG (e.g. has a too coarse
  timer).


[1] http://www.mail-archive.com/linux-crypto@vger.kernel.org/msg13891.html

[2] http://www.chronox.de/jent.html

Stephan Mueller (5):
  random: Async and sync API for accessing nonblocking_pool
  crypto: drbg - prepare for async seeding
  crypto: drbg - add async seeding operation
  crypto: drbg - use Jitter RNG to obtain seed
  crypto: add jitterentropy RNG

 crypto/Kconfig |  10 +
 crypto/Makefile|   2 +
 crypto/drbg.c  | 156 +++--
 crypto/jitterentropy.c | 909 +
 crypto/testmgr.c   |   4 +
 drivers/char/random.c  | 104 ++
 include/crypto/drbg.h  |   4 +
 include/linux/random.h |  20 ++
 8 files changed, 1182 insertions(+), 27 deletions(-)
 create mode 100644 crypto/jitterentropy.c
---
/*
 * Test module for verifying the correct operation of the
 * in-kernel /dev/random handling
 *
 * Use: compile, load into the kernel and observe dmesg
 *
 * Written by: Stephan Mueller 
 * Copyright (c) 2014
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *notice, and the entire permission notice in its entirety,
 *including the disclaimer of warranties.
 * 2. Redistributions in binary form must reproduce the above copyright
 *notice, this list of conditions and the following disclaimer in the
 *documentation and/or other materials provided with the distribution.
 * 3. The name of the author may not be used to endorse or promote
 *products derived from this software without specific prior
 *written permission.
 *
 * ALTERNATIVELY, this product may be distributed under the terms of
 * the GNU General Public License, in which case the provisions of the GPL are
 * required INSTEAD OF the above restrictions.  (This 

[PATCH v5 3/5] crypto: drbg - add async seeding operation

2015-05-07 Thread Stephan Mueller
The async seeding operation is triggered during initialization right
after the first non-blocking seeding is completed. As required by the
asynchronous operation of random.c, a callback function is provided that
is triggered by random.c once entropy is available. That callback
function performs the actual seeding of the DRBG.

CC: Andreas Steffen 
CC: Theodore Ts'o 
CC: Sandy Harris 
Signed-off-by: Stephan Mueller 
---
 crypto/drbg.c | 41 +
 include/crypto/drbg.h |  1 +
 2 files changed, 42 insertions(+)

diff --git a/crypto/drbg.c b/crypto/drbg.c
index 36dfece..693dac4 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -1056,6 +1056,40 @@ static inline int __drbg_seed(struct drbg_state *drbg, struct list_head *seed,
return ret;
 }
 
+/* DRBG callback for obtaining data from the async Linux RNG */
+static void drbg_async_seed_cb(void *buf, ssize_t buflen, void *private)
+{
+   struct drbg_string data;
+   LIST_HEAD(seedlist);
+   struct drbg_state *drbg = (struct drbg_state *)private;
+   int ret = 0;
+
+   if (buflen <= 0 || !buf)
+   return;
+
+   drbg_string_fill(&data, buf, buflen);
+   list_add_tail(&data.list, &seedlist);
+   /* sanity check to verify that there is still a DRBG instance */
+   if (!drbg)
+   return;
+   mutex_lock(&drbg->drbg_mutex);
+   /* sanity check to verify that the DRBG instance is valid */
+   if (!drbg->V) {
+   mutex_unlock(&drbg->drbg_mutex);
+   return;
+   }
+   ret = __drbg_seed(drbg, &seedlist, true);
+   memzero_explicit(buf, buflen);
+   mutex_unlock(&drbg->drbg_mutex);
+}
+
+/* Cancel any outstanding async operation and wait for their completion */
+static inline void drbg_async_work_cancel(struct random_work *work)
+{
+   get_blocking_random_bytes_cancel(work);
+   cancel_work_sync(&work->rw_work);
+}
+
 /*
  * Seeding or reseeding of the DRBG
  *
@@ -1125,6 +1159,12 @@ static int drbg_seed(struct drbg_state *drbg, struct drbg_string *pers,
if (!reseed)
drbg->seed_buf_len = drbg->seed_buf_len / 3 * 2;
 
+   /* Invoke asynchronous seeding unless DRBG is in test mode. */
+   if (!list_empty(&drbg->test_data.list))
+   get_blocking_random_bytes_cb(NULL, &drbg->seed_work,
+drbg->seed_buf, drbg->seed_buf_len,
+drbg, drbg_async_seed_cb);
+
 out:
return ret;
 }
@@ -1487,6 +1527,7 @@ unlock:
  */
 static int drbg_uninstantiate(struct drbg_state *drbg)
 {
+   drbg_async_work_cancel(&drbg->seed_work);
if (drbg->d_ops)
drbg->d_ops->crypto_fini(drbg);
drbg_dealloc_state(drbg);
diff --git a/include/crypto/drbg.h b/include/crypto/drbg.h
index b052698..e4980a1 100644
--- a/include/crypto/drbg.h
+++ b/include/crypto/drbg.h
@@ -119,6 +119,7 @@ struct drbg_state {
bool fips_primed;   /* Continuous test primed? */
unsigned char *prev;/* FIPS 140-2 continuous test value */
 #endif
+   struct random_work seed_work;   /* asynchronous seeding support */
u8 *seed_buf;   /* buffer holding the seed */
size_t seed_buf_len;
const struct drbg_state_ops *d_ops;
-- 
2.1.0




Re: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional

2015-05-07 Thread Len Brown
On Sun, Jan 26, 2014 at 4:08 PM, Pavel Machek  wrote:

> Dunno. Config option plus sysfs attribute is overdoing it a bit.

Agreed.
Have discussed w/ Rafael, and current plan is to simply delete.
Updated patch on the way...

thanks,
Len Brown, Intel Open Source Technology Center


[PATCH 1/2] x86: replace cpu_up hard-coded mdelay with variable udelay

2015-05-07 Thread Len Brown
From: Len Brown 

Replace the hard-coded mdelay(10) in cpu_up() with a variable udelay.

Add a boot-time override, "cpu_init_udelay=N"

Default unchanged at 10ms on all systems.

Signed-off-by: Len Brown 
---
 Documentation/kernel-parameters.txt |  6 ++
 arch/x86/kernel/smpboot.c   | 28 +++-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index bfcb1a6..0a16309 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -737,6 +737,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
cpuidle.off=1   [CPU_IDLE]
disable the cpuidle sub-system
 
+   cpu_init_udelay=N
+   [X86] Delay for N microsec between assert and de-assert
+   of APIC INIT to start processors.  This delay occurs
+   on every CPU online, such as boot, and resume from suspend.
+   Default: 10000
+
cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver
Format:
,,,[,]
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index febc6aa..76734f4 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -517,6 +517,32 @@ void __inquire_remote_apic(int apicid)
 }
 
 /*
+ * The Multiprocessor Specification 1.4 (1997) example code suggests
+ * that there should be a 10ms delay between the BSP asserting INIT
+ * and de-asserting INIT, when starting a remote processor.
+ * But that slows boot and resume on modern processors, which include
+ * many cores and don't require that delay.
+ *
+ * cmdline "cpu_init_udelay=" is available to specify this delay.
+ */
+#define UDELAY_10MS_DEFAULT 10000
+
+static unsigned int init_udelay = UDELAY_10MS_DEFAULT;
+
+static int __init cpu_init_udelay(char *str)
+{
+   unsigned int new_udelay;
+
+   get_option(&str, &new_udelay);
+   pr_debug("cpu_init_udelay=%d, was %d", new_udelay, init_udelay);
+   init_udelay = new_udelay;
+
+   return 0;
+}
+
+early_param("cpu_init_udelay", cpu_init_udelay);
+
+/*
  * Poke the other CPU in the eye via NMI to wake it up. Remember that the normal
  * INIT, INIT, STARTUP sequence will reset the chip hard for us, and this
  * won't ... remember to clear down the APIC, etc later.
@@ -586,7 +612,7 @@ wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
pr_debug("Waiting for send to finish...\n");
send_status = safe_apic_wait_icr_idle();
 
-   mdelay(10);
+   udelay(init_udelay);
 
pr_debug("Deasserting INIT\n");
 
-- 
2.4.0.rc1



[PATCH 2/2] x86: speed cpu_up by quirking cpu_init_udelay

2015-05-07 Thread Len Brown
From: Len Brown 

Modern processor families are on a white-list to remove
the costly cpu_init_udelay 10000.  Unknown processor families
get the traditional 10ms delay in cpu_up().

This seemed more efficient than forcing modern processors
to exhaustively search a black-list having all the old
processor families that should have a 10ms delay.
For not only are new processor families infrequently added,
the white list also allows a delay other than 0, if needed.

Signed-off-by: Len Brown 
---
 arch/x86/kernel/smpboot.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 76734f4..34a08ff 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -76,6 +76,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* State of each CPU */
 DEFINE_PER_CPU(int, cpu_state) = { 0 };
@@ -521,14 +522,44 @@ void __inquire_remote_apic(int apicid)
  * that there should be a 10ms delay between the BSP asserting INIT
  * and de-asserting INIT, when starting a remote processor.
  * But that slows boot and resume on modern processors, which include
- * many cores and don't require that delay.
- *
+ * many cores and don't require that delay.  Here we default to the
+ * legacy delay, but quirk new processors to skip the delay.
  * cmdline "cpu_init_udelay=" is available to specify this delay.
  */
 #define UDELAY_10MS_DEFAULT 10000
 
 static unsigned int init_udelay = UDELAY_10MS_DEFAULT;
 
+static const struct x86_cpu_id init_udelay_ids[] = {
+   { X86_VENDOR_INTEL, 0x6, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   { X86_VENDOR_AMD, 0x16, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   { X86_VENDOR_AMD, 0x15, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   { X86_VENDOR_AMD, 0x14, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   { X86_VENDOR_AMD, 0x12, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   { X86_VENDOR_AMD, 0x11, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   { X86_VENDOR_AMD, 0x10, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   { X86_VENDOR_AMD, 0xF, X86_MODEL_ANY, X86_FEATURE_ANY, 0 },
+   {}
+};
+MODULE_DEVICE_TABLE(x86cpu, init_udelay_ids);
+
+static void __init smp_quirk_init_udelay(void)
+{
+   const struct x86_cpu_id *id;
+   unsigned int new_udelay;
+
+   id = x86_match_cpu(init_udelay_ids);
+   if (id == NULL)
+   return; /* if no match, keep default */
+
+   if (init_udelay != UDELAY_10MS_DEFAULT)
+   return; /* if cmdline changed from default, leave it alone */
+
+   new_udelay = (unsigned long) id->driver_data;
+   pr_debug("cpu_init_udelay quirk to %d, was %d", new_udelay, init_udelay);
+   init_udelay = new_udelay;
+}
+
 static int __init cpu_init_udelay(char *str)
 {
unsigned int new_udelay;
@@ -1196,6 +1227,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
uv_system_init();
 
set_mtrr_aps_delayed_init();
+
+   smp_quirk_init_udelay();
 }
 
 void arch_enable_nonboot_cpus_begin(void)
-- 
2.4.0.rc1
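
The precedence described in the commit message (a command-line value wins, a white-list match comes next, and unknown families keep the legacy delay) can be sketched in userspace C. This is a simplified illustration, not the kernel code: struct cpu_quirk, the VENDOR_* constants and pick_init_udelay() are hypothetical stand-ins for struct x86_cpu_id, X86_VENDOR_* and smp_quirk_init_udelay(), and the table lists only a few of the families from the patch.

```c
#include <stddef.h>

#define UDELAY_10MS_DEFAULT 10000	/* 10 ms expressed in microseconds */

enum { VENDOR_INTEL = 1, VENDOR_AMD = 2 };	/* hypothetical vendor ids */

/* Hypothetical, simplified stand-in for struct x86_cpu_id. */
struct cpu_quirk {
	int vendor;		/* 0 terminates the table */
	int family;
	unsigned int udelay;	/* delay for this family, in microseconds */
};

static const struct cpu_quirk quirks[] = {
	{ VENDOR_INTEL, 0x6, 0 },
	{ VENDOR_AMD, 0x16, 0 },
	{ VENDOR_AMD, 0xF, 0 },
	{ 0, 0, 0 }		/* sentinel */
};

/* Returns the delay to use, given the value currently configured. */
static unsigned int pick_init_udelay(int vendor, int family,
				     unsigned int configured)
{
	const struct cpu_quirk *q;

	/* A value that differs from the default came from the cmdline. */
	if (configured != UDELAY_10MS_DEFAULT)
		return configured;

	for (q = quirks; q->vendor; q++)
		if (q->vendor == vendor && q->family == family)
			return q->udelay;

	return configured;	/* unknown family: keep the 10 ms delay */
}
```

One consequence of this scheme, visible in the sketch, is that a command-line value equal to the default cannot be told apart from no override at all, so a white-listed family would still be quirked in that case.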



[PATCH 0/2 v2] speeding up cpu_up()

2015-05-07 Thread Len Brown
Thanks for testing, Aravind, Borislav,

I went with Ingo's suggestion to make this a quirk.
However, I went with a white-list instead of a black-list,
because fewer comparisons are needed for modern processors.
Families are added infrequently, and finally, we get the option
to stick something other than 0 in the table if we need to.

Let me know if I got the AMD family #'s right.

thanks,
-Len Brown, Intel Open Source Technology Center



Re: [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry

2015-05-07 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> > So do you mean:
> >
> >this_cpu_set(rcu_state) = IN_KERNEL;
> >...
> >this_cpu_inc(rcu_qs_ctr);
> >this_cpu_set(rcu_state) = IN_USER;
> >
> > ?
> >
> > So in your proposal we'd have an INC and two MOVs. I think we can make
> > it just two simple stores into a byte flag, one on entry and one on
> > exit:
> >
> >this_cpu_set(rcu_state) = IN_KERNEL;
> >...
> >this_cpu_set(rcu_state) = IN_USER;
> >
> 
> I was thinking that either a counter or a state flag could make sense.
> Doing both would be pointless.  The counter could use the low bits to
> indicate the state.  The benefit of the counter would be that the
> RCU-waiting CPU could observe that the counter has incremented and
> that therefore a grace period has elapsed.  Getting it right would
> require lots of care.

So if you mean:

   
   ...
   this_cpu_inc(rcu_qs_ctr);
   

I don't see how this would work reliably: how do you handle the case 
of a SCHED_FIFO task never returning from user-space (common technique 
in RT apps)? synchronize_rcu() would block indefinitely as it would 
never see rcu_qs_ctr increase.

We have to be able to observe user-mode anyway, for system-time 
statistics purposes, and that flag could IMHO also drive the RCU GC 
machinery.

> > > The problem is that I don't see how TIF_RCU_THINGY can work 
> > > reliably. If the remote CPU sets it, it'll be too late and we'll 
> > > still enter user mode without seeing it.  If it's just an 
> > > optimization, though, then it should be fine.
> >
> > Well, after setting it, the remote CPU has to re-check whether the 
> > RT CPU has entered user-mode - before it goes to wait.
> 
> How?
> 
> Suppose the exit path looked like:
> 
> this_cpu_write(rcu_state, IN_USER);
> 
> if (ti->flags & _TIF_RCU_NOTIFY) {
> if (test_and_clear_bit(TIF_RCU_NOTIFY, &ti->flags))
> slow_notify_rcu_that_we_are_exiting();
> }
> 
> iret or sysret;

No, it would look like this:

   this_cpu_write(rcu_state, IN_USER);
   iret or sysret;

I.e. IN_USER is set well after all notifications are checked. No 
kernel execution happens afterwards. (No extra checks added - the 
regular return-to-user-work checks would handle TIF_RCU_NOTIFY.)

( Same goes for idle: we just mark it IN_IDLE and move it back to 
  IN_KERNEL after the idling ends. )

> The RCU-waiting CPU sees that rcu_state == IN_KERNEL and sets 
> _TIF_RCU_NOTIFY.  This could happen arbitrarily late before IRET 
> because stores can be delayed.  (It could even happen after sysret, 
> IIRC, but IRET is serializing.)

All it has to do in the synchronize_rcu() slowpath is something like:

if (per_cpu(rcu_state, rt_cpu) == IN_KERNEL) {
smp_mb__before_atomic();
set_tsk_thread_flag(remote_task, TIF_RCU_NOTIFY);
smp_rmb();
if (per_cpu(rcu_state, rt_cpu) == IN_KERNEL)
... go wait ...
}
/* Cool, we observed quiescent state: */

The cost of the trivial barrier is nothing compared to the 'go wait' 
cost which we will pay in 99.9% of the cases!
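
The check, set flag, re-check pattern above can be modelled in userspace with C11 atomics. This is only a sketch under stated assumptions: struct remote_cpu and must_wait_for_cpu() are hypothetical names, the 'notify' field stands in for TIF_RCU_NOTIFY, and sequentially consistent atomic operations replace the kernel barriers used in the snippet above.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum cpu_state { IN_USER, IN_KERNEL, IN_IDLE };

/* Hypothetical per-CPU record; not a kernel structure. */
struct remote_cpu {
	_Atomic int state;	/* written by the remote CPU on entry/exit */
	atomic_bool notify;	/* stands in for TIF_RCU_NOTIFY */
};

/*
 * Returns true if the caller has to go and wait for the remote CPU,
 * false if a quiescent state (user mode or idle) was observed.
 */
static bool must_wait_for_cpu(struct remote_cpu *cpu)
{
	if (atomic_load(&cpu->state) != IN_KERNEL)
		return false;		/* already quiescent */

	/* Ask the remote CPU to notify us on its way out of the kernel. */
	atomic_store(&cpu->notify, true);

	/*
	 * Re-check after publishing the flag: the seq_cst store followed
	 * by a seq_cst load orders the flag write before this read.
	 */
	if (atomic_load(&cpu->state) != IN_KERNEL)
		return false;		/* it left in the meantime */

	return true;			/* slow path: go wait */
}
```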

> If we put an mfence after this_cpu_set or did an unconditional 
> test_and_clear_bit on ti->flags then this problem goes away, but 
> that would probably be slower than we'd like.

We are talking about a dozen cycles, while a typical synchronize_rcu() 
will wait millions (sometimes billions) of cycles. There's absolutely 
zero performance concern here and it's all CPU local in any case.

In fact a user-mode/kernel-mode flag speeds up naive implementations 
of synchronize_rcu(): because it's able to observe extended quiescent 
state immediately, without having to wait for a counter to increase 
(which was always the classic way to observe grace periods).

If all CPUs are in user mode or are idle (which is rather common!) 
then synchronize_rcu() could return almost immediately - while 
previously it had to wait for scheduling or periodic timer irqs to 
trigger on all CPUs - adding many millisecs of delay even in the best 
of cases.

Thanks,

Ingo


Re: linux-next: Tree for May 7 (openrisc, sparc qemu BUG due to 'hrtimer: Get rid of hrtimer softirq')

2015-05-07 Thread Geert Uytterhoeven
On Fri, May 8, 2015 at 3:54 AM, Guenter Roeck  wrote:
> On Thu, May 07, 2015 at 04:47:15PM +1000, Stephen Rothwell wrote:
>> Hi all,
>>
>> Changes since 20150506:
>>
>> The ext4 tree gained a build failure so I used the version from
>> next-20150506.
>>
>> The vfs tree gained a conflict against the f2fs tree.
>>
>> The rcu tree gained conflicts against the tip tree.
>>
>> Non-merge commits (relative to Linus' tree): 2460
>>  2454 files changed, 114321 insertions(+), 47105 deletions(-)
>>
>> 
>>
> Seen when running openrisc target in qemu:
>
> Switched to clocksource openrisc_timer
> BUG: failure at kernel/irq_work.c:135/irq_work_run_list()!
> Kernel panic - not syncing: BUG!
> ---[ end Kernel panic - not syncing: BUG!
>
>
> A similar crash with the same BUG is seen with sparc smp qemu tests.
>
> kernel BUG at kernel/irq_work.c:135!

https://lkml.org/lkml/2015/5/7/433

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: earlycon: no match?

2015-05-07 Thread Sascha Hauer
On Thu, May 07, 2015 at 01:22:11PM -0400, Peter Hurley wrote:
> On 05/07/2015 01:09 PM, Maciej W. Rozycki wrote:
> > On Mon, 4 May 2015, Peter Hurley wrote:
> > 
>  Since 2007, 'console=' is a early param synonym for 'earlycon='; IOW,
>  the message is new but not the behavior.
> >>>
> >>> "console=" had nothing to do with early param before.
> >>
> >> *Yes, it has* since the commit I referenced in the previous email and
> >> the email before that when I noted that 'console=' and 'earlycon=' are
> >> synonyms if CONFIG_SERIAL_EARLYCON=y, and it has been that way since
> >> 2007.
> >>
> >> The only thing that has changed is that I added a diagnostic; _to repeat_,
> >> the earlycon matching code has always run in this case, only the
> >> diagnostic is new.
> > 
> >  What's the point of having two parameters as synonyms whose syntax is not 
> > compatible to each other in the general case?  I'd expect the following 
> > cases to be handled:
> > 
> > 1. Regular console only (no early console requested) => `console=foo...'.
> > 
> > 2. Both early and regular console => `earlycon=blah... console=foo...'.
> > 
> > 3. Early console handing over to regular console => `earlycon=blah...'.
> 
>   4. Early console only => `earlycon=blah...'
> 
> How to distinguish between 3 & 4?

Given that the aliasing only makes sense for console=uart, and
console=uart8250, we can do the following. Not exactly the nicest
code, but only slightly more ugly than what we have now in
do_early_param()

Sascha

--8<

From e4d5a09877e48308ee0cf4170f2eef8aa2f747f5 Mon Sep 17 00:00:00 2001
From: Sascha Hauer 
Date: Fri, 8 May 2015 08:23:47 +0200
Subject: [PATCH] param: console: Do not treat console as synonym for earlycon

Since 2007 console= and earlycon= are treated as synonyms, but the
syntax for both options is different. The only case in which they
are identical is for console=uart or console=uart8250. All other
cases currently lead to the warning:

earlycon: no match for xxx

This patch drops the general aliasing, but keeps the current
behaviour for console=uart and console=uart8250 to keep the kernel
parameters compatible.

Signed-off-by: Sascha Hauer 
---
 init/main.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/init/main.c b/init/main.c
index 2115055..bfcbbc5 100644
--- a/init/main.c
+++ b/init/main.c
@@ -413,12 +413,15 @@ static noinline void __init_refok rest_init(void)
 static int __init do_early_param(char *param, char *val, const char *unused)
 {
const struct obs_kernel_param *p;
+   bool earlyconalias = false;
+
+   if (val && !strcmp(param, "console") &&
+   (!strncmp(val, "uart,", 5) || !strncmp(val, "uart8250,", 9)))
+   earlyconalias = true;
 
for (p = __setup_start; p < __setup_end; p++) {
if ((p->early && parameq(param, p->str)) ||
-   (strcmp(param, "console") == 0 &&
-strcmp(p->str, "earlycon") == 0)
-   ) {
+   (earlyconalias && strcmp(p->str, "earlycon") == 0)) {
if (p->setup_func(val) != 0)
pr_warn("Malformed early option '%s'\n", param);
}
-- 
2.1.4
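
The alias test introduced by the patch above can be tried out in userspace as a standalone predicate. This is a sketch only: is_earlycon_alias() is a hypothetical helper name, while the string comparisons mirror the condition added to do_early_param().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true when "console=<val>" should also be handed to the
 * earlycon setup function, i.e. only for the "uart," and "uart8250,"
 * forms whose syntax is shared with earlycon=.
 */
static bool is_earlycon_alias(const char *param, const char *val)
{
	if (!val || strcmp(param, "console") != 0)
		return false;

	return strncmp(val, "uart,", 5) == 0 ||
	       strncmp(val, "uart8250,", 9) == 0;
}
```

Note that the comparison includes the trailing comma, so a bare "console=uart" without options would not be treated as an alias by this check.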

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: [PATCH v2 3/5] dmaengine: pxa: add pxa dmaengine driver

2015-05-07 Thread Vinod Koul
On Sat, Apr 11, 2015 at 09:40:34PM +0200, Robert Jarzmik wrote:
> This is a new driver for pxa SoCs, which is also compatible with the former
> mmp_pdma.
The rationale is fine, is there a plan to remove old mmp_pdma then?

> +config PXA_DMA
> + bool "PXA DMA support"
no prompt?

> +
> +#define DRCMR(n) ((((n) < 64) ? 0x0100 : 0x1100) + (((n) & 0x3f) << 2))
care to put a comment on this calculation
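
For reference, one possible reading of the calculation, assuming the macro is meant to be fully parenthesized: request lines 0 to 63 map into a register bank at offset 0x0100, lines 64 and above into a second bank at 0x1100, and each register is 4 bytes wide. The helper drcmr_offset() below is a hypothetical name used only to spell the macro out step by step.

```c
#include <stdint.h>

/* The quoted macro, with balanced parentheses. */
#define DRCMR(n) ((((n) < 64) ? 0x0100 : 0x1100) + (((n) & 0x3f) << 2))

/* Same calculation written out; hypothetical helper, not in the driver. */
static uint32_t drcmr_offset(unsigned int n)
{
	/* Lines 0..63 use the first bank, lines 64 and up the second. */
	uint32_t bank = (n < 64) ? 0x0100 : 0x1100;

	/* Index within the bank (low 6 bits), times 4 bytes per register. */
	return bank + ((n & 0x3f) << 2);
}
```

With this reading, line 63 lands at 0x01fc and line 64 at 0x1100.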

> +#define DRCMR_MAPVLD BIT(7)  /* Map Valid (read / write) */
> +#define DRCMR_CHLNUM 0x1f/* mask for Channel Number (read / write) */
> +
> +#define DDADR_DESCADDR   0xfff0  /* Address of next descriptor (mask) */
> +#define DDADR_STOP   BIT(0)  /* Stop (read / write) */
> +
> +#define DCMD_INCSRCADDR  BIT(31) /* Source Address Increment Setting. */
> +#define DCMD_INCTRGADDR  BIT(30) /* Target Address Increment Setting. */
> +#define DCMD_FLOWSRC BIT(29) /* Flow Control by the source. */
> +#define DCMD_FLOWTRG BIT(28) /* Flow Control by the target. */
> +#define DCMD_STARTIRQEN  BIT(22) /* Start Interrupt Enable */
> +#define DCMD_ENDIRQEN  BIT(21) /* End Interrupt Enable */
> +#define DCMD_ENDIAN  BIT(18) /* Device Endian-ness. */
> +#define DCMD_BURST8  (1 << 16)   /* 8 byte burst */
> +#define DCMD_BURST16 (2 << 16)   /* 16 byte burst */
> +#define DCMD_BURST32 (3 << 16)   /* 32 byte burst */
> +#define DCMD_WIDTH1  (1 << 14)   /* 1 byte width */
> +#define DCMD_WIDTH2  (2 << 14)   /* 2 byte width (HalfWord) */
> +#define DCMD_WIDTH4  (3 << 14)   /* 4 byte width (Word) */
> +#define DCMD_LENGTH  0x01fff /* length mask (max = 8K - 1) */
Please namespace these ...

> +#define tx_to_pxad_desc(tx)  \
> + container_of(tx, struct pxad_desc_sw, async_tx)
> +#define to_pxad_chan(dchan)  \
> + container_of(dchan, struct pxad_chan, vc.chan)
> +#define to_pxad_dev(dmadev)  \
> + container_of(dmadev, struct pxad_device, slave)
> +#define to_pxad_sw_desc(_vd) \
> + container_of((_vd), struct pxad_desc_sw, vd)
> +
> +#define pdma_err(pdma, fmt, arg...) \
> + dev_err(pdma->slave.dev, "%s: " fmt, __func__, ## arg)
> +#define chan_dbg(_chan, fmt, arg...) \
> + dev_dbg(&(_chan)->vc.chan.dev->device, "%s(chan=%p): " fmt, \
> + __func__, (_chan), ## arg)
> +#define chan_vdbg(_chan, fmt, arg...) \
> + dev_vdbg(&(_chan)->vc.chan.dev->device, "%s(chan=%p): " fmt,\
> + __func__, (_chan), ## arg)
> +#define chan_warn(_chan, fmt, arg...) \
> + dev_warn(&(_chan)->vc.chan.dev->device, "%s(chan=%p): " fmt,\
> +  __func__, (_chan), ## arg)
> +#define chan_err(_chan, fmt, arg...) \
> + dev_err(&(_chan)->vc.chan.dev->device, "%s(chan=%p): " fmt, \
> + __func__, (_chan), ## arg)
am not a big fan of driver specific debug macros, can we use dev_ ones please

> +
> +#define _phy_readl_relaxed(phy, _reg) \
> + readl_relaxed((phy)->base + _reg((phy)->idx))
> +#define phy_readl_relaxed(phy, _reg) \
> + ({  \
> + u32 _v; \
> + _v = readl_relaxed((phy)->base + _reg((phy)->idx)); \
> + chan_vdbg(phy->vchan, "readl(%s): 0x%08x\n", #_reg, \
> +   _v);  \
> + _v; \
> + })
> +#define phy_writel(phy, val, _reg)   \
> + do {\
> + writel((val), (phy)->base + _reg((phy)->idx));  \
> + chan_vdbg((phy)->vchan, "writel(0x%08x, %s)\n", \
> +   (u32)(val), #_reg);   \
> + } while (0)
> +#define phy_writel_relaxed(phy, val, _reg)   \
> + do {\
> + writel_relaxed((val), (phy)->base + _reg((phy)->idx));  \
> + chan_vdbg((phy)->vchan, "writel(0x%08x, %s)\n", \
> +   (u32)(val), #_reg);   \
> + } while (0)
> +
> +/*
??
Does this code compile?

> +
> +static struct pxad_phy *lookup_phy(struct pxad_chan *pchan)
> +{
> + int prio, i;
> + struct pxad_device *pdev = to_pxad_dev(pchan->vc.chan.device);
> + struct pxad_phy *phy, *found = NULL;
> + unsigned long flags;
> +
> + /*
> +  * dma channel priorities
> +  * ch 0 - 3,  16 - 19  <--> (0)
> +  * ch 4 - 7,  20 - 23  <--> (1)
> +  * ch 8 - 11, 24 

Re: [PATCH v2 00/20] libnd: non-volatile memory device support

2015-05-07 Thread Williams, Dan J
On Tue, 2015-05-05 at 02:06 +0200, Rafael J. Wysocki wrote:
> On Tuesday, April 28, 2015 06:22:05 PM Dan Williams wrote:
> > On Tue, Apr 28, 2015 at 5:25 PM, Rafael J. Wysocki wrote:
> > > On Tuesday, April 28, 2015 02:24:12 PM Dan Williams wrote:
> > >> Changes since v1 [1]: Incorporates feedback received prior to April 24.
> > >>
> 
> [cut]
> 
> > >
> > > I'm wondering what's wrong with CCing all of the series to linux-acpi?
> > >
> > > Is there anything in it that the people on that list should not see, by
> > > any chance?
> > 
> > linux-acpi may not care about the dimm-metadata labeling patches that
> > are completely independent of ACPI, but might as well include
> > linux-acpi on the whole series at this point.
> 
> I've gone through the ACPI-related patches in this series (other than [2/20]
> that I've commented directly) and while I haven't found anything horrible in
> them, I don't quite feel confident enough to ACK them.
> 
> What I'm really missing in this series is a design document describing all
> that from a high-level perspective and making it clear where all of the pieces
> go and what their respective roles are.  Also reordering the series to introduce
> the nd subsystem to start with and then its users might help here.

Here you go, and also see the "Supporting Documents" section if you need
more details, or just ask.  This is the reworked document after pushing
NFIT specifics out of the core implementation.  The core apis are
nd_bus_register(), nd_dimm_create(), nd_pmem_region_create(), and
nd_blk_region_create().

---

  LIBND: Non-volatile Devices
  libnd - kernel / libndctl - userspace helper library
   linux-nvd...@lists.01.org
  v10


Glossary
Overview
    Supporting Documents
    Git Trees
LIBND PMEM and BLK
    Why BLK?
        PMEM vs BLK
            BLK-REGIONs, PMEM-REGIONs, Atomic Sectors, and DAX
Example NVDIMM Platform
LIBND Kernel Device Model and LIBNDCTL Userspace API
    LIBNDCTL: Context
        libndctl: instantiate a new library context example
    LIBND/LIBNDCTL: Bus
        libnd: control class device in /sys/class
        libnd: bus
        libndctl: bus enumeration example
    LIBND/LIBNDCTL: DIMM (NMEM)
        libnd: DIMM (NMEM)
        libndctl: DIMM enumeration example
    LIBND/LIBNDCTL: Region
        libnd: region
        libndctl: region enumeration example
        Why Not Encode the Region Type into the Region Name?
        How Do I Determine the Major Type of a Region?
    LIBND/LIBNDCTL: Namespace
        libnd: namespace
        libndctl: namespace enumeration example
        libndctl: namespace creation example
        Why the Term "namespace"?
    LIBND/LIBNDCTL: Block Translation Table "btt"
        libnd: btt layout
        libndctl: btt creation example
Summary LIBNDCTL Diagram


Glossary


PMEM: A system physical address range where writes are persistent.  A
block device composed of PMEM is capable of DAX.  A PMEM address range
may span/interleave several DIMMs.

BLK: A set of one or more programmable memory mapped apertures provided
by a DIMM to access its media.  This indirection precludes the
performance benefit of interleaving, but enables DIMM-bounded failure
modes.

DPA: DIMM Physical Address, a DIMM-relative offset.  With one DIMM in
the system there would be a 1:1 system-physical-address:DPA association.
Once more DIMMs are added, a memory controller interleave must be
decoded to determine the DPA associated with a given
system-physical-address.  BLK capacity always has a 1:1 relationship
with a single DIMM's DPA range.
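
To make the DPA definition concrete, here is a minimal sketch of such a
decode in plain C.  Everything in it is an assumption for illustration:
the names spa_to_dpa() and struct dpa_decode are hypothetical, the
interleave is taken to be a simple round-robin of fixed-size lines
across the DIMMs, and the interleave set is assumed to start at
system-physical-address 0.  Real platforms describe their own
interleave, so treat this only as a picture of the idea.

```c
#include <stdint.h>

/* Result of decoding a system physical address (SPA). */
struct dpa_decode {
	unsigned int dimm;	/* index of the DIMM within the interleave set */
	uint64_t dpa;		/* DIMM-relative offset */
};

/*
 * Hypothetical round-robin interleave: consecutive `gran`-byte lines
 * rotate across `ways` DIMMs and the set starts at SPA 0.  Both `ways`
 * and `gran` must be non-zero.
 */
static struct dpa_decode spa_to_dpa(uint64_t spa, unsigned int ways,
				    uint64_t gran)
{
	uint64_t line = spa / gran;	/* interleave line holding the SPA */
	struct dpa_decode d;

	d.dimm = (unsigned int)(line % ways);
	/* Lines owned by one DIMM are packed back to back on that DIMM. */
	d.dpa = (line / ways) * gran + spa % gran;
	return d;
}
```

With ways == 1 the function returns dpa == spa, which is the 1:1 case
described above; with more ways the same SPA lands at a smaller offset
on one particular DIMM.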

DAX: File system extensions to bypass the page cache and block layer to
mmap persistent memory, from a PMEM block device, directly into a
process address space.

BTT: Block Translation Table: Persistent memory is byte addressable.
Existing software may have an expectation that the power-fail-atomicity
of writes is at least one sector, 512 bytes.  The BTT is an indirection
table with atomic update semantics to front a PMEM/BLK block device
driver and present arbitrary atomic sector sizes.

LABEL: Metadata stored on a DIMM device that partitions and identifies
(persistently names) storage between PMEM and BLK.  It also partitions
BLK storage to host BTTs with different parameters per BLK-partition.
Note that traditional partition tables, GPT/MBR, are layered on top of a
BLK or PMEM device.


Overview


The libnd subsystem provides support for three types of NVDIMMs: PMEM,
BLK, and NVDIMM platforms that can simultaneously support PMEM and BLK
mode access capabilities on a given set of DIMMs.  These three modes of
operation are described by the "N

Re: [PATCH v3 1/2] perf/kvm: Port perf kvm to powerpc

2015-05-07 Thread Ingo Molnar

* Hemant Kumar  wrote:

> 
> On 05/08/2015 09:58 AM, Ingo Molnar wrote:
> >* Hemant Kumar  wrote:
> >
> >>  # perf kvm stat report -p 60515
> >>Analyze events for pid(s) 60515, all VCPUs:
> >>
> >>VM-EXIT         Samples Samples%   Time%  Min Time     Max Time    Avg time
> >>
> >>H_DATA_STORAGE     5006   35.30%   0.13%    1.94us      49.46us     12.37us ( +-   0.52% )
> >>HV_DECREMENTER     4457   31.43%   0.02%    0.72us      16.14us      1.91us ( +-   0.96% )
> >>SYSCALL            2690   18.97%   0.10%    2.84us     528.24us     18.29us ( +-   3.75% )
> >>RETURN_TO_HOST     1789   12.61%  99.76%    1.58us  672791.91us  27470.23us ( +-   3.00% )
> >>   EXTERNAL             240    1.69%   0.00%    0.69us      10.67us      1.33us ( +-   5.34% )
> >Where is the last line misaligned? Copy & paste error or does perf kvm
> >produce it in such a way?
> 
> Its a copy-paste error. Thanks for pointing this out.
> 
> Shall I resend the patches with the correct alignment of the o/p?

I don't think that's necessary, as long as the code is fine.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry

2015-05-07 Thread Paul E. McKenney
On Thu, May 07, 2015 at 02:49:39PM +0200, Ingo Molnar wrote:
> 
> * Ingo Molnar  wrote:
> 
> > The TIF_RCU_QS thing is just a fancy way for synchronize_rcu() 
> > (being executed on some other CPU not doing RT work) to 
> > intelligently wait for the remote (RT work doing) CPU to finish 
> > executing kernel code, without polling or so.
> 
> it's basically a cheap IPI being inserted on the remote CPU.
> 
> We need the TIF_RCU_QS callback not just to wait intelligently, but 
> mainly to elapse a grace period, otherwise synchronize_rcu() might not 
> ever make progress: think a SCHED_FIFO task doing some kernel work, 
> synchronize_rcu() stumbling upon it - but the SCHED_FIFO task 
> otherwise never scheduling and never getting any timer irqs either, 
> and thus never entering quiescent state.
> 
> (Cc:-ed Paul too, he might be interested in this as well.)

Hmmm...  So the point is that a NO_HZ_FULL CPU periodically posts
callbacks to indicate that it has passed through a quiescent state,
for example, upon entry to and/or exit from userspace?  These callbacks
would then be offloaded to some other CPU.

But the callback would not be invoked until RCU saw a grace period,
so I must be missing something here...  Probably that the TIF_RCU_QS
callback is not an RCU callback, but something else?

Thanx, Paul



Re: [PATCH v3 1/2] perf/kvm: Port perf kvm to powerpc

2015-05-07 Thread Hemant Kumar


On 05/08/2015 09:58 AM, Ingo Molnar wrote:
>* Hemant Kumar  wrote:
>
>>  # perf kvm stat report -p 60515
>>Analyze events for pid(s) 60515, all VCPUs:
>>
>>VM-EXIT         Samples Samples%   Time%  Min Time     Max Time    Avg time
>>
>>H_DATA_STORAGE     5006   35.30%   0.13%    1.94us      49.46us     12.37us ( +-   0.52% )
>>HV_DECREMENTER     4457   31.43%   0.02%    0.72us      16.14us      1.91us ( +-   0.96% )
>>SYSCALL            2690   18.97%   0.10%    2.84us     528.24us     18.29us ( +-   3.75% )
>>RETURN_TO_HOST     1789   12.61%  99.76%    1.58us  672791.91us  27470.23us ( +-   3.00% )
>>   EXTERNAL             240    1.69%   0.00%    0.69us      10.67us      1.33us ( +-   5.34% )
>
>Where is the last line misaligned? Copy & paste error or does perf kvm
>produce it in such a way?

It's a copy-paste error. Thanks for pointing this out.

Shall I resend the patches with the correct alignment of the o/p?

>Thanks,
>
>	Ingo



--
Thanks,
Hemant Kumar



Re: [PATCH v3 07/11] md/raid5: split bio for chunk_aligned_read

2015-05-07 Thread Ming Lin
On 05/07/2015 09:14 PM, NeilBrown wrote:
> On Wed,  6 May 2015 23:34:17 -0700 Ming Lin  wrote:
> 
>> If a read request fits entirely in a chunk, it will be passed directly to the
>> underlying device (providing it hasn't failed of course).  If it doesn't fit,
>> the slightly less efficient path that uses the stripe_cache is used.
>> Requests that get to the stripe cache are always completely split up as
>> necessary.
>>
>> So with RAID5, ripping out the merge_bvec_fn doesn't cause it to stop work,
>> but could cause it to take the less efficient path more often.
>>
>> All that is needed to manage this is for 'chunk_aligned_read' do some bio
>> splitting, much like the RAID0 code does.
>>
>> Cc: Neil Brown 
>> Cc: linux-r...@vger.kernel.org
>> Signed-off-by: Ming Lin 
>> ---
>>  drivers/md/raid5.c | 42 +-
>>  1 file changed, 37 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index 7f4a717..b18f548 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -4738,7 +4738,7 @@ static void raid5_align_endio(struct bio *bi, int error)
>>  add_bio_to_retry(raid_bi, conf);
>>  }
>>  
>> -static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
>> +static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio)
>>  {
>>  struct r5conf *conf = mddev->private;
>>  int dd_idx;
>> @@ -4747,7 +4747,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
>>  sector_t end_sector;
>>  
>>  if (!in_chunk_boundary(mddev, raid_bio)) {
>> -pr_debug("chunk_aligned_read : non aligned\n");
>> +pr_debug("%s: non aligned\n", __func__);
>>  return 0;
>>  }
>>  /*
>> @@ -4822,6 +4822,36 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
>>  }
>>  }
>>  
>> +static struct bio *chunk_aligned_read(struct mddev *mddev, struct bio *raid_bio)
>> +{
>> +struct bio *split;
>> +
>> +do {
>> +sector_t sector = raid_bio->bi_iter.bi_sector;
>> +unsigned chunk_sects = mddev->chunk_sectors;
>> +unsigned sectors;
>> +
>> +if (likely(is_power_of_2(chunk_sects)))
>> +sectors = chunk_sects - (sector & (chunk_sects-1));
>> +else
>> +sectors = chunk_sects - sector_div(sector, chunk_sects);
> 
> RAID5 doesn't currently allow non-power-of-2 chunks.  So this test is
> pointless, but not really harmful.  Maybe someday we will.
> 
> I'm equally happy for it to stay or go.

Then it's better for it to go.
Thanks.
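
For reference, the two ways of computing "sectors left in this chunk"
that are being discussed can be sketched in plain userspace C.  The
function names below are hypothetical and uint64_t stands in for
sector_t; in the kernel the generic case uses sector_div(), which
returns the remainder.  This is only a sketch of the arithmetic, not the
patch itself.

```c
#include <stdint.h>

/*
 * Sectors from `sector` up to the end of the chunk that contains it.
 * Only valid when chunk_sects is a power of two, because the mask
 * (chunk_sects - 1) then extracts the offset inside the chunk.
 */
static unsigned int sectors_to_chunk_end_pow2(uint64_t sector,
					      unsigned int chunk_sects)
{
	return chunk_sects - (unsigned int)(sector & (chunk_sects - 1));
}

/* Same quantity via a remainder; works for any non-zero chunk size. */
static unsigned int sectors_to_chunk_end_generic(uint64_t sector,
						 unsigned int chunk_sects)
{
	return chunk_sects - (unsigned int)(sector % chunk_sects);
}
```

For a power-of-2 chunk size both functions agree, which is why the
second branch is redundant as long as RAID5 only allows such chunks.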

From d40e9dfaae261cc86170193305e2022d2e1cda1a Mon Sep 17 00:00:00 2001
From: Ming Lin 
Date: Wed, 6 May 2015 22:51:24 -0700
Subject: [PATCH 07/11] md/raid5: split bio for chunk_aligned_read

If a read request fits entirely in a chunk, it will be passed directly to the
underlying device (providing it hasn't failed of course).  If it doesn't fit,
the slightly less efficient path that uses the stripe_cache is used.
Requests that get to the stripe cache are always completely split up as
necessary.

So with RAID5, ripping out the merge_bvec_fn doesn't cause it to stop working,
but could cause it to take the less efficient path more often.

All that is needed to manage this is for 'chunk_aligned_read' to do some bio
splitting, much like the RAID0 code does.

Cc: Neil Brown 
Cc: linux-r...@vger.kernel.org
Acked-by: NeilBrown 
Signed-off-by: Ming Lin 
---
 drivers/md/raid5.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 7f4a717..1978aa9 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4738,7 +4738,7 @@ static void raid5_align_endio(struct bio *bi, int error)
add_bio_to_retry(raid_bi, conf);
 }
 
-static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
+static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio)
 {
struct r5conf *conf = mddev->private;
int dd_idx;
@@ -4747,7 +4747,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
sector_t end_sector;
 
if (!in_chunk_boundary(mddev, raid_bio)) {
-   pr_debug("chunk_aligned_read : non aligned\n");
+   pr_debug("%s: non aligned\n", __func__);
return 0;
}
/*
@@ -4822,6 +4822,31 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
}
 }
 
+static struct bio *chunk_aligned_read(struct mddev *mddev, struct bio *raid_bio)
+{
+   struct bio *split;
+
+   do {
+   sector_t sector = raid_bio->bi_iter.bi_sector;
+   unsigned chunk_sects = mddev->chunk_sectors;
+   unsigned sectors = chunk_sects - (sector & (chunk_sects-1));
+
+   if (sectors < bio_sectors(raid_bio)) {
+   split = bio_split(

[PATCH v4 1/2] arm: perf: Fix callchain parse error with kernel tracepoint events

2015-05-07 Thread Hou Pengyang
For ARM, when tracing with tracepoint events, the IP and cpsr are set
to 0, preventing the perf code from parsing the callchain and resolving
the symbols correctly.

 ./perf record -e sched:sched_switch -g --call-graph dwarf ls
[ perf record: Captured and wrote 0.006 MB perf.data ]
 ./perf report -f
Samples: 5  of event 'sched:sched_switch', Event count (approx.): 5
Children  Self     Command  Shared Object Symbol
100.00%   100.00%  ls   [unknown] [.] 

The fix is to implement perf_arch_fetch_caller_regs for ARM, which fills
several necessary registers used for callchain unwinding, including pc, sp,
fp and cpsr.

With this patch, the callchain can be parsed correctly:

   .
-  100.00%   100.00%  ls   [kernel.kallsyms]  [k] __sched_text_start
   + __sched_text_start
+   20.00% 0.00%  ls   libc-2.18.so   [.] _dl_addr
+   20.00% 0.00%  ls   libc-2.18.so   [.] write
   .

Jean Pihet found this on ARM and came up with a patch:
http://thread.gmane.org/gmane.linux.kernel/1734283/focus=1734280

This patch rewrites Jean's patch in C.

Signed-off-by: Hou Pengyang 
---
 arch/arm/include/asm/perf_event.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
index d9cf138..4f9dec4 100644
--- a/arch/arm/include/asm/perf_event.h
+++ b/arch/arm/include/asm/perf_event.h
@@ -19,4 +19,11 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)  perf_misc_flags(regs)
 #endif
 
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+   (regs)->ARM_pc = (__ip); \
+   (regs)->ARM_fp = (unsigned long) __builtin_frame_address(0); \
+   (regs)->ARM_sp = current_stack_pointer; \
+   (regs)->ARM_cpsr = SVC_MODE; \
+}
+
 #endif /* __ARM_PERF_EVENT_H__ */
-- 
1.8.3.4



Re: [PATCH] mips:Fix build error for ip32_defconfig configuration

2015-05-07 Thread Joshua Kinard
On 05/07/2015 20:52, Nicholas Krause wrote:
> This fixes the make error when building the ip32_defconfig
> configuration due to using sgio2_cmos_devinit rather then
> the correct function,sgio2_rtc_devinit in a device_initcall
> below this function's definition.
> 
> Signed-off-by: Nicholas Krause 
> ---
>  arch/mips/sgi-ip32/ip32-platform.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/mips/sgi-ip32/ip32-platform.c b/arch/mips/sgi-ip32/ip32-platform.c
> index 0134db2..2a7efcb 100644
> --- a/arch/mips/sgi-ip32/ip32-platform.c
> +++ b/arch/mips/sgi-ip32/ip32-platform.c
> @@ -130,9 +130,9 @@ struct platform_device ip32_rtc_device = {
>   .resource   = ip32_rtc_resources,
>  };
>  
> -+static int __init sgio2_rtc_devinit(void)
> +static  __init int sgio2_rtc_devinit(void)
>  {
>   return platform_device_register(&ip32_rtc_device);
>  }
>  
> -device_initcall(sgio2_cmos_devinit);
> +device_initcall(sgio2_rtc_devinit);
> 

I believe I sent this patch in already, back on 04/19/2015.  Didn't get an
acknowledgement on it from akpm, though.

--J


[PATCH v4 0/2] arm & arm64: perf: Fix callchain parse error with kernel tracepoint events

2015-05-07 Thread Hou Pengyang
For arm & arm64, when tracing with tracepoint events, the IP and cpsr
are set to 0, preventing the perf code from parsing the callchain and
resolving the symbols correctly.

These two patches fix this by implementing perf_arch_fetch_caller_regs
for arm and arm64, which fills in the register info needed for
callchain unwinding and symbol resolving.

v3->v4:
 - fix compile errors

v2->v3:
 - split the original patch into two, one for arm and the other arm64;
 - change '|=' to '=' when setting cpsr. 

Hou Pengyang (2):
  arm: perf: Fix callchain parse error with kernel tracepoint events
  arm64: perf: Fix callchain parse error with kernel tracepoint events

 arch/arm/include/asm/perf_event.h   | 7 +++
 arch/arm64/include/asm/perf_event.h | 7 +++
 2 files changed, 14 insertions(+)

-- 
1.8.3.4



[PATCH v4 2/2] arm64: perf: Fix callchain parse error with kernel tracepoint events

2015-05-07 Thread Hou Pengyang
For ARM64, when tracing with tracepoint events, the IP and pstate are set
to 0, preventing the perf code from parsing the callchain and resolving
the symbols correctly.

 ./perf record -e sched:sched_switch -g --call-graph dwarf ls
[ perf record: Captured and wrote 0.146 MB perf.data ]
 ./perf report -f
Samples: 194  of event 'sched:sched_switch', Event count (approx.): 194 
Children  Self     Command  Shared Object Symbol
100.00%   100.00%  ls   [unknown] [.] 

The fix is to implement perf_arch_fetch_caller_regs for ARM64, which fills
several necessary registers used for callchain unwinding, including pc, sp,
fp and pstate.

With this patch, callchain can be parsed correctly as follows:

 ..
+2.63% 0.00%  ls   [kernel.kallsyms]  [k] vfs_symlink
+2.63% 0.00%  ls   [kernel.kallsyms]  [k] follow_down
+2.63% 0.00%  ls   [kernel.kallsyms]  [k] pfkey_get
+2.63% 0.00%  ls   [kernel.kallsyms]  [k] do_execveat_common.isra.33
-2.63% 0.00%  ls   [kernel.kallsyms]  [k] pfkey_send_policy_notify
 pfkey_send_policy_notify
 pfkey_get
 v9fs_vfs_rename
 page_follow_link_light
 link_path_walk
 el0_svc_naked
...

Signed-off-by: Hou Pengyang 
---
 arch/arm64/include/asm/perf_event.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index d26d1d5..6471773 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -24,4 +24,11 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)  perf_misc_flags(regs)
 #endif
 
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+   (regs)->pc = (__ip);\
+   (regs)->regs[AARCH64_INSN_REG_FP] = (unsigned long) __builtin_frame_address(0); \
+   (regs)->sp = current_stack_pointer; \
+   (regs)->pstate = PSR_MODE_EL1h; \
+}
+
 #endif
-- 
1.8.3.4



[PATCH] arm64: bpf: fix signedness bug in loading 64-bit immediate

2015-05-07 Thread Xi Wang
Consider "(u64)insn1.imm << 32 | imm" in the arm64 JIT.  Since imm is
signed 32-bit, it is sign-extended to 64-bit, losing the high 32 bits.
The fix is to convert imm to u32 first and zero-extend it to u64.
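
For illustration only, the conversion problem can be reproduced in plain
C outside the kernel.  The helper names combine_buggy() and
combine_fixed() are hypothetical and not part of the patch; s32/u32/u64
are spelled with <stdint.h> types here.  This is a minimal sketch of the
integer conversions involved, nothing more.

```c
#include <stdint.h>

/*
 * Buggy form: `lo` is a signed 32-bit value, so the usual arithmetic
 * conversions sign-extend it to 64 bits before the OR.  A negative `lo`
 * therefore sets all of the upper 32 bits and wipes out `hi`.
 */
static uint64_t combine_buggy(int32_t hi, int32_t lo)
{
	return (uint64_t)hi << 32 | lo;
}

/*
 * Fixed form: go through uint32_t first so both halves are
 * zero-extended to 64 bits before being combined.
 */
static uint64_t combine_fixed(int32_t hi, int32_t lo)
{
	return ((uint64_t)(uint32_t)hi) << 32 | (uint64_t)(uint32_t)lo;
}
```

With hi = 1 and lo = -1, combine_buggy() yields all ones, while
combine_fixed() yields 0x1ffffffff, whose upper half is the expected 1.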

Also extend test_bpf to catch this JIT bug; the interpreter is correct.

Before:
test_bpf: #58 load 64-bit immediate ret -1 != 1 FAIL (1 times)

After:
test_bpf: #58 load 64-bit immediate 74 PASS

Fixes: 30d3d94cc3d5 ("arm64: bpf: add 'load 64-bit immediate' instruction")
Cc: Zi Shen Lim 
Cc: Alexei Starovoitov 
Cc: Catalin Marinas 
Cc: Will Deacon 
Signed-off-by: Xi Wang 
---
 arch/arm64/net/bpf_jit_comp.c | 2 +-
 lib/test_bpf.c| 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index edba042b2325..14cdc099fda0 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -487,7 +487,7 @@ emit_cond_jmp:
return -EINVAL;
}
 
-   imm64 = (u64)insn1.imm << 32 | imm;
+   imm64 = ((u64)(u32)insn1.imm) << 32 | (u64)(u32)imm;
emit_a64_mov_i64(dst, imm64, ctx);
 
return 1;
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 80d78c51f65f..9f6849891b5f 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -1755,7 +1755,8 @@ static struct bpf_test tests[] = {
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JEQ, R3, 0x1234, 1),
BPF_EXIT_INSN(),
-   BPF_ALU64_IMM(BPF_MOV, R0, 1),
+   BPF_LD_IMM64(R0, 0x1ffffffffLL),
+   BPF_ALU64_IMM(BPF_RSH, R0, 32), /* R0 = 1 */
BPF_EXIT_INSN(),
},
INTERNAL,
-- 
1.9.1



Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

2015-05-07 Thread Ingo Molnar

Al,

I was wondering about the struct page rules of 
iov_iter_get_pages_alloc(), used in various places. There's no 
documentation whatsoever in lib/iov_iter.c, nor in 
include/linux/uio.h, and the changelog that introduced it only says:

 commit 91f79c43d1b54d7154b118860d81b39bad07dfff
 Author: Al Viro 
 Date:   Fri Mar 21 04:58:33 2014 -0400

new helper: iov_iter_get_pages_alloc()

same as iov_iter_get_pages(), except that pages array is allocated
(kmalloc if possible, vmalloc if that fails) and left for caller to
free.  Lustre and NFS ->direct_IO() switched to it.

Signed-off-by: Al Viro 

So if code does iov_iter_get_pages_alloc() on a user address that has 
a real struct page behind it - and some other code does a regular 
get_user_pages() on it, we'll have two sets of struct page 
descriptors, the 'real' one, and a fake allocated one, right?

How does that work? Nobody else can ever discover these fake page 
structs, so they don't really serve any 'real' synchronization purpose 
other than the limited role of IO completion.

Thanks,

Ingo


linux-next: Tree for May 8

2015-05-07 Thread Stephen Rothwell
Hi all,

Changes since 20150507:

New tree : rtc

The ext4 tree still had its build failure so I used the version from
next-20150506.

Non-merge commits (relative to Linus' tree): 2631
 2535 files changed, 117241 insertions(+), 49326 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (this fails its final link) and i386, sparc, sparc64 and arm
defconfig.

Below is a summary of the state of the merge.

I am currently merging 215 trees (counting Linus' and 30 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (3e0283a53f7d Merge tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm)
Merging fixes/master (b94d525e58dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging kbuild-current/rc-fixes (c517d838eb7d Linux 4.0-rc1)
Merging arc-current/for-curr (e4140819dadc ARC: signal handling robustify)
Merging arm-current/fixes (3b8786ff7a1b ARM: 8352/1: perf: Fix the pmu node name in warning message)
Merging m68k-current/for-linus (b24f670b7f5b m68k/mac: Fix out-of-bounds array index in OSS IRQ source initialization)
Merging metag-fixes/fixes (0164a711c97b metag: Fix ioremap_wc/ioremap_cached build errors)
Merging mips-fixes/mips-fixes (1795cd9b3a91 Linux 3.16-rc5)
Merging powerpc-merge-mpe/fixes (0aab3747091d powerpc/powernv: Restore non-volatile CRs after nap)
Merging powerpc-merge/merge (c517d838eb7d Linux 4.0-rc1)
Merging sparc/master (acc455cffa75 sparc64: Setup sysfs to mark LDOM sockets, cores and threads correctly)
Merging net/master (31ccd0e66d41 tcp_westwood: fix tcp_westwood_info())
Merging ipsec/master (bdddbf6996c0 xfrm: fix a race in xfrm_state_lookup_byspi)
Merging sound-current/for-linus (2c674fac5b16 ALSA: hda/realtek - Add ALC298 alias name for Dell)
Merging pci-current/for-linus (5ebe6afaf005 Linux 4.1-rc2)
Merging wireless-drivers/master (f67382186489 ath9k: fix per-packet tx power configuration)
Merging driver-core.current/driver-core-linus (b787f68c36d4 Linux 4.1-rc1)
Merging tty.current/tty-linus (5ebe6afaf005 Linux 4.1-rc2)
Merging usb.current/usb-linus (0d3bba0287d4 cdc-acm: prevent infinite loop when parsing CDC headers.)
Merging usb-gadget-fixes/fixes (c94e289f195e usb: gadget: remove incorrect __init/__exit annotations)
Merging usb-serial-fixes/usb-linus (82ee3aeb9295 USB: visor: Match I330 phone more precisely)
Merging staging.current/staging-linus (b787f68c36d4 Linux 4.1-rc1)
Merging char-misc.current/char-misc-linus (f26443a8ab76 Merge tag 'extcon-fixes-for-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-linus)
Merging input-current/for-linus (48853389f206 Merge branch 'next' into for-linus)
Merging crypto-current/master (f440c4ee3e53 hwrng: bcm63xx - Fix driver compilation)
Merging ide/master (d681f1166919 ide: remove deprecated use of pci api)
Merging devicetree-current/devicetree/merge (41d9489319f2 drivers/of: Add empty ranges quirk for PA-Semi)
Merging rr-fixes/fixes (f47689345931 lguest: update help text.)
Merging vfio-fixes/for-linus (db7d4d7f4021 vfio: Fix runaway interruptible timeout)
Merging kselftest-fixes/fixes (b787f68c36d4 Linux 4.1-rc1)
Merging drm-intel-fixes/for-linux-next-fixes (736a69ca8c99 drm/i915: Drop PIPE-A quirk for 945GSE HP Mini)
Merging asm-generic/master (643165c8bbc8 Merge tag 'uaccess_for_upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost into asm-generic)
Merging arc/for-next (b787f68c36d4 Linux 4.1-rc1)
Merging arm/for-next (3faec0e04e82 Merge b

Re: [PATCH 1/2] md/raid5: avoid duplicate code

2015-05-07 Thread Yuanhan Liu
On Fri, May 08, 2015 at 03:28:00PM +1000, NeilBrown wrote:
> On Wed,  6 May 2015 17:45:49 +0800 Yuanhan Liu wrote:
> 
> > Move the code that puts an idle sh (hot in cache, but happens to be
> > zero referenced) back to the active state into __find_stripe(), because
> > that is what needs to be done every time you invoke __find_stripe().
> > 
> > Moving it there avoids duplicate code, as well as makes a bit more
> > sense, IMO, as it tells a whole story now.
> 
> Thanks for this.  It is a good cleanup.
> 
> However I don't want to make any new changes to the RAID5 code until I find a
> couple of bugs that I'm hunting.  So I won't apply it just yet.
> Remind me in a couple of weeks if I seem to have forgotten.

Got it. Thanks.


--yliu
> 
> > 
> > Signed-off-by: Yuanhan Liu 
> > ---
> >  drivers/md/raid5.c | 50 ++
> >  1 file changed, 18 insertions(+), 32 deletions(-)
> > 
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 77dfd72..e7fa818 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -567,8 +567,25 @@ static struct stripe_head *__find_stripe(struct r5conf *conf, sector_t sector,
> >  
> > pr_debug("__find_stripe, sector %llu\n", (unsigned long long)sector);
> > hlist_for_each_entry(sh, stripe_hash(conf, sector), hash)
> > -   if (sh->sector == sector && sh->generation == generation)
> > +   if (sh->sector == sector && sh->generation == generation) {
> > +   if (!atomic_inc_not_zero(&sh->count)) {
> > +   spin_lock(&conf->device_lock);
> > +   if (!atomic_read(&sh->count)) {
> > +   if (!test_bit(STRIPE_HANDLE, &sh->state))
> > +   atomic_inc(&conf->active_stripes);
> > +   BUG_ON(list_empty(&sh->lru) &&
> > +  !test_bit(STRIPE_EXPANDING, &sh->state));
> > +   list_del_init(&sh->lru);
> > +   if (sh->group) {
> > +   sh->group->stripes_cnt--;
> > +   sh->group = NULL;
> > +   }
> > +   }
> > +   atomic_inc(&sh->count);
> > +   spin_unlock(&conf->device_lock);
> > +   }
> > return sh;
> > +   }
> > pr_debug("__stripe %llu not in cache\n", (unsigned long long)sector);
> > return NULL;
> >  }
> > @@ -698,21 +715,6 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
> > init_stripe(sh, sector, previous);
> > atomic_inc(&sh->count);
> > }
> > -   } else if (!atomic_inc_not_zero(&sh->count)) {
> > -   spin_lock(&conf->device_lock);
> > -   if (!atomic_read(&sh->count)) {
> > -   if (!test_bit(STRIPE_HANDLE, &sh->state))
> > -   atomic_inc(&conf->active_stripes);
> > -   BUG_ON(list_empty(&sh->lru) &&
> > -  !test_bit(STRIPE_EXPANDING, &sh->state));
> > -   list_del_init(&sh->lru);
> > -   if (sh->group) {
> > -   sh->group->stripes_cnt--;
> > -   sh->group = NULL;
> > -   }
> > -   }
> > -   atomic_inc(&sh->count);
> > -   spin_unlock(&conf->device_lock);
> > }
> > } while (sh == NULL);
> >  
> > @@ -771,22 +773,6 @@ static void stripe_add_to_batch_list(struct r5conf *conf, struct stripe_head *sh
> > hash = stripe_hash_locks_hash(head_sector);
> > spin_lock_irq(conf->hash_locks + hash);
> > head = __find_stripe(conf, head_sector, conf->generation);
> > -   if (head && !atomic_inc_not_zero(&head->count)) {
> > -   spin_lock(&conf->device_lock);
> > -   if (!atomic_read(&head->count)) {
> > -   if (!test_bit(STRIPE_HANDLE, &head->state))
> > -   atomic_inc(&conf->active_stripes);
> > -   BUG_ON(list_empty(&head->lru) &&
> > -  !test_bit(STRIPE_EXPANDING, &head->state));
> > -   list_del_init(&head->lru);
> > -   if (head->group) {
> > -   head->group->stripes_cnt--;
> > -   head->group = NULL;
> > -   }
> > -   }
> > -   atomic_inc(&head->count);
> > -   spin_unlock(&conf->device_lock);
> > -   }
> > spin_unlock_irq(conf->hash_locks + hash);
> >  
> > if (!head)
> 


--
To unsubscribe from this list: send the line "unsubscribe

Re: [PATCHv4 12/12] phy: add driver for TI TUSB1210 ULPI PHY

2015-05-07 Thread Kishon Vijay Abraham I



On Thursday 07 May 2015 11:49 AM, Heikki Krogerus wrote:

TUSB1210 ULPI PHY has vendor specific register for eye
diagram tuning. On some platforms the system firmware has
set optimized value to it. In order to not lose the
optimized value, the driver stores it during probe and
restores it every time the PHY is powered back on.

Signed-off-by: Heikki Krogerus 
Acked-by: David Cohen 
Cc: Kishon Vijay Abraham I 


Acked-by: Kishon Vijay Abraham I 

---
  drivers/phy/Kconfig|   7 +++
  drivers/phy/Makefile   |   1 +
  drivers/phy/phy-tusb1210.c | 153 +
  3 files changed, 161 insertions(+)
  create mode 100644 drivers/phy/phy-tusb1210.c

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index a53bd5b..fceac96 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -309,4 +309,11 @@ config PHY_QCOM_UFS
help
  Support for UFS PHY on QCOM chipsets.

+config PHY_TUSB1210
+   tristate "TI TUSB1210 ULPI PHY module"
+   depends on USB_ULPI_BUS
+   select GENERIC_PHY
+   help
+ Support for TI TUSB1210 USB ULPI PHY.
+
  endmenu
diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
index f126251..0a20418 100644
--- a/drivers/phy/Makefile
+++ b/drivers/phy/Makefile
@@ -40,3 +40,4 @@ obj-$(CONFIG_PHY_STIH41X_USB) += phy-stih41x-usb.o
  obj-$(CONFIG_PHY_QCOM_UFS)+= phy-qcom-ufs.o
  obj-$(CONFIG_PHY_QCOM_UFS)+= phy-qcom-ufs-qmp-20nm.o
  obj-$(CONFIG_PHY_QCOM_UFS)+= phy-qcom-ufs-qmp-14nm.o
+obj-$(CONFIG_PHY_TUSB1210) += phy-tusb1210.o
diff --git a/drivers/phy/phy-tusb1210.c b/drivers/phy/phy-tusb1210.c
new file mode 100644
index 000..07efdd3
--- /dev/null
+++ b/drivers/phy/phy-tusb1210.c
@@ -0,0 +1,153 @@
+/**
+ * tusb1210.c - TUSB1210 USB ULPI PHY driver
+ *
+ * Copyright (C) 2015 Intel Corporation
+ *
+ * Author: Heikki Krogerus 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+
+#include "ulpi_phy.h"
+
+#define TUSB1210_VENDOR_SPECIFIC2  0x80
+#define TUSB1210_VENDOR_SPECIFIC2_IHSTX_SHIFT  0
+#define TUSB1210_VENDOR_SPECIFIC2_ZHSDRV_SHIFT 4
+#define TUSB1210_VENDOR_SPECIFIC2_DP_SHIFT 6
+
+struct tusb1210 {
+   struct ulpi *ulpi;
+   struct phy *phy;
+   struct gpio_desc *gpio_reset;
+   struct gpio_desc *gpio_cs;
+   u8 vendor_specific2;
+};
+
+static int tusb1210_power_on(struct phy *phy)
+{
+   struct tusb1210 *tusb = phy_get_drvdata(phy);
+
+   gpiod_set_value_cansleep(tusb->gpio_reset, 1);
+   gpiod_set_value_cansleep(tusb->gpio_cs, 1);
+
+   /* Restore the optional eye diagram optimization value */
+   if (tusb->vendor_specific2)
+   ulpi_write(tusb->ulpi, TUSB1210_VENDOR_SPECIFIC2,
+  tusb->vendor_specific2);
+
+   return 0;
+}
+
+static int tusb1210_power_off(struct phy *phy)
+{
+   struct tusb1210 *tusb = phy_get_drvdata(phy);
+
+   gpiod_set_value_cansleep(tusb->gpio_reset, 0);
+   gpiod_set_value_cansleep(tusb->gpio_cs, 0);
+
+   return 0;
+}
+
+static struct phy_ops phy_ops = {
+   .power_on = tusb1210_power_on,
+   .power_off = tusb1210_power_off,
+   .owner = THIS_MODULE,
+};
+
+static int tusb1210_probe(struct ulpi *ulpi)
+{
+   struct gpio_desc *gpio;
+   struct tusb1210 *tusb;
+   u8 val, reg;
+   int ret;
+
+   tusb = devm_kzalloc(&ulpi->dev, sizeof(*tusb), GFP_KERNEL);
+   if (!tusb)
+   return -ENOMEM;
+
+   gpio = devm_gpiod_get(&ulpi->dev, "reset");
+   if (!IS_ERR(gpio)) {
+   ret = gpiod_direction_output(gpio, 0);
+   if (ret)
+   return ret;
+   gpiod_set_value_cansleep(gpio, 1);
+   tusb->gpio_reset = gpio;
+   }
+
+   gpio = devm_gpiod_get(&ulpi->dev, "cs");
+   if (!IS_ERR(gpio)) {
+   ret = gpiod_direction_output(gpio, 0);
+   if (ret)
+   return ret;
+   gpiod_set_value_cansleep(gpio, 1);
+   tusb->gpio_cs = gpio;
+   }
+
+   /*
+* VENDOR_SPECIFIC2 register in TUSB1210 can be used for configuring eye
+* diagram optimization and DP/DM swap.
+*/
+
+   /* High speed output drive strength configuration */
+   device_property_read_u8(&ulpi->dev, "ihstx", &val);
+   reg = val << TUSB1210_VENDOR_SPECIFIC2_IHSTX_SHIFT;
+
+   /* High speed output impedance configuration */
+   device_property_read_u8(&ulpi->dev, "zhsdrv", &val);
+   reg |= val << TUSB1210_VENDOR_SPECIFIC2_ZHSDRV_SHIFT;
+
+   /* DP/DM swap control */
+   device_property_read_u8(&ulpi->dev, "datapolarity", &val);
+   reg |= val << TUSB1210_VENDOR_SPECIFIC2_DP_SHIFT;
+
+   if (reg) {
+   ulpi_write(ulpi, TUSB1210_VENDOR

Re: [PATCHv4 11/12] phy: helpers for USB ULPI PHY registering

2015-05-07 Thread Kishon Vijay Abraham I



On Friday 08 May 2015 12:25 AM, Felipe Balbi wrote:

On Thu, May 07, 2015 at 09:19:31AM +0300, Heikki Krogerus wrote:

ULPI PHYs need to be bound to their controllers with a
lookup. This adds helpers that the ULPI drivers can use to
do both, the registration of the PHY and the lookup, at the
same time.

Signed-off-by: Heikki Krogerus 
Acked-by: David Cohen 
Cc: Kishon Vijay Abraham I 


Kishon, need your Acked-by here and on the following patch. I think it's
easier to merge it through my tree although there is no real hard
dependency, apparently.


Actually the next patch depends on USB_ULPI_BUS ;-)

Acked-by: Kishon Vijay Abraham I 



---
  drivers/phy/ulpi_phy.h | 31 +++
  1 file changed, 31 insertions(+)
  create mode 100644 drivers/phy/ulpi_phy.h

diff --git a/drivers/phy/ulpi_phy.h b/drivers/phy/ulpi_phy.h
new file mode 100644
index 000..ac49fb6
--- /dev/null
+++ b/drivers/phy/ulpi_phy.h
@@ -0,0 +1,31 @@
+#include 
+
+/**
+ * Helper that registers PHY for a ULPI device and adds a lookup for binding it
+ * and it's controller, which is always the parent.
+ */
+static inline struct phy
+*ulpi_phy_create(struct ulpi *ulpi, struct phy_ops *ops)
+{
+   struct phy *phy;
+   int ret;
+
+   phy = phy_create(&ulpi->dev, NULL, ops);
+   if (IS_ERR(phy))
+   return phy;
+
+   ret = phy_create_lookup(phy, "usb2-phy", dev_name(ulpi->dev.parent));
+   if (ret) {
+   phy_destroy(phy);
+   return ERR_PTR(ret);
+   }
+
+   return phy;
+}
+
+/* Remove a PHY that was created with ulpi_phy_create() and it's lookup. */
+static inline void ulpi_phy_destroy(struct ulpi *ulpi, struct phy *phy)
+{
+   phy_remove_lookup(phy, "usb2-phy", dev_name(ulpi->dev.parent));
+   phy_destroy(phy);
+}
--
2.1.4




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[EDT] oom_killer: find bulkiest task based on pss value

2015-05-07 Thread Yogesh Narayan Gaur

Hi Andrew,

Presently oom_kill.c calculates the badness score of a candidate victim task
from the task's current RSS counter value.
The RSS counter value of a task is usually '[Private (Dirty/Clean)] + [Shared
(Dirty/Clean)]' for that task.
We have encountered a situation where the Private values are small but the
Shared values are large, which makes the total RSS counter value large.
When an OOM situation occurs later, the task with the highest RSS value is
killed, but because its Private values are not large, the memory gained by
killing it is less than expected.

For example, take the use case below, in which 3 processes are running in the
system.
Each process mmaps a file that exists in the current directory and then copies
data from this file to locally allocated buffers in a while(1) loop with some
sleep. Two of the 3 processes map the file with MAP_SHARED and one maps it
with MAP_PRIVATE.
I ran all 3 processes in the background and checked their RSS/PSS values with a
user-space utility (a wrapper over cat /proc/pid/smaps).
Before OOM, the memory consumed by these 3 processes is shown below (all
processes run with oom_score_adj = 0).

Comm : 1prg,  Pid : 213 (values in kB)
             Rss     Shared  Private  Pss
  Process :  375764  194596  181168   278460

Comm : 3prg,  Pid : 217 (values in kB)
             Rss     Shared  Private  Pss
  Process :  305760  32      305728   305738

Comm : 2prg,  Pid : 218 (values in kB)
             Rss     Shared  Private  Pss
  Process :  389980  194596  195384   292676


Thus with the present code, process [2prg : 218] would be selected first as the
bulkiest process to kill, because its RSS value is the highest. But killing this
process frees only ~195MB, compared with the expected ~389MB.
Thus identifying the task by RSS value is not accurate, and killing the
identified process does not release the expected memory back to the system.

We need to select the victim task based on PSS instead of RSS, as the PSS value
is calculated as
PSS value = [Private (Dirty/Clean)] + [Shared (Dirty/Clean) / no. of tasks
sharing]
For the above use case, process [3prg : 217] has the largest PSS value, and
killing it gains the most memory (~305MB) compared with killing the process
identified by RSS value.
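
The comparison above can be sketched in user-space C. This is only an illustration using the figures from the tables, not kernel code: the struct, the function names and the `sharers` counts are assumptions made for the example, and the kernel accounts PSS per page, so the estimates differ from the reported Pss values by a few kB.

```c
#include <stddef.h>

/* Per-process memory figures in kB, taken from the tables above. */
struct proc_mem {
    const char *comm;
    long shared_kb;
    long private_kb;
    int sharers;    /* assumed number of tasks mapping the shared pages */
};

/* RSS as described in the mail: private plus shared. */
static long rss_kb(const struct proc_mem *p)
{
    return p->private_kb + p->shared_kb;
}

/* PSS estimate: private plus the task's proportional share of shared pages. */
static long pss_estimate_kb(const struct proc_mem *p)
{
    return p->private_kb + p->shared_kb / p->sharers;
}

/* Index of the entry with the largest metric value, or -1 if n == 0. */
static int pick_victim(const struct proc_mem *procs, size_t n,
                       long (*metric)(const struct proc_mem *))
{
    int best = -1;

    for (size_t i = 0; i < n; i++)
        if (best < 0 || metric(&procs[i]) > metric(&procs[best]))
            best = (int)i;
    return best;
}

/* The sharers column is a guess: two MAP_SHARED tasks share the file pages. */
static const struct proc_mem example_procs[] = {
    { "1prg", 194596, 181168, 2 },
    { "3prg",     32, 305728, 3 },
    { "2prg", 194596, 195384, 2 },
};
```

With these inputs, selecting by `rss_kb` picks 2prg while selecting by `pss_estimate_kb` picks 3prg, which matches the argument made above.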

--
Regards,
Yogesh Gaur.

[PATCH v3] manpage: update FALLOC_FL_INSERT_RANGE flag in fallocate

2015-05-07 Thread Namjae Jeon
Update FALLOC_FL_INSERT_RANGE flag in fallocate.

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 man2/fallocate.2 | 89 
 1 file changed, 84 insertions(+), 5 deletions(-)

diff --git a/man2/fallocate.2 b/man2/fallocate.2
index 0cc1a00..0d31027 100644
--- a/man2/fallocate.2
+++ b/man2/fallocate.2
@@ -228,6 +228,59 @@ ext4, for extent-based files (since Linux 3.15)
 .IP *
 SMB3 (since Linux 3.17)
 .\" commit 30175628bf7f521e9ee31ac98fa6d6fe7441a556
+.SS Increasing file space
+flag (available since Linux 4.1)
+.\" commit dd46c787788d5bf5b974729d43e4c405814a4c7d
+Specifying the
+.BR FALLOC_FL_INSERT_RANGE
+flag in
+.I mode
+will increase the file space by inserting a hole within the file size without
+overwriting any existing data.
+The hole will start at
+.I offset
+and continue for
+.I len
+bytes.
+For inserting hole inside file, the contents of the file starting at
+.I offset
+will be shifted towards right by
+.I len
+bytes.
+Inserting a hole inside the file will increase the file size by
+.I len
+bytes.
+
+This mode has the same limitation as
+.BR FALLOC_FL_COLLAPSE_RANGE
+regarding the
+granularity of the operation.
+If the granularity requirements are not met,
+.BR fallocate ()
+will fail with the error
+.BR EINVAL.
+If the
+.I offset
+is greater than or equal to the end of file, an error is
+returned.
+For such type of operations, i.e. inserting a hole at the end of file,
+.BR ftruncate(2)
+should be used.
+In case
+.IR offset + len
+exceeds the maximum file size, errno will be set to
+.B EFBIG.
+
+No other flags may be specified in
+.IR mode
+in conjunction with
+.BR FALLOC_FL_INSERT_RANGE .
+
+As of Linux 4.1,
+.B FALLOC_FL_INSERT_RANGE
+is supported by
+XFS.
+.\" commit a904b1ca5751faf5ece8600e18cd3b674afcca1b
 .SH RETURN VALUE
 On success,
 .BR fallocate ()
@@ -245,6 +298,12 @@ is not a valid file descriptor, or is not opened for 
writing.
 .IR offset + len
 exceeds the maximum file size.
 .TP
+.B EFBIG
+.I mode
+is
+.BR FALLOC_FL_INSERT_RANGE ,
+the current file size+len exceeds the maximum file size.
+.TP
 .B EINTR
 A signal was caught during execution.
 .TP
@@ -273,7 +332,17 @@ reaches or passes the end of the file.
 .B EINVAL
 .I mode
 is
-.BR FALLOC_FL_COLLAPSE_RANGE ,
+.BR FALLOC_FL_INSERT_RANGE
+and the range specified by
+.I offset
+reaches or passes the end of the file.
+.TP
+.B EINVAL
+.I mode
+is
+.BR FALLOC_FL_COLLAPSE_RANGE
+or
+.BR FALLOC_FL_INSERT_RANGE ,
 but either
 .I offset
 or
@@ -282,18 +351,24 @@ is not a multiple of the filesystem block size.
 .TP
 .B EINVAL
 .I mode
-contains both
+contains either of
 .B FALLOC_FL_COLLAPSE_RANGE
+or
+.B FALLOC_FL_INSERT_RANGE
 and other flags;
 no other flags are permitted with
-.BR FALLOC_FL_COLLAPSE_RANGE .
+.BR FALLOC_FL_COLLAPSE_RANGE
+or
+.BR FALLOC_FL_INSERT_RANGE .
 .TP
 .B EINVAL
 .I mode
 is
 .BR FALLOC_FL_COLLAPSE_RANGE
 or
-.BR FALLOC_FL_ZERO_RANGE ,
+.BR FALLOC_FL_ZERO_RANGE
+or
+.BR FALLOC_FL_INSERT_RANGE ,
 but the file referred to by
 .I fd
 is not a regular file.
@@ -345,6 +420,8 @@ specifies
 .BR FALLOC_FL_PUNCH_HOLE
 or
 .BR FALLOC_FL_COLLAPSE_RANGE
+or
+.BR FALLOC_FL_INSERT_RANGE
 and
 the file referred to by
 .I fd
@@ -363,7 +440,9 @@ refers to a pipe or FIFO.
 .B ETXTBSY
 .I mode
 specifies
-.BR FALLOC_FL_COLLAPSE_RANGE ,
+.BR FALLOC_FL_COLLAPSE_RANGE
+or
+.BR FALLOC_FL_INSERT_RANGE ,
 but the file referred to by
 .IR fd
 is currently being executed.
-- 
1.8.5.5
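
As a reading aid for the rules this patch documents, here is a small user-space sketch of the argument checks. It is not the kernel's implementation: the function name, the parameters and the order of the checks are assumptions made for illustration. The actual request would be issued as fallocate(fd, FALLOC_FL_INSERT_RANGE, offset, len).

```c
#include <errno.h>

/*
 * Returns 0 if the request passes the checks described in the text above,
 * otherwise the errno value the text associates with the failed check.
 * blk_size and max_file_size are supplied by the caller in this sketch.
 */
static int insert_range_check(long long offset, long long len,
                              long long file_size, long long blk_size,
                              long long max_file_size)
{
    if (offset < 0 || len <= 0 || blk_size <= 0)
        return EINVAL;
    if (offset % blk_size != 0 || len % blk_size != 0)
        return EINVAL;      /* granularity requirement not met */
    if (offset >= file_size)
        return EINVAL;      /* at or past EOF: ftruncate(2) is the tool */
    if (len > max_file_size - file_size)
        return EFBIG;       /* file size + len would exceed the maximum */
    return 0;
}
```

For a 16384-byte file with 4096-byte blocks, inserting 8192 bytes at offset 4096 passes, while an offset of 4097 or 16384 yields EINVAL.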



RE: [PATCH 09/10] usb:fsl:otg: Resolve OTG crash issue with another host

2015-05-07 Thread Ramneek Mehresh


> -Original Message-
> From: Sergei Shtylyov [mailto:sergei.shtyl...@cogentembedded.com]
> Sent: Thursday, May 07, 2015 6:32 PM
> To: Mehresh Ramneek-B31383; linux-kernel@vger.kernel.org
> Cc: ba...@ti.com; linux-...@vger.kernel.org; st...@rowland.harvard.edu;
> gre...@linuxfoundation.org
> Subject: Re: [PATCH 09/10] usb:fsl:otg: Resolve OTG crash issue with another
> host
> 
> Hello.
> 
> On 5/7/2015 3:47 PM, Ramneek Mehresh wrote:
> 
> > Resolves kernel crash issue when a USB flash drive is inserted into
> > USB1 port with USB2 port configured as otg. Removing "else" block so
> > that the controller coming up in "non-otg" mode doesn't return
> > -ENODEV. Returning "ENODEV" results in platform framework unbinding
> > platform-drv from controller resulting in kernel crash later in hub
> > driver
> 
> > Signed-off-by: Ramneek Mehresh 
> > ---
> >   drivers/usb/host/ehci-fsl.c | 3 ---
> >   1 file changed, 3 deletions(-)
> 
> > diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
> > index 4bd4b0c..8d55f2b 100644
> > --- a/drivers/usb/host/ehci-fsl.c
> > +++ b/drivers/usb/host/ehci-fsl.c
> > @@ -180,9 +180,6 @@ static int usb_hcd_fsl_probe(const struct hc_driver
> *driver,
> > }
> >
> > ehci_fsl->have_hcd = 1;
> > -   } else {
> > -   dev_err(&pdev->dev, "wrong operating mode\n");
> > -   return -ENODEV;
> 
>Isn't it easier to just not add this code in the patch #7?
> 
Will do, thanks
> WBR, Sergei



Re: [PATCH 1/2] md/raid5: avoid duplicate code

2015-05-07 Thread NeilBrown
On Wed,  6 May 2015 17:45:49 +0800 Yuanhan Liu 
wrote:

> Move the code that puts an idle sh (hot in cache, but happening to have a
> zero reference count) back into the active state into __find_stripe(),
> because that is what needs to be done every time __find_stripe() is invoked.
> 
> Moving it there avoids duplicate code, as well as makes a bit more
> sense, IMO, as it tells a whole story now.
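
The fast path that the patch moves into __find_stripe() relies on an "increment unless zero" operation. A minimal user-space sketch of that idea with C11 atomics follows; it is an analogue for illustration only, not the kernel's atomic_inc_not_zero(), and the function name is made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Increment *v unless it is zero; returns true if the increment happened. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure the CAS reloads 'old' with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;   /* count was zero: the caller must take the slow path */
}
```

When this returns false, the stripe is idle, and the patch takes device_lock to pull it off the lru list before bumping the count.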

Thanks for this.  It is a good cleanup.

However I don't want to make any new changes to the RAID5 code until I find a
couple of bugs that I'm hunting.  So I won't apply it just yet.
Remind me in a couple of weeks if I seem to have forgotten.

Thanks,
NeilBrown


> 
> Signed-off-by: Yuanhan Liu 
> ---
>  drivers/md/raid5.c | 50 ++
>  1 file changed, 18 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 77dfd72..e7fa818 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -567,8 +567,25 @@ static struct stripe_head *__find_stripe(struct r5conf 
> *conf, sector_t sector,
>  
>   pr_debug("__find_stripe, sector %llu\n", (unsigned long long)sector);
>   hlist_for_each_entry(sh, stripe_hash(conf, sector), hash)
> - if (sh->sector == sector && sh->generation == generation)
> + if (sh->sector == sector && sh->generation == generation) {
> + if (!atomic_inc_not_zero(&sh->count)) {
> + spin_lock(&conf->device_lock);
> + if (!atomic_read(&sh->count)) {
> + if (!test_bit(STRIPE_HANDLE, 
> &sh->state))
> + 
> atomic_inc(&conf->active_stripes);
> + BUG_ON(list_empty(&sh->lru) &&
> +!test_bit(STRIPE_EXPANDING, 
> &sh->state));
> + list_del_init(&sh->lru);
> + if (sh->group) {
> + sh->group->stripes_cnt--;
> + sh->group = NULL;
> + }
> + }
> + atomic_inc(&sh->count);
> + spin_unlock(&conf->device_lock);
> + }
>   return sh;
> + }
>   pr_debug("__stripe %llu not in cache\n", (unsigned long long)sector);
>   return NULL;
>  }
> @@ -698,21 +715,6 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
>   init_stripe(sh, sector, previous);
>   atomic_inc(&sh->count);
>   }
> - } else if (!atomic_inc_not_zero(&sh->count)) {
> - spin_lock(&conf->device_lock);
> - if (!atomic_read(&sh->count)) {
> - if (!test_bit(STRIPE_HANDLE, &sh->state))
> - atomic_inc(&conf->active_stripes);
> - BUG_ON(list_empty(&sh->lru) &&
> -!test_bit(STRIPE_EXPANDING, &sh->state));
> - list_del_init(&sh->lru);
> - if (sh->group) {
> - sh->group->stripes_cnt--;
> - sh->group = NULL;
> - }
> - }
> - atomic_inc(&sh->count);
> - spin_unlock(&conf->device_lock);
>   }
>   } while (sh == NULL);
>  
> @@ -771,22 +773,6 @@ static void stripe_add_to_batch_list(struct r5conf 
> *conf, struct stripe_head *sh
>   hash = stripe_hash_locks_hash(head_sector);
>   spin_lock_irq(conf->hash_locks + hash);
>   head = __find_stripe(conf, head_sector, conf->generation);
> - if (head && !atomic_inc_not_zero(&head->count)) {
> - spin_lock(&conf->device_lock);
> - if (!atomic_read(&head->count)) {
> - if (!test_bit(STRIPE_HANDLE, &head->state))
> - atomic_inc(&conf->active_stripes);
> - BUG_ON(list_empty(&head->lru) &&
> -!test_bit(STRIPE_EXPANDING, &head->state));
> - list_del_init(&head->lru);
> - if (head->group) {
> - head->group->stripes_cnt--;
> - head->group = NULL;
> - }
> - }
> - atomic_inc(&head->count);
> - spin_unlock(&conf->device_lock);
> - }
>   spin_unlock_irq(conf->hash_locks + hash);
>  
>   if (!head)





Re: [PATCH v2 2/2] watchdog: dw_wdt: keepalive the watchdog at write time

2015-05-07 Thread Dmitry Torokhov
On Thu, May 07, 2015 at 09:27:45PM -0700, Doug Anderson wrote:
> If you've got code that does this in a tight loop
>   1. Open watchdog
>   2. Send 'expect close'
>   3. Close watchdog
> ...you'll eventually trigger a watchdog reset.  You can reproduce this
> by using daisydog (1) and running:
>   while true; do daisydog -c > /dev/null; done
> 
> The problem is that each time you write to the watchdog for 'expect
> close' it moves the timer .5 seconds out.  The timer thus never fires
> and never pats the watchdog for you.
> 
> 1: http://git.chromium.org/gitweb/?p=chromiumos/third_party/daisydog.git
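
The starvation described above can be modelled with a toy simulation. This is only a sketch under simplifying assumptions (millisecond ticks, a write at a fixed interval, a timer that re-arms itself when it fires); the function name and parameters are made up for the example and are not part of the driver.

```c
/*
 * Count how often a timer with period timeout_ms fires during duration_ms
 * when a write every write_interval_ms pushes its deadline out again.
 */
static int count_timer_fires(int write_interval_ms, int timeout_ms,
                             int duration_ms)
{
    int deadline = timeout_ms;
    int fires = 0;

    for (int now = 1; now <= duration_ms; now++) {
        if (now % write_interval_ms == 0) {
            deadline = now + timeout_ms;    /* write re-arms the timer */
        } else if (now == deadline) {
            fires++;                        /* timer runs and pats the dog */
            deadline = now + timeout_ms;    /* then re-arms itself */
        }
    }
    return fires;
}
```

With writes every 100 ms and a 500 ms timeout the timer never fires, which is the situation the added dw_wdt_keepalive() call in the write path addresses.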
> 
> Signed-off-by: Doug Anderson 
> Reviewed-by: Guenter Roeck 
> Tested-by: Jisheng Zhang 

Reviewed-by: Dmitry Torokhov 

> ---
> Changes in v2: None
> 
>  drivers/watchdog/dw_wdt.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/watchdog/dw_wdt.c b/drivers/watchdog/dw_wdt.c
> index a284abd..6ea0634 100644
> --- a/drivers/watchdog/dw_wdt.c
> +++ b/drivers/watchdog/dw_wdt.c
> @@ -215,6 +215,7 @@ static ssize_t dw_wdt_write(struct file *filp, const char 
> __user *buf,
>   }
>  
>   dw_wdt_set_next_heartbeat();
> + dw_wdt_keepalive();
>   mod_timer(&dw_wdt.timer, jiffies + WDT_TIMEOUT);
>  
>   return len;
> -- 
> 2.2.0.rc0.207.ga3a616c
> 

-- 
Dmitry


Re: [PATCH v4 04/10] eeprom: Add a simple EEPROM framework for eeprom consumers

2015-05-07 Thread Sascha Hauer
On Tue, May 05, 2015 at 12:46:32PM +0100, Srinivas Kandagatla wrote:
> Hi Stephen,
> 
> Sorry I took so long to reply.
> 
> 
> On 09/04/15 15:45, Stephen Boyd wrote:
> >On 04/07, Srinivas Kandagatla wrote:
> >>On 07/04/15 19:45, Stephen Boyd wrote:
> >>>On 03/30, Srinivas Kandagatla wrote:
> >>>
> >>>Do you have an overview of how to use these APIs? Maybe some
> >>>Documentation/ is in order? I'm mostly interested in how the
> >>>blocks array is supposed to work and how this hooks up to drivers
> >>>that are using DT.
> >>
> >>Only doc ATM is function level kernel doc in c file.
> >>May be I can explain you for now and I will try to add some
> >>documentation with some usage examples in next version.
> >
> >Thanks.
> >
> >>
> >>eeprom block array is just another way intended to get hold of
> >>eeprom content for non-DT providers/consumers, but DT
> >>consumers/providers can also use it. As of today SOC/mach level code
> >>could use it as well.
> >>
> >>In eeprom_cell_get() case the lookup of provider is done based on
> >>provider name, this provider name is generally supplied by all the
> >>providers (both DT/non DT).
> >>
> >>for example in qfprom case,
> >>provider is registered from DT with eeprom config containing a unique name:
> >>static struct eeprom_config econfig = {
> >>.name = "qfprom",
> >>.id = 0,
> >>};
> >>
> >>In the consumer case, the tsens driver could do some like in non DT way:
> >>
> >>struct eeprom_block blocks[4] ={
> >>{
> >>.offset = 0x404,
> >>.count = 0x4,
> >>},
> >>{
> >>.offset = 0x408,
> >>.count = 0x4,
> >>},
> >>{
> >>.offset = 0x40c,
> >>.count = 0x4,
> >>},
> >>{
> >>.offset = 0x410,
> >>.count = 0x4,
> >>},
> >>};
> >>calib_cell = eeprom_cell_get("qfprom0", blocks, 4);
> >>
> >>
> >>Or in DT way
> >>calib_cell  = of_eeprom_cell_get(np, "calib");
> >>
> >
> >Ok. I guess I was hoping for a more device centric approach like
> >we have for clks/regulators/etc. That way a driver doesn't need
> >to know it's using DT or not to figure out which API to call.
> 
> That would be the best. Its easy to wrap up whats in eeprom core to
> eeprom_get_cell(dev, "cell-name") for both DT and non-dt cases, if I
> remove the nasty global name space thing.
> 
> Only thing which is limiting it is the existing bindings which are
> just phandle based. For eeprom to be more of device centric we need
> more
> generic bindings/property names like
> 
> nvrom-cell = <&abc>, <&edf>
> nvrom-cell-names = "cell1", "cell2";
> 
> Also we can have name associated to each eeprom cell which would
> help for non-dt cases. So they can just lookup by the cell name.
> 
> 
> Sacha, Are you ok with such binding?  As this can provide a single
> interface for dt and non-dt. I remember you requested for changing
> from generic properties to specific property names.

Yes, I am fine with such a binding. The same type of binding is used for
clocks and other stuff already, so it has proven good and people are
familiar with it.
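
To make the name-based lookup concrete, here is a small stand-alone sketch. All names in it are hypothetical and none of this is the proposed eeprom API; the cell names and offsets are borrowed from the examples quoted above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cell description: a named region of the EEPROM contents. */
struct demo_cell {
    const char *name;
    unsigned int offset;
    unsigned int count;
};

/* Look a cell up by name; returns NULL if no cell has that name. */
static const struct demo_cell *demo_cell_find(const struct demo_cell *cells,
                                              size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(cells[i].name, name) == 0)
            return &cells[i];
    return NULL;
}

/* Example table, as a non-DT provider might register it. */
static const struct demo_cell demo_cells[] = {
    { "cell1", 0x404, 0x4 },
    { "cell2", 0x408, 0x4 },
};
```

A consumer would then ask only for "cell1" or "cell2", regardless of whether the table came from DT or from board code.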

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: [PATCH v2 1/2] watchdog: dw_wdt: No need for a spinlock

2015-05-07 Thread Dmitry Torokhov
On Thu, May 07, 2015 at 09:27:44PM -0700, Doug Anderson wrote:
> Right now the dw_wdt uses a spinlock to protect dw_wdt_open().  The
> problem is that while holding the spinlock we call:
> -> dw_wdt_set_top()
>-> dw_wdt_top_in_seconds()
>   -> clk_get_rate()
>  -> clk_prepare_lock()
> -> mutex_lock()
> 
> Locking a mutex while holding a spinlock is not allowed and leads to
> warnings like "BUG: spinlock wrong CPU on CPU#1", among other
> problems.
> 
> There's no reason to use a spinlock.  Only dw_wdt_open() was protected
> and the test_and_set_bit() at the start of that function protects us
> anyway.
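
The single-open guard that makes the spinlock unnecessary can be sketched in user space with a C11 atomic flag. This is an analogue of the test_and_set_bit() pattern, not the driver code; the function names and the -EBUSY return value are assumptions made for the example.

```c
#include <errno.h>
#include <stdatomic.h>

static atomic_flag wdt_in_use = ATOMIC_FLAG_INIT;

/* Only the first caller gets past the flag; later callers are refused. */
static int wdt_open(void)
{
    if (atomic_flag_test_and_set(&wdt_in_use))
        return -EBUSY;
    /* Exactly one caller reaches this point, so sleeping calls are fine. */
    return 0;
}

/* Releasing clears the flag so the device can be opened again. */
static void wdt_release(void)
{
    atomic_flag_clear(&wdt_in_use);
}
```

Because the flag already serialises openers, nothing that may sleep has to run under a spinlock.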
> 
> Signed-off-by: Doug Anderson 
> ---
> Changes in v2:
> - Don't switch to mutex; just don't use spinlock at all as per Dmitry

Reviewed-by: Dmitry Torokhov 

> 
>  drivers/watchdog/dw_wdt.c | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/watchdog/dw_wdt.c b/drivers/watchdog/dw_wdt.c
> index d0bb949..a284abd 100644
> --- a/drivers/watchdog/dw_wdt.c
> +++ b/drivers/watchdog/dw_wdt.c
> @@ -35,7 +35,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -61,7 +60,6 @@ MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once 
> started "
>  #define WDT_TIMEOUT  (HZ / 2)
>  
>  static struct {
> - spinlock_t  lock;
>   void __iomem*regs;
>   struct clk  *clk;
>   unsigned long   in_use;
> @@ -177,7 +175,6 @@ static int dw_wdt_open(struct inode *inode, struct file 
> *filp)
>   /* Make sure we don't get unloaded. */
>   __module_get(THIS_MODULE);
>  
> - spin_lock(&dw_wdt.lock);
>   if (!dw_wdt_is_enabled()) {
>   /*
>* The watchdog is not currently enabled. Set the timeout to
> @@ -190,8 +187,6 @@ static int dw_wdt_open(struct inode *inode, struct file 
> *filp)
>  
>   dw_wdt_set_next_heartbeat();
>  
> - spin_unlock(&dw_wdt.lock);
> -
>   return nonseekable_open(inode, filp);
>  }
>  
> @@ -348,8 +343,6 @@ static int dw_wdt_drv_probe(struct platform_device *pdev)
>   if (ret)
>   return ret;
>  
> - spin_lock_init(&dw_wdt.lock);
> -
>   ret = misc_register(&dw_wdt_miscdev);
>   if (ret)
>   goto out_disable_clk;
> -- 
> 2.2.0.rc0.207.ga3a616c
> 

-- 
Dmitry


RE: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

2015-05-07 Thread Oza (Pawandeep) Oza
> It seems odd to me to use BUG() for what you appear to be using it for..
> not that I know exactly what that it mind you, but when you said when
> some other gizmo in your box has a problem you crash the kernel, my head
> tilted to the side - surely there's a more controlled response possible
> than poking the big red self destruct button ;-)

Oza: 
We have to keep the red button as our last resort: if we don't press it, we
miss the point in time at which we can still go back and debug.
So this is by design.

Regards,
-Oza


-Original Message-
From: Mike Galbraith [mailto:umgwanakikb...@gmail.com] 
Sent: Friday, May 08, 2015 10:42 AM
To: Oza (Pawandeep) Oza
Cc: pawandeep oza; linux-kernel@vger.kernel.org; malayasen rout
Subject: Re: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

On Fri, 2015-05-08 at 04:16 +, Oza (Pawandeep) Oza wrote:
> So Mike, is this reason strong enough for you ?

Nope.  I think you did the right thing in removing your dependency on
jiffies reliability in a dying box.  You don't have to convince me of
anything though, CC timer subsystem maintainer, see what he says.

> I understand your point: solve the BUG, and I do tend to agree with you.
> 
> But by design and implementation, the BUG() is just a beginning of the end 
> for dying kernel.
> And what happens in between this 'the beginning' and 'the end' is not less 
> important. 
> (because say,  on our platform we want to get clean RAMDUMP to analyze what 
> happened, and for that we want to get clean reboot)

I don't see anybody else having any trouble getting crash dumps.  I
spent yet another long day just yesterday, rummaging through one.

> Also,
> If somebody's design is to legally Crash the kernel (e.g. where kernel is 
> actually not faulty).
> Then, I do expect that tick/timekeeping framework do its job as long as it 
> can do, and it should do, because kernel is not faulty.
> But in this case it doesn’t handover jiffies incrementing job sanely.

It seems odd to me to use BUG() for what you appear to be using it for..
not that I know exactly what that it mind you, but when you said when
some other gizmo in your box has a problem you crash the kernel, my head
tilted to the side - surely there's a more controlled response possible
than poking the big red self destruct button ;-)

-Mike



Re: [PATCH 00/12] [RFC] x86: Memory Protection Keys

2015-05-07 Thread Kevin Easton
On Thu, May 07, 2015 at 08:18:43PM +0100, One Thousand Gnomes wrote:
> > We could keep heap metadata as R/O and only make it R/W inside of
> > malloc() itself to catch corruption more quickly.
> 
> If you implement multiple malloc pools you can chop up lots of stuff.
> 
> In library land it isn't just stuff like malloc, you can use it as
> a debug weapon to protect library private data from naughty application
> code.

How could a library (or debugger, for that matter) arbitrate ownership
of the protection domains with the application?

One interesting use for it might be to provide an interface to
allocate memory and associate it with a lock that's supposed to be held
while accessing that memory.  The allocation function hashes the lock
address down to one of the 15 non-zero protection domains and applies 
that key to the memory, the lock function then adds RW access to the
appropriate protection domain and the unlock function removes it.
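
A rough user-space model of that interface, for illustration only: the names are hypothetical, the hash is an arbitrary choice, and a plain array of hold counts stands in for the per-thread rights register. Counts are used instead of single bits so that two locks that hash to the same domain do not revoke each other's access.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_DOMAINS 16   /* domain 0 is the default; 1..15 are usable */

/* Stand-in for the per-thread rights register: one hold count per domain. */
struct domain_rights {
    unsigned int held[NR_DOMAINS];
};

/* Hash a lock address down to one of the 15 non-zero domains. */
static int lock_to_domain(const void *lock)
{
    uintptr_t a = (uintptr_t)lock;

    a >>= 4;        /* drop low bits that are mostly alignment */
    a ^= a >> 16;   /* fold higher bits in */
    return (int)(a % (NR_DOMAINS - 1)) + 1;
}

/* Taking the lock grants access to the domain its memory was tagged with. */
static void rights_on_lock(struct domain_rights *r, const void *lock)
{
    r->held[lock_to_domain(lock)]++;
}

/* Dropping the lock removes that access again once no holder remains. */
static void rights_on_unlock(struct domain_rights *r, const void *lock)
{
    int d = lock_to_domain(lock);

    if (r->held[d] > 0)
        r->held[d]--;
}

static bool rights_allow(const struct domain_rights *r, int domain)
{
    return domain == 0 || r->held[domain] > 0;
}
```

With only 15 usable domains, unrelated locks will collide, so this catches some unlocked accesses rather than all of them.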

- Kevin


Re: [PATCH 1/2] pinctrl: move strict option to pinmux_ops

2015-05-07 Thread Sonic Zhang
Hi Walleij,

Acked-by: Sonic Zhang 


Sonic

On Thu, May 7, 2015 at 5:53 PM, Linus Walleij  wrote:
> While the pinmux_ops are ideally just a vtable for pin mux
> calls, the "strict" setting belongs so intuitively with the
> pin multiplexing that we should move it here anyway. Putting
> it in the top pinctrl_desc makes no sense.
>
> Cc: Sonic Zhang 
> Signed-off-by: Linus Walleij 
> ---
>  Documentation/pinctrl.txt   | 2 +-
>  drivers/pinctrl/pinctrl-adi2.c  | 2 +-
>  drivers/pinctrl/pinmux.c| 4 ++--
>  include/linux/pinctrl/pinctrl.h | 3 ---
>  include/linux/pinctrl/pinmux.h  | 4 
>  5 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/pinctrl.txt b/Documentation/pinctrl.txt
> index d6b2bed94c43..4976389e432d 100644
> --- a/Documentation/pinctrl.txt
> +++ b/Documentation/pinctrl.txt
> @@ -73,7 +73,6 @@ static struct pinctrl_desc foo_desc = {
> .pins = foo_pins,
> .npins = ARRAY_SIZE(foo_pins),
> .owner = THIS_MODULE,
> -   .strict = true,
>  };
>
>  int __init foo_probe(void)
> @@ -715,6 +714,7 @@ static struct pinmux_ops foo_pmxops = {
> .get_function_name = foo_get_fname,
> .get_function_groups = foo_get_groups,
> .set_mux = foo_set_mux,
> +   .strict = true,
>  };
>
>  /* Pinmux operations are handled by some pin controller */
> diff --git a/drivers/pinctrl/pinctrl-adi2.c b/drivers/pinctrl/pinctrl-adi2.c
> index fbd492668da1..49df9037b41e 100644
> --- a/drivers/pinctrl/pinctrl-adi2.c
> +++ b/drivers/pinctrl/pinctrl-adi2.c
> @@ -703,6 +703,7 @@ static struct pinmux_ops adi_pinmux_ops = {
> .get_function_name = adi_pinmux_get_func_name,
> .get_function_groups = adi_pinmux_get_groups,
> .gpio_request_enable = adi_pinmux_request_gpio,
> +   .strict = true,
>  };
>
>
> @@ -710,7 +711,6 @@ static struct pinctrl_desc adi_pinmux_desc = {
> .name = DRIVER_NAME,
> .pctlops = &adi_pctrl_ops,
> .pmxops = &adi_pinmux_ops,
> -   .strict = true,
> .owner = THIS_MODULE,
>  };
>
> diff --git a/drivers/pinctrl/pinmux.c b/drivers/pinctrl/pinmux.c
> index 2546fa783464..c58c168b06c2 100644
> --- a/drivers/pinctrl/pinmux.c
> +++ b/drivers/pinctrl/pinmux.c
> @@ -107,7 +107,7 @@ static int pin_request(struct pinctrl_dev *pctldev,
> desc->name, desc->gpio_owner, owner);
> goto out;
> }
> -   if (pctldev->desc->strict && desc->mux_usecount &&
> +   if (ops->strict && desc->mux_usecount &&
> strcmp(desc->mux_owner, owner)) {
> dev_err(pctldev->dev,
> "pin %s already requested by %s; cannot claim 
> for %s\n",
> @@ -123,7 +123,7 @@ static int pin_request(struct pinctrl_dev *pctldev,
> desc->name, desc->mux_owner, owner);
> goto out;
> }
> -   if (pctldev->desc->strict && desc->gpio_owner) {
> +   if (ops->strict && desc->gpio_owner) {
> dev_err(pctldev->dev,
> "pin %s already requested by %s; cannot claim 
> for %s\n",
> desc->name, desc->gpio_owner, owner);
> diff --git a/include/linux/pinctrl/pinctrl.h b/include/linux/pinctrl/pinctrl.h
> index fc6b0348c375..66e4697516de 100644
> --- a/include/linux/pinctrl/pinctrl.h
> +++ b/include/linux/pinctrl/pinctrl.h
> @@ -114,8 +114,6 @@ struct pinctrl_ops {
>   * of the pins field above
>   * @pctlops: pin control operation vtable, to support global concepts like
>   * grouping of pins, this is optional.
> - * @strict: check both gpio_owner and mux_owner strictly before approving
> -   the pin request
>   * @pmxops: pinmux operations vtable, if you support pinmuxing in your driver
>   * @confops: pin config operations vtable, if you support pin configuration 
> in
>   * your driver
> @@ -134,7 +132,6 @@ struct pinctrl_desc {
> const struct pinctrl_ops *pctlops;
> const struct pinmux_ops *pmxops;
> const struct pinconf_ops *confops;
> -   bool strict;
> struct module *owner;
>  #ifdef CONFIG_GENERIC_PINCONF
> unsigned int num_custom_params;
> diff --git a/include/linux/pinctrl/pinmux.h b/include/linux/pinctrl/pinmux.h
> index 511bda9ed4bf..d3740fa7073f 100644
> --- a/include/linux/pinctrl/pinmux.h
> +++ b/include/linux/pinctrl/pinmux.h
> @@ -56,6 +56,9 @@ struct pinctrl_dev;
>   * depending on whether the GPIO is configured as input or output,
>   * a direction selector function may be implemented as a backing
>   * to the GPIO controllers that need pin muxing.
> + * @strict: do not allow simultaneous use of the same pin for GPIO and 
> another
> + * function. Check both gpio_owner and mux_owner strictly before 
> approving
> + * the pin request.
>   */
>  struct pinmux_ops {
> int (*request) (
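
The effect of the relocated flag on the two checks touched in pin_request() can be modelled in userspace. This is only a sketch: the struct and function names below (pinmux_ops_model, pin_desc_model, gpio_request_check, mux_request_check) are hypothetical stand-ins rather than kernel API, and the -1 return value is a placeholder for whatever error code the kernel path uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct pinmux_ops_model {
	bool strict;                /* the flag this patch moves into pinmux_ops */
};

struct pin_desc_model {
	const char *mux_owner;      /* function currently muxed onto the pin, or NULL */
	const char *gpio_owner;     /* GPIO consumer holding the pin, or NULL */
	unsigned int mux_usecount;  /* number of mux users of the pin */
};

/* GPIO request: refused in strict mode if another owner has the pin muxed. */
static int gpio_request_check(const struct pinmux_ops_model *ops,
			      const struct pin_desc_model *desc,
			      const char *owner)
{
	assert(ops && desc && owner);
	if (ops->strict && desc->mux_usecount &&
	    strcmp(desc->mux_owner, owner))
		return -1;
	return 0;
}

/* Mux request: refused in strict mode if the pin is already held as a GPIO. */
static int mux_request_check(const struct pinmux_ops_model *ops,
			     const struct pin_desc_model *desc)
{
	assert(ops && desc);
	if (ops->strict && desc->gpio_owner)
		return -1;
	return 0;
}
```

With strict unset both checks always pass, which matches the non-strict behaviour of the code before and after the patch; only the place the flag is read from changes.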

[PATCH] iscsi_ibft: filter null v4-mapped v6 addresses

2015-05-07 Thread Chris Leech
I've had reports of UEFI platforms failing iSCSI boot in various
configurations, that ended up being caused by network initialization
scripts getting tripped up by unexpected null addresses (0.0.0.0) being
reported for gateways, dhcp servers, and dns servers.

The tianocore EDK2 iSCSI driver generates an iBFT table that always uses
IPv4-mapped IPv6 addresses for the NIC structure fields.  This results
in values that are "not present or not specified" being reported as
::ffff:0.0.0.0 rather than all zeros as specified.

The iscsi_ibft module filters unspecified fields from the iBFT from
sysfs, preventing userspace from using invalid values and making it easy
to check for the presence of a value.  This currently fails in regard to
these mapped null addresses.

In order to remain consistent with how the iBFT information is exposed,
we should accommodate the behavior of the tianocore iSCSI driver as it's
already in the wild in a large number of servers.

Tested under qemu using an OVMF build of tianocore EDK2.

Signed-off-by: Chris Leech 
---
 drivers/firmware/iscsi_ibft.c | 36 +---
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/firmware/iscsi_ibft.c b/drivers/firmware/iscsi_ibft.c
index 071c2c9..7279123 100644
--- a/drivers/firmware/iscsi_ibft.c
+++ b/drivers/firmware/iscsi_ibft.c
@@ -186,8 +186,20 @@ struct ibft_kobject {
 
 static struct iscsi_boot_kset *boot_kset;
 
+/* fully null address */
 static const char nulls[16];
 
+/* IPv4-mapped IPv6 ::ffff:0.0.0.0 */
+static const char mapped_nulls[16] = { 0x00, 0x00, 0x00, 0x00,
+   0x00, 0x00, 0x00, 0x00,
+   0x00, 0x00, 0xff, 0xff,
+   0x00, 0x00, 0x00, 0x00 };
+
+static int address_not_null(u8 *ip)
+{
+   return (memcmp(ip, nulls, 16) && memcmp(ip, mapped_nulls, 16));
+}
+
 /*
  * Helper functions to parse data properly.
  */
@@ -445,7 +457,7 @@ static umode_t ibft_check_nic_for(void *data, int type)
rc = S_IRUGO;
break;
case ISCSI_BOOT_ETH_IP_ADDR:
-   if (memcmp(nic->ip_addr, nulls, sizeof(nic->ip_addr)))
+   if (address_not_null(nic->ip_addr))
rc = S_IRUGO;
break;
case ISCSI_BOOT_ETH_SUBNET_MASK:
@@ -456,21 +468,19 @@ static umode_t ibft_check_nic_for(void *data, int type)
rc = S_IRUGO;
break;
case ISCSI_BOOT_ETH_GATEWAY:
-   if (memcmp(nic->gateway, nulls, sizeof(nic->gateway)))
+   if (address_not_null(nic->gateway))
rc = S_IRUGO;
break;
case ISCSI_BOOT_ETH_PRIMARY_DNS:
-   if (memcmp(nic->primary_dns, nulls,
-  sizeof(nic->primary_dns)))
+   if (address_not_null(nic->primary_dns))
rc = S_IRUGO;
break;
case ISCSI_BOOT_ETH_SECONDARY_DNS:
-   if (memcmp(nic->secondary_dns, nulls,
-  sizeof(nic->secondary_dns)))
+   if (address_not_null(nic->secondary_dns))
rc = S_IRUGO;
break;
case ISCSI_BOOT_ETH_DHCP:
-   if (memcmp(nic->dhcp, nulls, sizeof(nic->dhcp)))
+   if (address_not_null(nic->dhcp))
rc = S_IRUGO;
break;
case ISCSI_BOOT_ETH_VLAN:
@@ -536,23 +546,19 @@ static umode_t __init ibft_check_initiator_for(void 
*data, int type)
rc = S_IRUGO;
break;
case ISCSI_BOOT_INI_ISNS_SERVER:
-   if (memcmp(init->isns_server, nulls,
-  sizeof(init->isns_server)))
+   if (address_not_null(init->isns_server))
rc = S_IRUGO;
break;
case ISCSI_BOOT_INI_SLP_SERVER:
-   if (memcmp(init->slp_server, nulls,
-  sizeof(init->slp_server)))
+   if (address_not_null(init->slp_server))
rc = S_IRUGO;
break;
case ISCSI_BOOT_INI_PRI_RADIUS_SERVER:
-   if (memcmp(init->pri_radius_server, nulls,
-  sizeof(init->pri_radius_server)))
+   if (address_not_null(init->pri_radius_server))
rc = S_IRUGO;
break;
case ISCSI_BOOT_INI_SEC_RADIUS_SERVER:
-   if (memcmp(init->sec_radius_server, nulls,
-  sizeof(init->sec_radius_server)))
+   if (address_not_null(init->sec_radius_server))
rc = S_IRUGO;
break;
case ISCSI_BOOT_INI_INITIATOR_NAME:
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info
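
The null-address filter from the iscsi_ibft patch above can be tried outside the kernel. The sketch below assumes plain C with uint8_t in place of the kernel's u8 and a hypothetical ADDR_LEN constant for the 16-byte field width; the two byte patterns are the ones given in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 16	/* assumption: every iBFT address field is 16 bytes */

/* All-zero address: the "not present or not specified" value. */
static const uint8_t nulls[ADDR_LEN];

/* IPv4-mapped IPv6 form of 0.0.0.0, i.e. ::ffff:0.0.0.0 */
static const uint8_t mapped_nulls[ADDR_LEN] = {
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
};

/* True only if ip matches neither of the two null encodings. */
static bool address_not_null(const uint8_t *ip)
{
	assert(ip != NULL);
	return memcmp(ip, nulls, ADDR_LEN) != 0 &&
	       memcmp(ip, mapped_nulls, ADDR_LEN) != 0;
}
```

Both comparisons have to report a difference before a field is treated as present, which is why the patch combines the two memcmp() results with a logical AND.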

Re: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

2015-05-07 Thread Mike Galbraith
On Fri, 2015-05-08 at 04:16 +, Oza (Pawandeep) Oza wrote:
> So Mike, is this reason strong enough for you ?

Nope.  I think you did the right thing in removing your dependency on
jiffies reliability in a dying box.  You don't have to convince me of
anything though, CC timer subsystem maintainer, see what he says.

> I understand your point: solve the BUG, and I do tend to agree with you.
> 
> But by design and implementation, the BUG() is just a beginning of the end 
> for dying kernel.
> And what happens in between this 'the beginning' and 'the end' is not less 
> important. 
> (because say,  on our platform we want to get clean RAMDUMP to analyze what 
> happened, and for that we want to get clean reboot)

I don't see anybody else having any trouble getting crash dumps.  I
spent yet another long day just yesterday, rummaging through one.

> Also,
> If somebody's design is to legally Crash the kernel (e.g. where kernel is 
> actually not faulty).
> Then, I do expect that tick/timekeeping framework do its job as long as it 
> can do, and it should do, because kernel is not faulty.
> But in this case it doesn’t handover jiffies incrementing job sanely.

It seems odd to me to use BUG() for what you appear to be using it for..
not that I know exactly what that is, mind you, but when you said that when
some other gizmo in your box has a problem you crash the kernel, my head
tilted to the side - surely there's a more controlled response possible
than poking the big red self destruct button ;-)

-Mike



Re: [PATCH v3 0/6] powernv: cpufreq: Report frequency throttle by OCC

2015-05-07 Thread Viresh Kumar
On 4 May 2015 at 14:24, Shilpasri G Bhat  wrote:
> This patchset intends to add frequency throttle reporting mechanism
> to powernv-cpufreq driver when OCC throttles the frequency. OCC is an
> On-Chip-Controller which takes care of the power and thermal safety of
> the chip. The CPU frequency can be throttled during an OCC reset or
> when OCC tries to limit the max allowed frequency. The patchset will
> report such conditions so as to keep the user informed about reason
> for the drop in performance of workloads when frequency is throttled.
>
> Changes from v2:
> - Split into multiple patches
> - Semantic fixes
>
> Shilpasri G Bhat (6):
>   cpufreq: poowernv: Handle throttling due to Pmax capping at chip level
>   powerpc/powernv: Add definition of OPAL_MSG_OCC message type
>   cpufreq: powernv: Register for OCC related opal_message notification
>   cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLE
>   cpufreq: powernv: Report Psafe only if PMSR.psafe_mode_active bit is
> set
>   cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling
>
>  arch/powerpc/include/asm/opal-api.h |   8 ++
>  drivers/cpufreq/powernv-cpufreq.c   | 199 
> +---
>  2 files changed, 192 insertions(+), 15 deletions(-)

Acked-by: Viresh Kumar 


[PATCH v2] block: discard bdi_unregister() in favour of bdi_destroy()

2015-05-07 Thread NeilBrown


bdi_unregister() now contains very little functionality.

It contains a "WARN_ON" if bdi->dev is NULL.  This warning is of no
real consequence as bdi->dev isn't needed by anything else in the function,
and it triggers if
   blk_cleanup_queue() -> bdi_destroy()
is called before bdi_unregister, which happens since
  Commit: 6cd18e711dd8 ("block: destroy bdi before blockdev is unregistered.")

So this isn't wanted.

It also calls bdi_set_min_ratio().  This needs to be called after
writes through the bdi have all been flushed, and before the bdi is destroyed.
Calling it early is better than calling it late as it frees up a global
resource.

Calling it immediately after bdi_wb_shutdown() in bdi_destroy()
perfectly fits these requirements.

So bdi_unregister() can be discarded with the important content moved to
bdi_destroy(), as can the
  writeback_bdi_unregister
event which is already not used.

Reported-by: Mike Snitzer 
Cc: sta...@vger.kernel.org (v4.0)
Fixes: c4db59d31e39 ("fs: don't reassign dirty inodes to default_backing_dev_info")
Fixes: 6cd18e711dd8 ("block: destroy bdi before blockdev is unregistered.")
Acked-by: Peter Zijlstra (Intel) 
Acked-by: Dan Williams 
Tested-by: Nicholas Moulin 
Signed-off-by: NeilBrown 

---

hi Jens,
 this is a revised version of the comment - no code change - to make it
 suitable to add to your linux-block tree.

Thanks,
NeilBrown


diff --git a/block/genhd.c b/block/genhd.c
index e351fc521053..1d4435478e8a 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -657,7 +657,6 @@ void del_gendisk(struct gendisk *disk)
disk->flags &= ~GENHD_FL_UP;
 
sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
-   bdi_unregister(&disk->queue->backing_dev_info);
blk_unregister_queue(disk);
blk_unregister_region(disk_devt(disk), disk->minors);
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index aff923ae8c4b..d87d8eced064 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -116,7 +116,6 @@ __printf(3, 4)
 int bdi_register(struct backing_dev_info *bdi, struct device *parent,
const char *fmt, ...);
 int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev);
-void bdi_unregister(struct backing_dev_info *bdi);
 int __must_check bdi_setup_and_register(struct backing_dev_info *, char *);
 void bdi_start_writeback(struct backing_dev_info *bdi, long nr_pages,
enum wb_reason reason);
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 880dd7437172..c178d13d6f4c 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -250,7 +250,6 @@ DEFINE_EVENT(writeback_class, name, \
 DEFINE_WRITEBACK_EVENT(writeback_nowork);
 DEFINE_WRITEBACK_EVENT(writeback_wake_background);
 DEFINE_WRITEBACK_EVENT(writeback_bdi_register);
-DEFINE_WRITEBACK_EVENT(writeback_bdi_unregister);
 
 DECLARE_EVENT_CLASS(wbc_class,
TP_PROTO(struct writeback_control *wbc, struct backing_dev_info *bdi),
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 6dc4580df2af..000e7b3b9896 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -359,23 +359,6 @@ static void bdi_wb_shutdown(struct backing_dev_info *bdi)
flush_delayed_work(&bdi->wb.dwork);
 }
 
-/*
- * Called when the device behind @bdi has been removed or ejected.
- *
- * We can't really do much here except for reducing the dirty ratio at
- * the moment.  In the future we should be able to set a flag so that
- * the filesystem can handle errors at mark_inode_dirty time instead
- * of only at writeback time.
- */
-void bdi_unregister(struct backing_dev_info *bdi)
-{
-   if (WARN_ON_ONCE(!bdi->dev))
-   return;
-
-   bdi_set_min_ratio(bdi, 0);
-}
-EXPORT_SYMBOL(bdi_unregister);
-
 static void bdi_wb_init(struct bdi_writeback *wb, struct backing_dev_info *bdi)
 {
memset(wb, 0, sizeof(*wb));
@@ -443,6 +426,7 @@ void bdi_destroy(struct backing_dev_info *bdi)
int i;
 
bdi_wb_shutdown(bdi);
+   bdi_set_min_ratio(bdi, 0);
 
WARN_ON(!list_empty(&bdi->work_list));
WARN_ON(delayed_work_pending(&bdi->wb.dwork));




Laptop will not resume from suspend

2015-05-07 Thread Warren Clemmons
Dear Kernel Maintainer,

Submitting the following information for kernel bug review.

[1.] Lenovo SL510 will not resume from suspend
[2.] When using any of the following three methods to suspend the laptop, it
will not resume from suspend. The suspend light flashes rapidly instead of slowly.
1. systemctl suspend
2. closing the laptop lid
3. selecting suspend from the desktop gui
[3.]
[4.]Linux version 4.1.0-040100rc2-generic (kernel@gomeisa) (gcc version
4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201505032335 SMP Mon May 4 03:36:35
UTC 2015
[5.]No Oops message
[6.]
[7.]Description:Ubuntu 15.04
Release:15.04
[7.1]Linux 7 4.1.0-040100rc2-generic #201505032335 SMP Mon May 4 03:36:35
UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Gnu C  ver_linux:
binutils               2.25
util-linux             2.25.2
mount                  debug
module-init-tools      18
e2fsprogs              1.42.12
pcmciautils            018
PPP                    2.4.6
Linux C Library        2.21
Dynamic linker (ldd)   2.21
Procps                 3.3.9
Net-tools              1.60
Kbd                    1.15.5
Sh-utils               8.23
wireless-tools         30
Modules Loaded snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec
snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi
kvm_intel arc4 kvm iwldvm mac80211 snd_seq i915 joydev uvcvideo
videobuf2_vmalloc videobuf2_memops serio_raw iwlwifi videobuf2_core
v4l2_common videodev snd_seq_device media thinkpad_acpi nvram snd_timer
drm_kms_helper lpc_ich jmb38x_ms cfg80211 snd drm memstick shpchp
i2c_algo_bit video wmi soundcore 8250_fintek ip6t_REJECT nf_reject_ipv6
nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT
mac_hid nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp
xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter
coretemp ip6_tables parport_pc nf_conntrack_netbios_ns ppdev
nf_conntrack_broadcast nf_nat_ftp nf_nat lp nf_conntrack_ftp nf_conntrack
parport iptable_filter ip_tables x_tables autofs4 psmouse ahci libahci
r8169 mii sdhci_pci sdhci
[7.2]processor: 0
vendor_id: GenuineIntel
cpu family: 6
model: 23
model name: Intel(R) Core(TM)2 Duo CPU T6670  @ 2.20GHz
stepping: 10
microcode: 0xa0b
cpu MHz: 1200.000
cache size: 2048 KB
physical id: 0
siblings: 2
core id: 0
cpu cores: 2
apicid: 0
initial apicid: 0
fpu: yes
fpu_exception: yes
cpuid level: 13
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm ida
dtherm tpr_shadow vnmi flexpriority
bugs:
bogomips: 4389.17
clflush size: 64
cache_alignment: 64
address sizes: 36 bits physical, 48 bits virtual
power management:

processor: 1
vendor_id: GenuineIntel
cpu family: 6
model: 23
model name: Intel(R) Core(TM)2 Duo CPU T6670  @ 2.20GHz
stepping: 10
microcode: 0xa0b
cpu MHz: 1200.000
cache size: 2048 KB
physical id: 0
siblings: 2
core id: 1
cpu cores: 2
apicid: 1
initial apicid: 1
fpu: yes
fpu_exception: yes
cpuid level: 13
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm ida
dtherm tpr_shadow vnmi flexpriority
bugs:
bogomips: 4389.17
clflush size: 64
cache_alignment: 64
address sizes: 36 bits physical, 48 bits virtual
power management:
[7.3]snd_hda_codec_hdmi 53248 1 - Live 0x
snd_hda_codec_realtek 86016 1 - Live 0x
snd_hda_codec_generic 77824 1 snd_hda_codec_realtek, Live 0x
snd_hda_intel 32768 3 - Live 0x
snd_hda_controller 36864 1 snd_hda_intel, Live 0x
snd_hda_codec 122880 5
snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_codec_generic,snd_hda_intel,snd_hda_controller,
Live 0x
snd_hda_core 36864 5
snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_codec_generic,snd_hda_controller,snd_hda_codec,
Live 0x
snd_hwdep 16384 1 snd_hda_codec, Live 0x
snd_pcm 114688 4
snd_hda_codec_hdmi,snd_hda_intel,snd_hda_controller,snd_hda_codec, Live
0x
snd_seq_midi 16384 0 - Live 0x
snd_seq_midi_event 16384 1 snd_seq_midi, Live 0x
snd_rawmidi 32768 1 snd_seq_midi, Live 0x
kvm_intel 159744 0 - Live 0x
arc4 16384 2 - Live 0x
kvm 507904 1 kvm_inte

Re: [PATCH v2 1/2] watchdog: dw_wdt: No need for a spinlock

2015-05-07 Thread Jisheng Zhang
On Thu, 7 May 2015 21:27:44 -0700
Doug Anderson  wrote:

> Right now the dw_wdt uses a spinlock to protect dw_wdt_open().  The
> problem is that while holding the spinlock we call:
> -> dw_wdt_set_top()
>-> dw_wdt_top_in_seconds()
>   -> clk_get_rate()
>  -> clk_prepare_lock()
> -> mutex_lock()
> 
> Locking a mutex while holding a spinlock is not allowed and leads to
> warnings like "BUG: spinlock wrong CPU on CPU#1", among other
> problems.
> 
> There's no reason to use a spinlock.  Only dw_wdt_open() was protected
> and the test_and_set_bit() at the start of that function protects us
> anyway.
> 
> Signed-off-by: Doug Anderson 
> ---
> Changes in v2:
> - Don't switch to mutex; just don't use spinlock at all as per Dmitry
> 
>  drivers/watchdog/dw_wdt.c | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/watchdog/dw_wdt.c b/drivers/watchdog/dw_wdt.c
> index d0bb949..a284abd 100644
> --- a/drivers/watchdog/dw_wdt.c
> +++ b/drivers/watchdog/dw_wdt.c
> @@ -35,7 +35,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -61,7 +60,6 @@ MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once 
> started "
>  #define WDT_TIMEOUT  (HZ / 2)
>  
>  static struct {
> - spinlock_t  lock;
>   void __iomem*regs;
>   struct clk  *clk;
>   unsigned long   in_use;
> @@ -177,7 +175,6 @@ static int dw_wdt_open(struct inode *inode, struct file 
> *filp)
>   /* Make sure we don't get unloaded. */
>   __module_get(THIS_MODULE);
>  
> - spin_lock(&dw_wdt.lock);
>   if (!dw_wdt_is_enabled()) {
>   /*
>* The watchdog is not currently enabled. Set the timeout to
> @@ -190,8 +187,6 @@ static int dw_wdt_open(struct inode *inode, struct file 
> *filp)
>  
>   dw_wdt_set_next_heartbeat();
>  
> - spin_unlock(&dw_wdt.lock);
> -
>   return nonseekable_open(inode, filp);
>  }
>  
> @@ -348,8 +343,6 @@ static int dw_wdt_drv_probe(struct platform_device *pdev)
>   if (ret)
>   return ret;
>  
> - spin_lock_init(&dw_wdt.lock);
> -
>   ret = misc_register(&dw_wdt_miscdev);
>   if (ret)
>   goto out_disable_clk;

Tested-by: Jisheng Zhang 
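
The reasoning in the commit message, that the atomic test-and-set on in_use already serialises openers, can be illustrated with C11 atomics. This is a userspace sketch under stated assumptions: wdt_open_model() and wdt_release_model() are hypothetical names, atomic_fetch_or() stands in for the kernel's test_and_set_bit(), and the -EBUSY return for a second opener is assumed rather than taken from the quoted diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Bit 0 set means the device is already open (models dw_wdt.in_use). */
static atomic_ulong in_use;

/* Returns 0 for the first opener and -EBUSY for anyone arriving later. */
static int wdt_open_model(void)
{
	/* atomic_fetch_or() returns the previous value, like test_and_set_bit(). */
	if (atomic_fetch_or(&in_use, 1UL) & 1UL)
		return -EBUSY;

	/* Only one caller can get here until release, so no further lock is needed. */
	return 0;
}

static void wdt_release_model(void)
{
	/* Releasing a device that was never opened would be a caller bug. */
	assert(atomic_load(&in_use) & 1UL);
	atomic_fetch_and(&in_use, ~1UL);
}
```

Because the losing opener returns before touching any shared state, the section after the check has a single owner and may call functions that sleep, such as the clk_get_rate() path listed in the commit message.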


RE: [PATCH 02/10] drivers:host:fsl: Use ehci_overrides structure for EHCI drv

2015-05-07 Thread Ramneek Mehresh


> -Original Message-
> From: Alan Stern [mailto:st...@rowland.harvard.edu]
> Sent: Thursday, May 07, 2015 8:16 PM
> To: Mehresh Ramneek-B31383
> Cc: linux-kernel@vger.kernel.org; ba...@ti.com; linux-...@vger.kernel.org;
> gre...@linuxfoundation.org
> Subject: Re: [PATCH 02/10] drivers:host:fsl: Use ehci_overrides structure for
> EHCI drv
> 
> On Thu, 7 May 2015, Ramneek Mehresh wrote:
> 
> > Make use of ehci_driver_overrides structure for ehci-fsl driver
> >
> > Signed-off-by: Ramneek Mehresh 
> 
> You need to change a lot more than this.  See commit a76dd463c58e (USB:
> EHCI: make ehci-orion a separate driver) as an example of what is needed.  In
> the end, ehci-fsl.ko should be a new driver module, not compiled into ehci-
> hcd.ko.
> 
I can definitely make this change, but this patch set is about an OTG
functionality fix for all FSL QorIQ SoCs. The changes you are asking for are
for the FSL host driver. For that I can float a separate patch/patch set.
Hence, I would request you to please accept the patch series in the context
of the OTG functionality fix.
> > +   ehci_init_driver(driver, &ehci_fsl_overrides);
> > +   driver->product_desc = "Freescale On-Chip EHCI Host Controller";
> > +   driver->start = ehci_run;
> > +   driver->start_port_reset = ehci_start_port_reset;
> 
> Why do you want to override driver->start?  The default value for this field 
> is
> already set to ehci_run.
> 
Will correct this
> Alan Stern



Re: [PATCH 00/12] [RFC] x86: Memory Protection Keys

2015-05-07 Thread Ingo Molnar

* One Thousand Gnomes  wrote:

> On Thu, 7 May 2015 21:26:20 +0200
> Ingo Molnar  wrote:
> 
> > 
> > * One Thousand Gnomes  wrote:
> > 
> > > > We could keep heap metadata as R/O and only make it R/W inside of 
> > > > malloc() itself to catch corruption more quickly.
> > > 
> > > If you implement multiple malloc pools you can chop up lots of 
> > > stuff.
> > 
> > I'd say that a 64-bit address space is large enough to hide 
> > buffers in from accidental corruption, without any runtime page 
> > protection flipping overhead?
> 
> I'd say no. [...]

So putting your buffers anywhere in a range 18446744073709551616 bytes
large (well, 281474976710656 bytes with current CPUs) isn't enough to
protect from stray writes? Could you outline the situations where that
isn't enough?
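
The two figures quoted above are 2^64 and 2^48 bytes. A small sketch to check the arithmetic, assuming a hypothetical helper address_space_bytes() and the 48-bit virtual address width mentioned for current CPUs:

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct byte addresses for a virtual address width below 64 bits. */
static uint64_t address_space_bytes(unsigned int bits)
{
	assert(bits < 64);	/* 2^64 itself does not fit in a uint64_t */
	return UINT64_C(1) << bits;
}
```

For the full 64-bit case the count is one more than UINT64_MAX, which is why the helper stops at 63 bits.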

> [...] And from actual real world demand for PK the answer is also 
> no. It's already a problem with very large data sets. [...]

So that's why I asked: what real world demand is there? Is it 
described/documented/reported anywhere public?

> [...] Worse still in many cases its a problem that nobody is 
> actually measuring or doing much about (because mprotect on many 
> gigabytes of data is expensive).

It's not necessarily expensive if the remote TLB shootdown guarantee 
is weakened (i.e. we could have an mprotect() flag that says "I don't 
need remote TLB shootdowns") - and nobody has asked for that yet 
AFAICS.

With 2MB or 1GB pages it would be even cheaper.

Also, the way databases usually protect themselves is by making a 
robust central engine and communicating with (complex) DB users via 
memory sharing and IPC.

> > I think libraries are happy enough to work without bugs - apps 
> > digging around in library data are in a "you keep all the broken 
> > pieces" situation, why would a library want to slow down every 
> > good citizen down with extra protection flipping/unflipping 
> > accesses?
> 
> For debugging, when the library maintained data is sensitive or 
> something you don't want corrupted, or because the user puts security 
> first. Protection keys are an awful lot faster than mprotect.

There's no flushing of TLBs involved even locally, a PK 'flip' is just 
a handful of cycles no matter whether protections are narrowed or 
broadened, right?

> [...] You've got no synchronization and shootdowns to do just a CPU 
> register to load to indicate which mask of keys you are happy with. 
> That really changes what it is useful for, because it's cheap. It 
> means you can happily do stuff like
> 
>   while(data_blocks) {
>   allow_key_and_source_access();
>   do_crypto_func();
>   revoke_key_and_source_access();
>   do_network_io();  /* Can't accidentally leak keys or
>   input */
>   }

That looks useful if it's fast enough. I suspect a similar benefit 
could be gained if we allowed individually randomized anonymous 
mmap()s: the key wouldn't just be part of the heap, but isolated and 
randomized somewhere in a 64-bit (48-bit) address space.
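
For comparison, the allow/revoke pattern in the quoted loop can be approximated with plain mprotect(). This is a minimal sketch, assuming Linux and a hypothetical guarded_page helper that is not part of any proposed interface; it shows the baseline the thread is comparing protection keys against, not protection keys themselves.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: one page of sensitive data that is normally read-only. */
struct guarded_page {
	void *addr;
	size_t len;
};

/* Map one anonymous page with read-only protection. */
static int guarded_page_init(struct guarded_page *g)
{
	g->len = (size_t)sysconf(_SC_PAGESIZE);
	g->addr = mmap(NULL, g->len, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return g->addr == MAP_FAILED ? -1 : 0;
}

/* Open the write window, copy the data in, then close the window again. */
static int guarded_page_store(struct guarded_page *g, const void *src, size_t n)
{
	assert(n <= g->len);
	if (mprotect(g->addr, g->len, PROT_READ | PROT_WRITE))
		return -1;
	memcpy(g->addr, src, n);
	return mprotect(g->addr, g->len, PROT_READ);
}
```

Every store here costs two system calls plus the page-table and TLB work behind them, which is the overhead the thread contrasts with loading a single CPU register.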

Thanks,

Ingo


Re: [PATCH v2 1/2] watchdog: dw_wdt: No need for a spinlock

2015-05-07 Thread Guenter Roeck

On 05/07/2015 09:27 PM, Doug Anderson wrote:

> Right now the dw_wdt uses a spinlock to protect dw_wdt_open().  The
> problem is that while holding the spinlock we call:
> -> dw_wdt_set_top()
>    -> dw_wdt_top_in_seconds()
>       -> clk_get_rate()
>          -> clk_prepare_lock()
>             -> mutex_lock()
>
> Locking a mutex while holding a spinlock is not allowed and leads to
> warnings like "BUG: spinlock wrong CPU on CPU#1", among other
> problems.
>
> There's no reason to use a spinlock.  Only dw_wdt_open() was protected
> and the test_and_set_bit() at the start of that function protects us
> anyway.
>
> Signed-off-by: Doug Anderson 


Reviewed-by: Guenter Roeck 



Re: [PATCH 3/3] tracing/mm: Don't trace mm_page_pcpu_drain on offline cpus

2015-05-07 Thread Shreyas B Prabhu


On Wednesday 29 April 2015 10:38 PM, Steven Rostedt wrote:
> On Wed, 29 Apr 2015 21:28:38 +0530
> Shreyas B Prabhu  wrote:
> 
>>
>>
>> On Wednesday 29 April 2015 08:48 PM, Steven Rostedt wrote:
>>> On Wed, 29 Apr 2015 20:19:28 +0530
>>> Shreyas B Prabhu  wrote:
>>>
>>>> IIUC there is no existing macro which can both add a condition and
>>>> override printk format, hence the fall back to TRACE_EVENT_CONDITION.
>>>
>>> Hmm, want me to send you a patch that changes that?
>>>
>> I am not sure if its worth the effort now. It doesn't look like any
>> other trace point apart from the above use case will benefit from it.
>> Only smbus_write and smbus_reply seem to come close. But even they need
>> separate TP_fast_assign.
> 
> It shouldn't be a problem to implement. But I'm currently cleaning up
> those files, and any changes will cause nasty conflicts.
> 
> Lets do this. Push the current changes as is, and when I get around to
> adding a DEFINE_EVENT_PRINT_CONDITION(), we can modify that code to use
> it.
> 

Hi Steve,
Do you have any other suggestions for this patchset or will you take
them as is?

Thanks,
Shreyas



Re: [PATCH v2 1/5] Documentation: dmaengine: pxa-dma design

2015-05-07 Thread Vinod Koul
On Sat, Apr 11, 2015 at 09:40:32PM +0200, Robert Jarzmik wrote:
> Document the new design of the pxa dma driver.
> 
> Signed-off-by: Robert Jarzmik 
> ---
>  Documentation/dmaengine/pxa_dma.txt | 157 
> 
>  1 file changed, 157 insertions(+)
>  create mode 100644 Documentation/dmaengine/pxa_dma.txt
> 
> diff --git a/Documentation/dmaengine/pxa_dma.txt 
> b/Documentation/dmaengine/pxa_dma.txt
> new file mode 100644
> index 000..63db9fe
> --- /dev/null
> +++ b/Documentation/dmaengine/pxa_dma.txt
> @@ -0,0 +1,157 @@
> +PXA/MMP - DMA Slave controller
> +==
> +
> +Constraints
> +---
> +  a) Transfers hot queuing
> + A driver submitting a transfer and issuing it should be granted the 
> transfer
> + is queued even on a running DMA channel.
This is a bit confusing, esp. the latter part. Do you mean "A driver
submitting a transfer and issuing it should be granted the transfer queue
even on a running DMA channel"?

> + This implies that the queuing doesn't wait for the previous transfer 
> end,
> + and that the descriptor chaining is not only done in the irq/tasklet 
> code
> + triggered by the end of the transfer.
How is it different from the current dmaengine semantics, where
issue_pending() is invoked when the current transfer has finished? Here, if
you have to do descriptor chaining, so be it.
> +
> +  b) All transfers having asked for confirmation should be signaled
> + Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback 
> call.
> + This implies that even if an irq/tasklet is triggered by end of tx1, but
> + at the time of irq/dma tx2 is already finished, tx1->complete() and
> + tx2->complete() should be called.
> +
> +  c) Channel residue calculation
> + A channel should be able to report how much advanced is a transfer. The
in a
> + granularity is still descriptor based.
This is not pxa specific

> +
> +  d) Channel running state
> + A driver should be able to query if a channel is running or not. For the
> + multimedia case, such as video capture, if a transfer is submitted and 
> then
> + a check of the DMA channel reports a "stopped channel", the transfer 
> should
> + not be issued until the next "start of frame interrupt", hence the need 
> to
> + know if a channel is in running or stopped state.
How do you query that?

> +
> +  e) Bandwidth guarantee
> + The PXA architecture has 4 levels of DMAs priorities : high, normal, 
> low.
> + The high prorities get twice as much bandwidth as the normal, which get 
> twice
> + as much as the low priorities.
> + A driver should be able to request a priority, especially the real-time
> + ones such as pxa_camera with (big) throughputs.
and how..?
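
The quoted bandwidth rule, each priority getting twice the bandwidth of the one below it, works out to weights of 4:2:1 across the three levels that the text actually names. A sketch of that arithmetic, assuming exactly those three levels and one requester per level; the enum and function names are hypothetical and not taken from the pxa driver.

```c
#include <assert.h>

/* Assumption: only the three levels named in the quoted text. */
enum pxa_prio_model { PRIO_LOW, PRIO_NORMAL, PRIO_HIGH, PRIO_COUNT };

/* Relative weight: each level gets twice the bandwidth of the one below. */
static unsigned int prio_weight(enum pxa_prio_model p)
{
	assert(p < PRIO_COUNT);
	return 1u << p;		/* low = 1, normal = 2, high = 4 */
}

/* Share of total bandwidth in percent, rounded down. */
static unsigned int prio_share_percent(enum pxa_prio_model p)
{
	unsigned int total = 0;

	for (int i = 0; i < PRIO_COUNT; i++)
		total += prio_weight((enum pxa_prio_model)i);
	return 100u * prio_weight(p) / total;
}
```

Under these assumptions the high level ends up with a little over half of the bandwidth and the low level with about a seventh.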

> +
> +  f) Transfer reusability
> + An issued and finished transfer should be "reusable". The choice of
> + "DMA_CTRL_ACK" should be left to the client, not the dma driver.
Again, how is this pxa specific? If it is not documented, we should move
this to the dmaengine documentation.

-- 
~Vinod



perf: another perf_fuzzer generated lockup

2015-05-07 Thread Vince Weaver

This is a new one I think, I hit it on the haswell machine running 
4.1-rc2.

The backtrace is complex enough I'm not really sure what's going on here.

The fuzzer has been having weird issues where it's been getting 
overflow signals from invalid fds.  This seems to happen
when an overflow signal interrupts the fuzzer mid-fork?
And the fuzzer code doesn't handle this well and attempts to call exit()
and/or kill the child from the signal handler that interrupted the 
fork() and that doesn't always go well.  I'm not sure if this is related, 
just that some of those actions seem to appear in the backtrace.


[33864.529861] ------------[ cut here ]------------
[33864.534824] WARNING: CPU: 1 PID: 9852 at kernel/watchdog.c:302 
watchdog_overflow_callback+0x92/0xc0()
[33864.544682] Watchdog detected hard LOCKUP on cpu 1
[33864.549635] Modules linked in:
[33864.552943]  fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi 
coretemp snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic kvm 
snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core crct10dif_pclmul 
snd_hwdep crc32_pclmul ghash_clmulni_intel snd_pcm aesni_intel aes_x86_64 lrw 
gf128mul evdev i915 iTCO_wdt iTCO_vendor_support glue_helper snd_timer ppdev 
ablk_helper psmouse drm_kms_helper cryptd snd drm pcspkr serio_raw lpc_ich 
soundcore parport_pc xhci_pci battery video processor i2c_i801 mei_me mei wmi 
i2c_algo_bit tpm_tis mfd_core xhci_hcd tpm parport button sg sr_mod sd_mod 
cdrom ehci_pci ahci ehci_hcd libahci libata e1000e ptp usbcore crc32c_intel 
scsi_mod fan usb_common pps_core thermal thermal_sys
[33864.622413] CPU: 1 PID: 9852 Comm: perf_fuzzer Not tainted 4.1.0-rc2+ #142
[33864.629776] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 
01/26/2014
[33864.637685]  81a209b5 88011ea45aa0 816d51d3 

[33864.645709]  88011ea45af0 88011ea45ae0 81072dfa 
88011ea45ac0
[33864.653731]  880119b8f800  88011ea45c40 
88011ea45ef8
[33864.661783] Call Trace:
[33864.664409][] dump_stack+0x45/0x57
[33864.670618]  [] warn_slowpath_common+0x8a/0xc0
[33864.677071]  [] warn_slowpath_fmt+0x46/0x50
[33864.683202]  [] ? native_apic_wait_icr_idle+0x24/0x30
[33864.690280]  [] watchdog_overflow_callback+0x92/0xc0
[33864.697294]  [] __perf_event_overflow+0x91/0x270
[33864.703916]  [] ? __perf_event_overflow+0xd9/0x270
[33864.710696]  [] ? x86_perf_event_set_period+0xda/0x180
[33864.717842]  [] perf_event_overflow+0x19/0x20
[33864.724195]  [] intel_pmu_handle_irq+0x1e2/0x450
[33864.730840]  [] perf_event_nmi_handler+0x2b/0x50
[33864.737436]  [] nmi_handle+0xa0/0x150
[33864.743025]  [] ? nmi_handle+0x5/0x150
[33864.748733]  [] default_do_nmi+0x4a/0x140
[33864.754705]  [] do_nmi+0x98/0xe0
[33864.759858]  [] end_repeat_nmi+0x1e/0x2e
[33864.765746]  [] ? check_chain_key+0xdb/0x1e0
[33864.772004]  [] ? check_chain_key+0xdb/0x1e0
[33864.778253]  [] ? check_chain_key+0xdb/0x1e0
[33864.784498]  <>[] 
__lock_acquire.isra.31+0x3b9/0x1000
[33864.792950]  [] ? __lock_acquire.isra.31+0x3b9/0x1000
[33864.800045]  [] lock_acquire+0xa5/0x130
[33864.805817]  [] ? __lock_task_sighand+0x6e/0x110
[33864.812468]  [] ? __lock_task_sighand+0x1a/0x110
[33864.819084]  [] _raw_spin_lock+0x31/0x40
[33864.824979]  [] ? __lock_task_sighand+0x6e/0x110
[33864.831623]  [] __lock_task_sighand+0x6e/0x110
[33864.838096]  [] ? __lock_task_sighand+0x1a/0x110
[33864.845314]  [] do_send_sig_info+0x2c/0x80
[33864.851949]  [] ? perf_swevent_event+0x67/0x90
[33864.858980]  [] send_sigio_to_task+0x12f/0x1a0
[33864.866005]  [] ? send_sigio_to_task+0x5/0x1a0
[33864.873047]  [] ? send_sigio+0x56/0x100
[33864.879411]  [] send_sigio+0xae/0x100
[33864.885564]  [] kill_fasync+0x97/0xf0
[33864.891713]  [] ? kill_fasync+0xf/0xf0
[33864.897983]  [] perf_event_wakeup+0xd4/0xf0
[33864.904662]  [] ? perf_event_wakeup+0x5/0xf0
[33864.911490]  [] ? perf_pending_event+0xe0/0x110
[33864.918580]  [] perf_pending_event+0xe0/0x110
[33864.925494]  [] irq_work_run_list+0x4c/0x80
[33864.932197]  [] irq_work_run+0x18/0x40
[33864.938469]  [] smp_trace_irq_work_interrupt+0x3f/0xc0
[33864.946263]  [] trace_irq_work_interrupt+0x6e/0x80
[33864.953646][] ? copy_page_range+0x527/0x9a0
[33864.961287]  [] ? copy_page_range+0x502/0x9a0
[33864.968265]  [] copy_process.part.23+0xc92/0x1b80
[33864.975589]  [] ? SYSC_kill+0x8e/0x230
[33864.981879]  [] do_fork+0xd8/0x420
[33864.987807]  [] ? f_setown+0x83/0xa0
[33864.993953]  [] ? SyS_fcntl+0x310/0x650
[33865.000348]  [] ? lockdep_sys_exit_thunk+0x12/0x14
[33865.007781]  [] SyS_clone+0x16/0x20
[33865.013830]  [] system_call_fastpath+0x16/0x7a
[33865.020843] ---[ end trace d3bd7d73656f3cba ]---
[33865.026418] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 
496.487 msecs
[33865.035874] perf interrupt took too long (3879951 > 5000), lowering 
kernel.perf_event_max_sample_rate to 25000
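
On the userspace side of the problem described above (calling exit() or
killing the child from a handler that interrupted fork()), a common pattern
is to do nothing in the handler except set a flag and to act on it later from
normal context. This is only a sketch of that pattern, not perf_fuzzer's
code; the function names are hypothetical and SIGINT stands in for the
overflow signal.

```c
#include <assert.h>
#include <signal.h>

/* Set from the handler, polled from normal context. */
static volatile sig_atomic_t overflow_pending;

/* Hypothetical handler: only records that the signal arrived. */
static void overflow_handler(int signo)
{
	(void)signo;
	overflow_pending = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_overflow_handler(int signo)
{
	return signal(signo, overflow_handler) == SIG_ERR ? -1 : 0;
}

/* Called from the main loop: consume the flag and report whether it was set. */
static int take_pending_overflow(void)
{
	int was_pending = overflow_pending;

	overflow_pending = 0;
	return was_pending;
}
```

The main loop would call take_pending_overflow() once fork() has returned and
decide there whether to exit or kill the child.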


Re: [PATCH v3 1/2] perf/kvm: Port perf kvm to powerpc

2015-05-07 Thread Ingo Molnar

* Hemant Kumar  wrote:

>  # perf kvm stat report -p 60515
> Analyze events for pid(s) 60515, all VCPUs:
> 
>         VM-EXIT    Samples  Samples%     Time%    Min Time       Max Time         Avg time
>
>  H_DATA_STORAGE       5006    35.30%     0.13%      1.94us        49.46us      12.37us ( +-   0.52% )
>  HV_DECREMENTER       4457    31.43%     0.02%      0.72us        16.14us       1.91us ( +-   0.96% )
>         SYSCALL       2690    18.97%     0.10%      2.84us       528.24us      18.29us ( +-   3.75% )
>  RETURN_TO_HOST       1789    12.61%    99.76%      1.58us    672791.91us   27470.23us ( +-   3.00% )
>        EXTERNAL        240     1.69%     0.00%        0.69us       10.67us       1.33us ( +-   5.34% )

Why is the last line misaligned? Copy & paste error or does perf kvm 
produce it in such a way?

Thanks,

Ingo


[PATCH v2 2/2] watchdog: dw_wdt: keepalive the watchdog at write time

2015-05-07 Thread Doug Anderson
If you've got code that does this in a tight loop
  1. Open watchdog
  2. Send 'expect close'
  3. Close watchdog
...you'll eventually trigger a watchdog reset.  You can reproduce this
by using daisydog (1) and running:
  while true; do daisydog -c > /dev/null; done

The problem is that each time you write to the watchdog for 'expect
close' it moves the timer .5 seconds out.  The timer thus never fires
and never pats the watchdog for you.

1: http://git.chromium.org/gitweb/?p=chromiumos/third_party/daisydog.git

Signed-off-by: Doug Anderson 
Reviewed-by: Guenter Roeck 
Tested-by: Jisheng Zhang 
---
Changes in v2: None

 drivers/watchdog/dw_wdt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/watchdog/dw_wdt.c b/drivers/watchdog/dw_wdt.c
index a284abd..6ea0634 100644
--- a/drivers/watchdog/dw_wdt.c
+++ b/drivers/watchdog/dw_wdt.c
@@ -215,6 +215,7 @@ static ssize_t dw_wdt_write(struct file *filp, const char 
__user *buf,
}
 
dw_wdt_set_next_heartbeat();
+   dw_wdt_keepalive();
mod_timer(&dw_wdt.timer, jiffies + WDT_TIMEOUT);
 
return len;
-- 
2.2.0.rc0.207.ga3a616c
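
The failure mode in the commit message can be modelled in userspace. This is
a simplified sketch, not the driver: time is counted in milliseconds, the
hardware timeout is an assumed parameter, and sim_watchdog_resets is a
hypothetical helper. It shows that writes arriving more often than every
500 ms keep pushing the timer out, so only the write path can pat the
hardware.

```c
#include <assert.h>
#include <stdbool.h>

#define TIMER_PERIOD_MS 500   /* models WDT_TIMEOUT (HZ / 2) */

/*
 * Simulate a userspace loop that writes to the watchdog every
 * write_interval_ms. hw_timeout_ms is an assumed hardware timeout.
 * Returns true if the simulated hardware watchdog would have reset.
 */
static bool sim_watchdog_resets(int write_interval_ms, int hw_timeout_ms,
				int duration_ms, bool keepalive_on_write)
{
	int timer_expires = TIMER_PERIOD_MS;
	int last_pat = 0;

	assert(write_interval_ms > 0 && hw_timeout_ms > 0);

	for (int now = 1; now <= duration_ms; now++) {
		if (now % write_interval_ms == 0) {
			/* write path: mod_timer() pushes the timer out */
			timer_expires = now + TIMER_PERIOD_MS;
			if (keepalive_on_write)
				last_pat = now;
		}
		if (now >= timer_expires) {
			/* timer path: pat the hardware and rearm */
			last_pat = now;
			timer_expires = now + TIMER_PERIOD_MS;
		}
		if (now - last_pat >= hw_timeout_ms)
			return true;
	}
	return false;
}
```

With a write every 100 ms and an assumed 2 s hardware timeout, the model
resets without the keepalive in the write path and survives with it.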



[PATCH v2 1/2] watchdog: dw_wdt: No need for a spinlock

2015-05-07 Thread Doug Anderson
Right now the dw_wdt uses a spinlock to protect dw_wdt_open().  The
problem is that while holding the spinlock we call:
-> dw_wdt_set_top()
   -> dw_wdt_top_in_seconds()
  -> clk_get_rate()
 -> clk_prepare_lock()
-> mutex_lock()

Locking a mutex while holding a spinlock is not allowed and leads to
warnings like "BUG: spinlock wrong CPU on CPU#1", among other
problems.

There's no reason to use a spinlock.  Only dw_wdt_open() was protected
and the test_and_set_bit() at the start of that function protects us
anyway.

Signed-off-by: Doug Anderson 
---
Changes in v2:
- Don't switch to mutex; just don't use spinlock at all as per Dmitry

 drivers/watchdog/dw_wdt.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/watchdog/dw_wdt.c b/drivers/watchdog/dw_wdt.c
index d0bb949..a284abd 100644
--- a/drivers/watchdog/dw_wdt.c
+++ b/drivers/watchdog/dw_wdt.c
@@ -35,7 +35,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -61,7 +60,6 @@ MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once 
started "
 #define WDT_TIMEOUT(HZ / 2)
 
 static struct {
-   spinlock_t  lock;
void __iomem*regs;
struct clk  *clk;
unsigned long   in_use;
@@ -177,7 +175,6 @@ static int dw_wdt_open(struct inode *inode, struct file 
*filp)
/* Make sure we don't get unloaded. */
__module_get(THIS_MODULE);
 
-   spin_lock(&dw_wdt.lock);
if (!dw_wdt_is_enabled()) {
/*
 * The watchdog is not currently enabled. Set the timeout to
@@ -190,8 +187,6 @@ static int dw_wdt_open(struct inode *inode, struct file 
*filp)
 
dw_wdt_set_next_heartbeat();
 
-   spin_unlock(&dw_wdt.lock);
-
return nonseekable_open(inode, filp);
 }
 
@@ -348,8 +343,6 @@ static int dw_wdt_drv_probe(struct platform_device *pdev)
if (ret)
return ret;
 
-   spin_lock_init(&dw_wdt.lock);
-
ret = misc_register(&dw_wdt_miscdev);
if (ret)
goto out_disable_clk;
-- 
2.2.0.rc0.207.ga3a616c
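
The argument that the test_and_set_bit() at the start of open already gives
the needed exclusion can be illustrated with a C11 userspace analogue. This
is a sketch under assumptions: atomic_flag stands in for the in_use bit, and
fake_wdt_open/fake_wdt_release are hypothetical names, not the driver's
functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical single-open guard, modelling the in_use bit in dw_wdt. */
static atomic_flag wdt_in_use = ATOMIC_FLAG_INIT;

/* Returns 0 for the first opener, -EBUSY while the device is held. */
static int fake_wdt_open(void)
{
	if (atomic_flag_test_and_set(&wdt_in_use))
		return -EBUSY;
	/* Only one caller can get here at a time, so no extra lock is needed. */
	return 0;
}

/* Drop the guard so the next open can succeed. */
static void fake_wdt_release(void)
{
	atomic_flag_clear(&wdt_in_use);
}
```

Because the test-and-set is atomic, a second opener is turned away before it
reaches the code the spinlock used to cover.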



Re: [PATCH 2/2] staging: unisys: remove unused variable

2015-05-07 Thread Sudip Mukherjee
On Thu, May 07, 2015 at 10:04:32PM +0200, Greg Kroah-Hartman wrote:
> On Thu, May 07, 2015 at 03:06:52PM +0530, Sudip Mukherjee wrote:
> > the previous patch of the series made this variable unused.
> 
> What do you mean?  There was only one other patch in this series, never
> send a patch that causes a build warning.
This patch is 2/2 , the 1/2 patch when applied will cause a build
warning about unused variable.

should i then send a v2 mentioning that "1/2 will cause a build warning
which is fixed in 2/2" ?

regards
sudip
> 
> greg k-h


Re: perf: fuzzer triggers NULL pointer derefreence in x86_schedule_events

2015-05-07 Thread Vince Weaver
On Thu, 7 May 2015, Peter Zijlstra wrote:

> Indeed so; and we can make an analogous argument for hwc. However:
> 
> > I think it is more likely related to the bitmask (idxmsk).  But then
> > it is always allocated with the constraint even with the HT bug
> > workaround.  So most, likely the index is bogus and you touch outside
> > the idxmsk[] array.
> 
> [428232.701319] BUG: unable to handle kernel NULL pointer dereference at  
>  (null)
> 
> But the thing really tried to touch NULL, not some random address that
> faulted.
> 
> As always, Vince has found us a good puzzle ;-)

and sorry I haven't been much help tracking it down.  I'm trying to 
trigger it again, but this particular bug only pops up after a week or so 
of fuzzing.  

Vince


Re: [PATCH 2/3] gpio: Add GPIO support for Broadcom STB SoCs

2015-05-07 Thread Gregory Fong
On Thu, May 07, 2015 at 10:18:49AM +0200, Paul Bolle wrote:
> Just a nit: a license mismatch.
> 
> On Wed, 2015-05-06 at 01:37 -0700, Gregory Fong wrote:
> > --- /dev/null
> > +++ b/drivers/gpio/gpio-brcmstb.c
> 
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License as
> > + * published by the Free Software Foundation version 2.
> > + *
> > + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
> > + * kind, whether express or implied; without even the implied warranty
> > + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> 
> This states the license is GPL v2.
> 
> > +MODULE_LICENSE("GPL");
> 
> And, according to include/linux/module.h, this states the license is GPL
> v2 or later. So I think either the comment at the top of this file or
> the ident used in the MODULE_LICENSE() macro needs to change.

Will fix that, thanks.

Gregory


Re: perf: WARNING perfevents: irq loop stuck!

2015-05-07 Thread Vince Weaver
On Fri, 1 May 2015, Ingo Molnar wrote:

> So fffe corresponds to 2 events left until overflow, 
> right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we 
> allow these super short periods.
> 
> Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, 
> one similar to bdw_limit_period()? Something like the patch below?
> 
> Totally untested and such. I picked 128 because of Broadwell, but 
> lower values might work as well. You could try to increase it to 3 and 
> upwards and see which one stops triggering stuck NMI loops?

I spent a lot of time trying to come up with a test case that triggered 
this more reliably but failed.

It definitely is an issue with PMC0 being -2 causing the PMC0 bit in the 
status register getting stuck and not clearing.  Often there is also a PEBS 
event active at the same time but that might be coincidence.

With your patch applied I can't trigger the issue. I haven't tried 
narrowing down the exact value yet.

Vince
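
For readers following the thread, the arithmetic behind "PMC0 being -2" and
the proposed minimum period can be sketched in userspace. This is not Ingo's
patch: clamp_sample_period and counter_start_value are hypothetical helpers,
and the 48-bit counter width used in the example is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_PERIOD 128  /* value picked in the discussion; lower may also work */

/* Hypothetical quirk: never program fewer than MIN_PERIOD events. */
static uint64_t clamp_sample_period(uint64_t left)
{
	return left < MIN_PERIOD ? MIN_PERIOD : left;
}

/*
 * Initial counter value for 'left' events until overflow: the counter
 * counts upwards, so it is loaded with -left truncated to its width.
 */
static uint64_t counter_start_value(uint64_t left, unsigned int counter_bits)
{
	uint64_t mask;

	assert(counter_bits > 0 && counter_bits <= 64);
	mask = counter_bits == 64 ? UINT64_MAX : (UINT64_C(1) << counter_bits) - 1;
	return (0 - left) & mask;
}
```

A period of 2 on a 48-bit counter gives a start value ending in ...fffe;
after clamping to 128 the start value ends in ...ff80 instead.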


RE: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

2015-05-07 Thread Oza (Pawandeep) Oza
So Mike, is this reason strong enough for you?

I understand your point: solve the BUG, and I do tend to agree with you.

But by design and implementation, the BUG() is just the beginning of the end 
for a dying kernel.
And what happens between 'the beginning' and 'the end' is no less 
important 
(because, say, on our platform we want to get a clean RAMDUMP to analyze what 
happened, and for that we want a clean reboot).

Also,
if somebody's design is to legally crash the kernel (e.g. where the kernel is 
actually not faulty),
then I do expect the tick/timekeeping framework to do its job as long as it 
can, and it should, because the kernel is not faulty.
But in this case it doesn't hand over the jiffies-incrementing job sanely.

In other words, 
"no one can rely on jiffies, or rather the code which is based on jiffies 
will never make forward progress in this path"

Regards,
-Oza
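
To make the hang concrete, here is a userspace sketch of a wait loop keyed to
a tick counter, with a spin budget so that a stalled counter cannot block the
caller forever. All names are hypothetical; fake_jiffies only simulates the
real counter, and the fallback is an illustration rather than the mdelay
change mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated tick counter; in the scenario above nothing increments it. */
static unsigned long fake_jiffies;

/* Wrap-safe "a is after b", the same idea as the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Wait until the tick counter passes the deadline, but give up after
 * max_spins iterations so a stalled counter cannot hang the caller.
 * Returns 0 if the deadline passed, -1 if the spin budget ran out.
 */
static int wait_ticks_bounded(unsigned long ticks, unsigned long max_spins,
			      bool ticks_advance)
{
	unsigned long deadline = fake_jiffies + ticks;

	for (unsigned long spin = 0; spin < max_spins; spin++) {
		if (tick_after(fake_jiffies, deadline))
			return 0;
		if (ticks_advance)
			fake_jiffies++;   /* stands in for the timer interrupt */
	}
	return -1;
}
```

With the counter advancing the wait completes; with it frozen, the same loop
only ends because of the spin budget.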


-Original Message-
From: Oza (Pawandeep) Oza 
Sent: Thursday, May 07, 2015 2:17 PM
To: 'Mike Galbraith'
Cc: pawandeep oza; linux-kernel@vger.kernel.org; malayasen rout
Subject: RE: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

Oh ok.
So the reason why I cared was:

There is code in our base which relies on jiffies, but since jiffies are not 
incrementing, the code waits there and loops forever.
And forward progress is halted (on cpu0, since that is the only cpu which is 
alive).

We have changed the code to use mdelay and things move on.

But that means that in the path which I mentioned, 
any code which relies on jiffies will be stuck forever and will not allow the 
rest of the code to get executed, hence no forward progress,
especially if that code is running with preempt_disable();

Regards,
-Oza


-Original Message-
From: Mike Galbraith [mailto:umgwanakikb...@gmail.com] 
Sent: Thursday, May 07, 2015 2:00 PM
To: Oza (Pawandeep) Oza
Cc: pawandeep oza; linux-kernel@vger.kernel.org; malayasen rout
Subject: Re: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

On Thu, 2015-05-07 at 07:05 +, Oza (Pawandeep) Oza wrote:
> : )
> 
> Well, I am not sure, if problem was communicated clearly from my side.

I understood.  I just don't understand why you'd care deeply whether
CPU0 halts or eternally waits.  Both render it harmless and useless.

-Mike


Re: [PATCH v3 07/11] md/raid5: split bio for chunk_aligned_read

2015-05-07 Thread NeilBrown
On Wed,  6 May 2015 23:34:17 -0700 Ming Lin  wrote:

> If a read request fits entirely in a chunk, it will be passed directly to the
> underlying device (providing it hasn't failed of course).  If it doesn't fit,
> the slightly less efficient path that uses the stripe_cache is used.
> Requests that get to the stripe cache are always completely split up as
> necessary.
> 
> So with RAID5, ripping out the merge_bvec_fn doesn't cause it to stop work,
> but could cause it to take the less efficient path more often.
> 
> All that is needed to manage this is for 'chunk_aligned_read' do some bio
> splitting, much like the RAID0 code does.
> 
> Cc: Neil Brown 
> Cc: linux-r...@vger.kernel.org
> Signed-off-by: Ming Lin 
> ---
>  drivers/md/raid5.c | 42 +-
>  1 file changed, 37 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 7f4a717..b18f548 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -4738,7 +4738,7 @@ static void raid5_align_endio(struct bio *bi, int error)
>   add_bio_to_retry(raid_bi, conf);
>  }
>  
> -static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
> +static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio)
>  {
>   struct r5conf *conf = mddev->private;
>   int dd_idx;
> @@ -4747,7 +4747,7 @@ static int chunk_aligned_read(struct mddev *mddev, 
> struct bio * raid_bio)
>   sector_t end_sector;
>  
>   if (!in_chunk_boundary(mddev, raid_bio)) {
> - pr_debug("chunk_aligned_read : non aligned\n");
> + pr_debug("%s: non aligned\n", __func__);
>   return 0;
>   }
>   /*
> @@ -4822,6 +4822,36 @@ static int chunk_aligned_read(struct mddev *mddev, 
> struct bio * raid_bio)
>   }
>  }
>  
> +static struct bio *chunk_aligned_read(struct mddev *mddev, struct bio 
> *raid_bio)
> +{
> + struct bio *split;
> +
> + do {
> + sector_t sector = raid_bio->bi_iter.bi_sector;
> + unsigned chunk_sects = mddev->chunk_sectors;
> + unsigned sectors;
> +
> + if (likely(is_power_of_2(chunk_sects)))
> + sectors = chunk_sects - (sector & (chunk_sects-1));
> + else
> + sectors = chunk_sects - sector_div(sector, chunk_sects);

RAID5 doesn't currently allow non-power-of-2 chunks.  So this test is
pointless, but not really harmful.  Maybe someday we will.

I'm equally happy for it to stay or go.

Acked-by: NeilBrown 

Thanks,
NeilBrown


> +
> + if (sectors < bio_sectors(raid_bio)) {
> + split = bio_split(raid_bio, sectors, GFP_NOIO, 
> fs_bio_set);
> + bio_chain(split, raid_bio);
> + } else
> + split = raid_bio;
> +
> + if (!raid5_read_one_chunk(mddev, split)) {
> + if (split != raid_bio)
> + generic_make_request(raid_bio);
> + return split;
> + }
> + } while (split != raid_bio);
> +
> + return NULL;
> +}
> +
>  /* __get_priority_stripe - get the next stripe to process
>   *
>   * Full stripe writes are allowed to pass preread active stripes up until
> @@ -5099,9 +5129,11 @@ static void make_request(struct mddev *mddev, struct 
> bio * bi)
>* data on failed drives.
>*/
>   if (rw == READ && mddev->degraded == 0 &&
> -  mddev->reshape_position == MaxSector &&
> -  chunk_aligned_read(mddev,bi))
> - return;
> + mddev->reshape_position == MaxSector) {
> + bi = chunk_aligned_read(mddev, bi);
> + if (!bi)
> + return;
> + }
>  
>   if (unlikely(bi->bi_rw & REQ_DISCARD)) {
>   make_discard_request(mddev, bi);
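
The split-size computation in the quoted hunk can be checked with a small
userspace sketch. It mirrors the two branches (mask for power-of-2 chunks,
remainder otherwise); plain '%' stands in for sector_div(), and is_pow2 and
sectors_to_chunk_end are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if n is a non-zero power of two. */
static bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Number of sectors from 'sector' up to the end of its chunk.
 * Mirrors the two branches in the quoted hunk; plain '%' stands in
 * for sector_div() here.
 */
static uint32_t sectors_to_chunk_end(uint64_t sector, uint32_t chunk_sects)
{
	assert(chunk_sects != 0);
	if (is_pow2(chunk_sects))
		return chunk_sects - (uint32_t)(sector & (chunk_sects - 1));
	return chunk_sects - (uint32_t)(sector % chunk_sects);
}
```

For example, with 128-sector chunks a read starting at sector 1000 has 24
sectors left in its chunk, so a longer bio would be split after 24 sectors.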





Re: [V3 PATCH 3/5] device property: Introduces device_dma_is_coherent()

2015-05-07 Thread santosh.shilim...@oracle.com

On 5/7/15 5:37 PM, Suravee Suthikulpanit wrote:

Currently, device drivers, which support both OF and ACPI,
need to call two separate APIs, of_dma_is_coherent() and
acpi_dma_is_coherent(), to determine the device coherency attribute.

This patch simplifies this process by introducing a new device
property API, device_dma_is_coherent(), which calls the appropriate
interface based on the booting architecture.

Signed-off-by: Suravee Suthikulpanit 
---
  drivers/base/property.c  | 12 
  include/linux/property.h |  2 ++
  2 files changed, 14 insertions(+)

diff --git a/drivers/base/property.c b/drivers/base/property.c
index 1d0b116..8123c6e 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -14,6 +14,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 

  /**
@@ -519,3 +520,14 @@ unsigned int device_get_child_node_count(struct device 
*dev)
return count;
  }
  EXPORT_SYMBOL_GPL(device_get_child_node_count);
+
+bool device_dma_is_coherent(struct device *dev)
+{
+   if (IS_ENABLED(CONFIG_OF) && dev->of_node)


Do you really need that IS_ENABLED(CONFIG_OF) ?
In other words, dev->of_node should be null for !CONFIG_OF

Regards,
Santosh
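
The dispatch being discussed can be modelled in userspace as below. This is
only a sketch with heavily reduced, hypothetical types (fake_device and
friends), not the kernel structures or the patch itself. It shows one entry
point choosing the firmware-specific answer from which pointer is set, which
is why a NULL of_node already selects the non-OF path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced models of the kernel structures. */
struct fake_of_node { bool dma_coherent; };
struct fake_acpi_companion { bool dma_coherent; };

struct fake_device {
	struct fake_of_node *of_node;       /* NULL when not described by DT */
	struct fake_acpi_companion *acpi;   /* NULL when not described by ACPI */
};

/* One entry point that picks the firmware-specific check. */
static bool fake_device_dma_is_coherent(const struct fake_device *dev)
{
	assert(dev != NULL);
	if (dev->of_node)
		return dev->of_node->dma_coherent;
	if (dev->acpi)
		return dev->acpi->dma_coherent;
	return false;  /* assumption: no firmware description means non-coherent */
}
```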


[PATCH v7 1/4] ARM: UniPhier: add basic support for UniPhier architecture

2015-05-07 Thread Masahiro Yamada
Initial commit for a new SoC family, UniPhier, developed by
Socionext Inc. (formerly, System LSI Business Division of
Panasonic Corporation).

This commit includes a minimal set of components for booting the
kernel, including SMP support.

Signed-off-by: Masahiro Yamada 
---

Changes in v7: None
Changes in v6: None
Changes in v5:
  - Move syscon_regmap_lookup_by_compatible() call to smp_prepare_cpus handler

Changes in v4:
  - Access to SMP register with base + offset(0x1208).

Changes in v3:
  - Replace  with 
  - Rename uniphier_board_dt_compat to uniphier_dt_compat
  - Move uniphier_secondary_startup into platsmp.c as __naked function
  - Add return before "err:" label (bug fix)
  - Use syscon driver rather than hard-coded register address

Changes in v2:
  - Fix SoC compatible string
"socionext,ph1-proxstream2" -> "socionext,proxstream2"

 arch/arm/Kconfig  |  2 +
 arch/arm/Makefile |  1 +
 arch/arm/mach-uniphier/Kconfig| 11 +
 arch/arm/mach-uniphier/Makefile   |  2 +
 arch/arm/mach-uniphier/platsmp.c  | 90 +++
 arch/arm/mach-uniphier/uniphier.c | 30 +
 6 files changed, 136 insertions(+)
 create mode 100644 arch/arm/mach-uniphier/Kconfig
 create mode 100644 arch/arm/mach-uniphier/Makefile
 create mode 100644 arch/arm/mach-uniphier/platsmp.c
 create mode 100644 arch/arm/mach-uniphier/uniphier.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 45df48b..b2e0d988 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -937,6 +937,8 @@ source "arch/arm/mach-tegra/Kconfig"
 
 source "arch/arm/mach-u300/Kconfig"
 
+source "arch/arm/mach-uniphier/Kconfig"
+
 source "arch/arm/mach-ux500/Kconfig"
 
 source "arch/arm/mach-versatile/Kconfig"
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 985227c..fe8f9ef 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -200,6 +200,7 @@ machine-$(CONFIG_ARCH_SUNXI)+= sunxi
 machine-$(CONFIG_ARCH_TEGRA)   += tegra
 machine-$(CONFIG_ARCH_U300)+= u300
 machine-$(CONFIG_ARCH_U8500)   += ux500
+machine-$(CONFIG_ARCH_UNIPHIER)+= uniphier
 machine-$(CONFIG_ARCH_VERSATILE)   += versatile
 machine-$(CONFIG_ARCH_VEXPRESS)+= vexpress
 machine-$(CONFIG_ARCH_VT8500)  += vt8500
diff --git a/arch/arm/mach-uniphier/Kconfig b/arch/arm/mach-uniphier/Kconfig
new file mode 100644
index 000..a017b1d
--- /dev/null
+++ b/arch/arm/mach-uniphier/Kconfig
@@ -0,0 +1,11 @@
+config ARCH_UNIPHIER
+   bool "Socionext UniPhier SoCs"
+   depends on ARCH_MULTI_V7
+   select ARM_AMBA
+   select ARM_GLOBAL_TIMER
+   select ARM_GIC
+   select HAVE_ARM_SCU
+   select HAVE_ARM_TWD
+   help
+ Support for UniPhier SoC family developed by Socionext Inc.
+ (formerly, System LSI Business Division of Panasonic Corporation)
diff --git a/arch/arm/mach-uniphier/Makefile b/arch/arm/mach-uniphier/Makefile
new file mode 100644
index 000..60bd226
--- /dev/null
+++ b/arch/arm/mach-uniphier/Makefile
@@ -0,0 +1,2 @@
+obj-y  := uniphier.o
+obj-$(CONFIG_SMP)  += platsmp.o
diff --git a/arch/arm/mach-uniphier/platsmp.c b/arch/arm/mach-uniphier/platsmp.c
new file mode 100644
index 000..5943e1c
--- /dev/null
+++ b/arch/arm/mach-uniphier/platsmp.c
@@ -0,0 +1,90 @@
+/*
+ * Copyright (C) 2015 Masahiro Yamada 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static struct regmap *sbcm_regmap;
+
+static void __init uniphier_smp_prepare_cpus(unsigned int max_cpus)
+{
+   static cpumask_t only_cpu_0 = { CPU_BITS_CPU0 };
+   unsigned long scu_base_phys = 0;
+   void __iomem *scu_base;
+
+   sbcm_regmap = syscon_regmap_lookup_by_compatible(
+   "socionext,uniphier-system-bus-controller-misc");
+   if (IS_ERR(sbcm_regmap)) {
+   pr_err("failed to regmap system-bus-controller-misc\n");
+   goto err;
+   }
+
+   if (scu_a9_has_base())
+   scu_base_phys = scu_a9_get_base();
+
+   if (!scu_base_phys) {
+   pr_err("failed to get scu base\n");
+   goto err;
+   }
+
+   scu_base = ioremap(scu_base_phys, SZ_128);
+   if (!scu_base) {
+   pr_err("failed to remap scu base (0x%08lx)\n", scu_base_phys);
+   goto err;
+   }
+
+   scu_enable(scu_base);
+   iounmap(scu_base);

[PATCH v7 3/4] ARM: dts: UniPhier: add support for UniPhier SoCs and boards

2015-05-07 Thread Masahiro Yamada
Initial device trees for UniPhier SoCs: PH1-sLD3, PH1-LD4, PH1-Pro4,
and PH1-sLD8.

Signed-off-by: Masahiro Yamada 
---

Changes in v7:
  - Remove redundant "fifo-size" from the 16550A uart node

Changes in v6:
  - Remove redundant interrupt-parent property from timer nodes.

Changes in v5: None
Changes in v4:
  - Add system-bus-controller-misc node instead of uniphier-smp-reg node

Changes in v3:
  - License under GPL/X11
  - Drop "earlyprintk" kernel-parameter, add "stdout-path" property
  - Add syscon device for SMP boot support in order not to
hard-code register address in platsmp.c

Changes in v2: None

 arch/arm/boot/dts/Makefile   |   5 ++
 arch/arm/boot/dts/uniphier-ph1-ld4-ref.dts   |  79 ++
 arch/arm/boot/dts/uniphier-ph1-ld4.dtsi  | 110 +
 arch/arm/boot/dts/uniphier-ph1-pro4-ref.dts  |  79 ++
 arch/arm/boot/dts/uniphier-ph1-pro4.dtsi | 117 +++
 arch/arm/boot/dts/uniphier-ph1-sld3-ref.dts  |  80 ++
 arch/arm/boot/dts/uniphier-ph1-sld3.dtsi | 117 +++
 arch/arm/boot/dts/uniphier-ph1-sld8-ref.dts  |  79 ++
 arch/arm/boot/dts/uniphier-ph1-sld8.dtsi | 110 +
 arch/arm/boot/dts/uniphier-support-card.dtsi |  65 +++
 10 files changed, 841 insertions(+)
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-ld4-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-ld4.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-pro4-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-pro4.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld3-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld3.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld8-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld8.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-support-card.dtsi

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 86217db..558b787 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -600,6 +600,11 @@ dtb-$(CONFIG_ARCH_U8500) += \
ste-hrefv60plus-tvk.dtb \
ste-ccu8540.dtb \
ste-ccu9540.dtb
+dtb-$(CONFIG_ARCH_UNIPHIER) += \
+   uniphier-ph1-sld3-ref.dtb \
+   uniphier-ph1-ld4-ref.dtb \
+   uniphier-ph1-pro4-ref.dtb \
+   uniphier-ph1-sld8-ref.dtb
 dtb-$(CONFIG_ARCH_VERSATILE) += \
versatile-ab.dtb \
versatile-pb.dtb
diff --git a/arch/arm/boot/dts/uniphier-ph1-ld4-ref.dts 
b/arch/arm/boot/dts/uniphier-ph1-ld4-ref.dts
new file mode 100644
index 000..200b0c9
--- /dev/null
+++ b/arch/arm/boot/dts/uniphier-ph1-ld4-ref.dts
@@ -0,0 +1,79 @@
+/*
+ * Device Tree Source for UniPhier PH1-LD4 Reference Board
+ *
+ * Copyright (C) 2015 Masahiro Yamada 
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This file is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use,
+ * copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/dts-v1/;
+/include/ "uniphier-ph1-ld4.dtsi"
+/include/ "uniphier-support-card.dtsi"
+
+/ {
+   model = "UniPhier PH1-LD4 Reference Board";
+   compatible = "socionext,ph1-ld4-

[PATCH v7 4/4] MAINTAINERS: add myself as ARM/UniPhier maintainer

2015-05-07 Thread Masahiro Yamada
Signed-off-by: Masahiro Yamada 
---

Changes in v7: None
Changes in v6: None
Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1939513..3c31a27 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1525,6 +1525,13 @@ F:   drivers/rtc/rtc-ab3100.c
 F: drivers/rtc/rtc-coh901331.c
 T: git 
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git
 
+ARM/UNIPHIER ARCHITECTURE
+M: Masahiro Yamada 
+L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
+S: Maintained
+F: arch/arm/mach-uniphier/
+N: uniphier
+
 ARM/Ux500 ARM ARCHITECTURE
 M: Linus Walleij 
 L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
-- 
1.9.1



Re: [PATCH v3 4/4] block: loop: support DIO & AIO

2015-05-07 Thread Ming Lei
Hi Dave,

Thanks for your comment!

On Fri, May 8, 2015 at 6:20 AM, Dave Chinner  wrote:
> On Thu, May 07, 2015 at 08:32:39PM +0800, Ming Lei wrote:
>> On Thu, May 7, 2015 at 3:24 PM, Christoph Hellwig  wrote:
>> >> @@ -441,6 +500,12 @@ static void do_loop_switch(struct loop_device *lo, 
>> >> struct switch_request *p)
>> >>   mapping->host->i_bdev->bd_block_size : PAGE_SIZE;
>> >>   lo->old_gfp_mask = mapping_gfp_mask(mapping);
>> >>   mapping_set_gfp_mask(mapping, lo->old_gfp_mask & 
>> >> ~(__GFP_IO|__GFP_FS));
>> >> +
>> >> + lo->support_dio = mapping->a_ops && mapping->a_ops->direct_IO;
>> >> + if (lo->support_dio)
>> >> + lo->use_aio = true;
>> >> + else
>> >> + lo->use_aio = false;
>> >
>> > We need an explicit userspace op-in for this.  For one direct I/O can't
>>
>> Actually this patch is one simplified version, and my old version
>> has exported two sysfs files(use_aio, use_dio) which can control
>> if direct IO or AIO is used but only AIO is enabled if DIO is set. Finally
>> I think it isn't necessary because dio/aio works well from the tests,
>> and userspace shouldn't care if it is AIO or not if the performance
>> is good.
>
> Performance won't always be good.
>
> It looks to me that this has an unbound queue depth for AIO.  What
> throttles the amount of IO userspace can throw at an aio-enabled
> loop device? If it's unbound, then userspace can throw gigabytes of
> random write at the loop device and rather than be throttled at 128
> outstanding IOs, the queue will just keep growing. That will have
> adverse effects on dirty memory throttling, memory reclaim
> algorithms, read and write latency, etc.
>
> I suspect that if we are going to make the loop device use AIO, it
> will need a proper queue depth limit (i.e.
> /sys/block/loop0/queue/nr_requests) enforced to avoid this sort of
> problem...

Loop has been converted to blk-mq, and the current queue depth is
128, so the problem you are worried about should not arise, should it?

>> > handle sub-sector size access and people use the loop device as a
>> > workaround for that.
>>
>> Yes, users can do that. Could you explain a bit what the problem is?
>
> I have a 4k sector backing device and a 512 byte sector filesystem
> image. I can't do 512 byte direct IO to the filesystem image, so I
> can't run tools that handle fs images in files using direct IO on
> that file. Create a loop device with the filesystem image, and now I
> can do 512 byte direct IO to the filesystem image, because all that
> direct IO to the filesystem image is now buffered by the loop
> device.
>
> If the loop device does direct io in this situation, the backing
> filesystem rejects direct IO from the loop device because it is not
> sector (4k) sized/aligned. User now swears, shouts and curses you
> from afar.

Yes, that is a problem, but it looks like it can be addressed by adding the
following to loop_set_fd():

	if (inode->i_sb->s_bdev)
		blk_queue_logical_block_size(lo->lo_queue,
				bdev_io_min(inode->i_sb->s_bdev));
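
To illustrate the alignment constraint under discussion, here is a minimal userspace sketch (not kernel code). It checks whether a direct-IO request is a whole number of logical blocks. The function name `dio_request_ok` is hypothetical; the 512-byte and 4096-byte sizes are the ones from Dave's example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when a direct-IO request starting at
 * byte `offset` with length `len` covers a whole number of logical blocks.
 * `block_size` must be a non-zero power of two.
 */
static bool dio_request_ok(uint64_t offset, uint64_t len, uint32_t block_size)
{
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    uint64_t mask = (uint64_t)block_size - 1;

    /* Both the start and the length have to be block-aligned. */
    return len != 0 && (offset & mask) == 0 && (len & mask) == 0;
}
```

With a 4096-byte backing device, a 512-byte request fails this check, which is the rejection Dave describes. Raising the loop device's logical block size is meant to keep such requests from being issued at all.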

> DIO and AIO behaviour needs to be configurable through losetup, and
> most definitely not the default behaviour.

Could you share whether there are other reasons for making it configurable
via losetup, supposing the above issues can be fixed?

Thanks,
Ming Lei

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com


[PATCH v7 2/4] ARM: multi_v7_defconfig: enable UniPhier SoC family

2015-05-07 Thread Masahiro Yamada
Add UniPhier, a new citizen in the ARM multi platform.

Signed-off-by: Masahiro Yamada 
---

Changes in v7: None
Changes in v6: None
Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index dca6983..a80ed4e 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -83,6 +83,7 @@ CONFIG_ARCH_TEGRA_3x_SOC=y
 CONFIG_ARCH_TEGRA_114_SOC=y
 CONFIG_ARCH_TEGRA_124_SOC=y
 CONFIG_TEGRA_EMC_SCALING_ENABLE=y
+CONFIG_ARCH_UNIPHIER=y
 CONFIG_ARCH_U8500=y
 CONFIG_MACH_HREFV60=y
 CONFIG_MACH_SNOWBALL=y
-- 
1.9.1



[PATCH v7 0/4] ARM: SoC: add a new platform, UniPhier (arch/arm/mach-uniphier)

2015-05-07 Thread Masahiro Yamada
This is an initial series for supporting Socionext UniPhier SoCs,
based on ARM Cortex-A9, mainly used for digital TVs, video recorders, etc.


Masahiro Yamada (4):
  ARM: UniPhier: add basic support for UniPhier architecture
  ARM: multi_v7_defconfig: enable UniPhier SoC family
  ARM: dts: UniPhier: add support for UniPhier SoCs and boards
  MAINTAINERS: add myself as ARM/UniPhier maintainer

 MAINTAINERS  |   7 ++
 arch/arm/Kconfig |   2 +
 arch/arm/Makefile|   1 +
 arch/arm/boot/dts/Makefile   |   5 ++
 arch/arm/boot/dts/uniphier-ph1-ld4-ref.dts   |  79 ++
 arch/arm/boot/dts/uniphier-ph1-ld4.dtsi  | 110 +
 arch/arm/boot/dts/uniphier-ph1-pro4-ref.dts  |  79 ++
 arch/arm/boot/dts/uniphier-ph1-pro4.dtsi | 117 +++
 arch/arm/boot/dts/uniphier-ph1-sld3-ref.dts  |  80 ++
 arch/arm/boot/dts/uniphier-ph1-sld3.dtsi | 117 +++
 arch/arm/boot/dts/uniphier-ph1-sld8-ref.dts  |  79 ++
 arch/arm/boot/dts/uniphier-ph1-sld8.dtsi | 110 +
 arch/arm/boot/dts/uniphier-support-card.dtsi |  65 +++
 arch/arm/configs/multi_v7_defconfig  |   1 +
 arch/arm/mach-uniphier/Kconfig   |  11 +++
 arch/arm/mach-uniphier/Makefile  |   2 +
 arch/arm/mach-uniphier/platsmp.c |  90 +
 arch/arm/mach-uniphier/uniphier.c|  30 +++
 18 files changed, 985 insertions(+)
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-ld4-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-ld4.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-pro4-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-pro4.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld3-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld3.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld8-ref.dts
 create mode 100644 arch/arm/boot/dts/uniphier-ph1-sld8.dtsi
 create mode 100644 arch/arm/boot/dts/uniphier-support-card.dtsi
 create mode 100644 arch/arm/mach-uniphier/Kconfig
 create mode 100644 arch/arm/mach-uniphier/Makefile
 create mode 100644 arch/arm/mach-uniphier/platsmp.c
 create mode 100644 arch/arm/mach-uniphier/uniphier.c

-- 
1.9.1



Re: [PATCH] clk samsung exynos5420 add CLK_RECALC_NEW_RATES flag to mout_apll and mout_kpll clock.

2015-05-07 Thread Krzysztof Kozlowski
On 08.05.2015 13:01, Anand Moon wrote:
> Hi Krzysztof,
> 
> Actually the patch is based on https://lkml.org/lkml/2015/4/3/389
> I don't know how to measure the energy consumption when compared to
> cpufreq-exynos.

It does not have to be energy consumption. It may be something else
visible, like wrong values for clock rates. I don't know, it is your
patch so you should know *what is fixed* (or changed).

Best regards,
Krzysztof



[PATCH] serial: of_serial: do not set port.type twice

2015-05-07 Thread Masahiro Yamada
The port.type has already been set by of_platform_serial_setup()
called from a few lines above.
Setting it to the same value is redundant.

Signed-off-by: Masahiro Yamada 
---

 drivers/tty/serial/of_serial.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/tty/serial/of_serial.c b/drivers/tty/serial/of_serial.c
index 137381e..28b9b47 100644
--- a/drivers/tty/serial/of_serial.c
+++ b/drivers/tty/serial/of_serial.c
@@ -188,7 +188,6 @@ static int of_platform_serial_probe(struct platform_device 
*ofdev)
{
struct uart_8250_port port8250;
memset(&port8250, 0, sizeof(port8250));
-   port.type = port_type;
port8250.port = port;
 
if (port.fifosize)
-- 
1.9.1



Re: [PATCH] clk samsung exynos5420 add CLK_RECALC_NEW_RATES flag to mout_apll and mout_kpll clock.

2015-05-07 Thread Anand Moon
Hi Krzysztof,

Actually the patch is based on https://lkml.org/lkml/2015/4/3/389
I don't know how to measure the energy consumption when compared to
cpufreq-exynos.

I will update the commit log and resend it with your review.

-Anand Moon




On 8 May 2015 at 05:44, Krzysztof Kozlowski  wrote:
> 2015-05-08 2:48 GMT+09:00 Anand Moon :
>> Addition of CLK_RECALC_NEW_RATES flag to support Exynos5 cpu clk so that
>> correct divider values are re-calculated after both pre/post
>> clock notifiers had run for for mout_apll clock and mout_kpll clock.
> s/for for/for/
>
> Could you describe in the commit message the observable effects
> *without* this patch? In other words: what is fixed? Will the divider
> have incorrect values?
>
>>
>> Depend on https://lkml.org/lkml/2015/4/3/388
>>
>> Tested on OdroidXU3 Board.
>
> Thanks for providing this information. However
> 1. Patch dependencies should not be part of the commit message. They simply
> won't provide any meaningful information once the patches are merged.
> 2. Similarly, the testing platform is not always put in the commit message.
>
> So just put them after separator (triple-dash).
>
> After fixing the commit message:
> Reviewed-by: Krzysztof Kozlowski 
>
> Best regards,
> Krzysztof


Re: [PATCH 1/2] watchdog: dw_wdt: Use a mutex, not a spinlock

2015-05-07 Thread Doug Anderson
Hi,

On Thu, May 7, 2015 at 6:47 PM, Guenter Roeck  wrote:
> On 05/07/2015 03:09 PM, Doug Anderson wrote:
>>
>> Right now the dw_wdt uses a spinlock to protect dw_wdt_open().  The
>> problem is that while holding the spinlock we call:
>> -> dw_wdt_set_top()
>> -> dw_wdt_top_in_seconds()
>>-> clk_get_rate()
>>   -> clk_prepare_lock()
>>  -> mutex_lock()
>>
>> Locking a mutex while holding a spinlock is not allowed and leads to
>> warnings like "BUG: spinlock wrong CPU on CPU#1", among other
>> problems.
>>
>> There's no reason to use a spinlock, so switch to a mutex.
>>
>> Signed-off-by: Doug Anderson 
>
>
> Reviewed-by: Guenter Roeck 

As Dmitry pointed out in another context, an even better fix is to
just remove the spinlock altogether.  I'll send up v2...

-Doug


Re: [PATCH v3 4/6] cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLE

2015-05-07 Thread Preeti U Murthy
On 05/08/2015 02:29 AM, Rafael J. Wysocki wrote:
> On Thursday, May 07, 2015 05:49:22 PM Preeti U Murthy wrote:
>> On 05/05/2015 02:11 PM, Preeti U Murthy wrote:
>>> On 05/05/2015 12:03 PM, Shilpasri G Bhat wrote:
>>>> Hi Preeti,
>>>>
>>>> On 05/05/2015 09:30 AM, Preeti U Murthy wrote:
>>>>> Hi Shilpa,
>>>>>
>>>>> On 05/04/2015 02:24 PM, Shilpasri G Bhat wrote:
>>>>>> Re-evaluate the chip's throttled state on receiving OCC_THROTTLE
>>>>>> notification by executing *throttle_check() on any one of the cpu on
>>>>>> the chip. This is a sanity check to verify if we were indeed
>>>>>> throttled/unthrottled after receiving OCC_THROTTLE notification.
>>>>>>
>>>>>> We cannot call *throttle_check() directly from the notification
>>>>>> handler because we could be handling chip1's notification in chip2. So
>>>>>> initiate an smp_call to execute *throttle_check(). We are irq-disabled
>>>>>> in the notification handler, so use a worker thread to smp_call
>>>>>> throttle_check() on any of the cpu in the chipmask.
>>>>>
>>>>> I see that the first patch takes care of reporting *per-chip* throttling
>>>>> for pmax capping condition. But where are we taking care of reporting
>>>>> "pstate set to safe" and "freq control disabled" scenarios per-chip ?
>>>>>
>>>>
>>>> IMO let us not have "psafe" and "freq control disabled" states managed
>>>> per-chip. Because when the above two conditions occur it is likely to
>>>> happen across all chips during an OCC reset cycle. So I am setting
>>>> 'throttled' to false on OCC_ACTIVE and re-verifying if it actually is
>>>> the case by invoking *throttle_check().
>>>
>>> Alright like I pointed in the previous reply, a comment to indicate that
>>> psafe and freq control disabled conditions will fail when occ is
>>> inactive and that all chips face the consequence of this will help.
>>
>> From your explanation on the thread of the first patch of this series,
>> this will not be required.
>>
>> So,
>> Reviewed-by: Preeti U Murthy 
> 
> OK, so is the whole series reviewed now?

Yes the whole series has been reviewed.

Regards
Preeti U Murthy


> 
> 



Re: [PATCH 4/6] dmaengine: sun6i: Add support for Allwinner H3 (sun8i) variant

2015-05-07 Thread Vinod Koul
On Wed, May 06, 2015 at 12:13:42PM +0200, Maxime Ripard wrote:
> On Wed, May 06, 2015 at 11:31:31AM +0200, Jens Kuske wrote:
> > The H3 SoC has the same dma engine as the A31 (sun6i), with a
> > reduced amount of endpoints and physical channels. Add the proper
> > config data and compatible string to support it.
> > 
> > Signed-off-by: Jens Kuske 

This looks fine to me; I think it can be merged now. Do you want the
maintainers to pick up the patches for their own subsystems, or should they
be merged together? I don't see any dependency, though.

-- 
~Vinod


Re: [PATCH v5 3/8] dmaengine: Add driver for TI DMA crossbar on DRA7x

2015-05-07 Thread Vinod Koul
On Thu, May 07, 2015 at 12:48:34PM +0300, Peter Ujfalusi wrote:
> On 05/04/2015 08:38 AM, Vinod Koul wrote:
> > On Thu, Apr 09, 2015 at 12:35:49PM +0300, Peter Ujfalusi wrote:
> >> +int omap_dmaxbar_init(void)
> >> +{
> >> +  return platform_driver_register(&ti_dma_xbar_driver);
> >> +}
> >> +arch_initcall(omap_dmaxbar_init);
> > All looks fine except this bit. I think I pointed this out last time as
> > well, though I don't recall your answer. We would rather depend on deferred
> > probe and not rely on module ordering.
> 
> Can not find my previous response in my mailbox anymore thanks to Thunderbird:
> it corrupted all of my local mbox files when I did a backup from the server :(
> 
> I don't think the deferred probing is working with dmaengine since we return
> NULL in any case when the channel can not be requested for whatever reason.
> The request calls are eating up the error code (if any) which is coming when
> the channel is requested. With the exception of
> dma_request_slave_channel_reason(), which will return the reason for the
> failure, but most drivers are not using this.
Yes, that was the reason for this API, and I was expecting people to start
using it and eliminate the init-level dependency.

> 
> There is also a fallback in dma_request_slave_channel_compat() if the channel
> can not be requested via of/acpi it will try to get the channel via legacy
> mode also.
> 
> Should all drivers using DMA via dmaengine return -EPROBE_DEFER from
> their probe if they cannot get the DMA channel? Some drivers use the
> existence/non-existence of the DMA resource as a means to decide whether to
> use DMA or PIO mode...
Yes we should do that.
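
As a rough illustration of the deferred-probe pattern discussed here, the following userspace sketch models a probe routine that propagates a "retry later" error instead of collapsing every failure into NULL. Everything in it is hypothetical (`fake_request_channel`, `fake_probe`, the `provider_ready` flag); only the value 517 for EPROBE_DEFER matches the kernel's include/linux/errno.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517    /* same value as in the kernel's errno.h */
#endif
#define SKETCH_ENODEV 19    /* stands in for ENODEV in this sketch */

struct fake_chan {
    int id;
};

/*
 * Hypothetical channel request: reports *why* it failed through `err`
 * rather than returning a bare NULL for every failure.
 */
static struct fake_chan *fake_request_channel(bool provider_ready,
                                              bool channel_exists, int *err)
{
    static struct fake_chan chan = { .id = 1 };

    if (!provider_ready) {
        *err = -EPROBE_DEFER;   /* DMA provider not registered yet */
        return NULL;
    }
    if (!channel_exists) {
        *err = -SKETCH_ENODEV;  /* provider is up, but has no such channel */
        return NULL;
    }
    *err = 0;
    return &chan;
}

/*
 * Hypothetical probe: defer while the provider is missing, and fall back
 * to PIO (return 0 with *use_dma == false) only when no channel exists.
 */
static int fake_probe(bool provider_ready, bool channel_exists, bool *use_dma)
{
    int err;
    struct fake_chan *chan;

    assert(use_dma != NULL);
    chan = fake_request_channel(provider_ready, channel_exists, &err);
    *use_dma = (chan != NULL);

    if (err == -EPROBE_DEFER)
        return err;     /* ask the driver core to retry the probe later */
    return 0;           /* proceed in DMA or PIO mode */
}
```

The point of the sketch is the distinction between "not there yet" and "not there at all": a driver that decides between DMA and PIO from a NULL return alone cannot tell the two apart.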

> 
> If the crossbar is not in the same initcall level we can have bad race
> conditions also:
> omap-dma can handle up to 127 DMA requests.
> omap-dma is loaded but the crossbar driver is not.
> 
> A driver requests DMA for crossbar line 135:
> We will have failure from of_dma_request_slave_channel() since the CB driver
> is not yet loaded (returning with -EPROBE_DEFER), then the legacy call will
> try to get the channel from the loaded DMA driver, but that is going to fail
> as well (135 is not valid for omap-dma).
> 
> Another driver would request DMA for crossbar line 100:
> The legacy call will actually find it a valid request and get the channel from
> omap-dma driver, but this will not work in reality: the crossbar also needs to
> be configured to route the signal to the correct line.
> This driver would think it has valid DMA, but in fact it has non-working DMA.
I think you can start with the conversion of the omap driver; thus you don't
have a dependency on others.

Now as far as this series is concerned, the rest of it looks good, so I am
willing to merge it if you plan to work on deferred probe :) I think it's a
fair bargain!

-- 
~Vinod


Re: [PATCH RFC] vfs: add a O_NOMTIME flag

2015-05-07 Thread Andy Lutomirski
On May 8, 2015 8:11 AM, "Dave Chinner"  wrote:
>
> On Thu, May 07, 2015 at 10:20:53AM -0700, Zach Brown wrote:
> > On Thu, May 07, 2015 at 10:26:17AM +1000, Dave Chinner wrote:
> > > On Wed, May 06, 2015 at 03:00:12PM -0700, Zach Brown wrote:
> > > > Add the O_NOMTIME flag which prevents mtime from being updated which can
> > > > greatly reduce the IO overhead of writes to allocated and initialized
> > > > regions of files.
> > >
> > > Hmmm. How do backup programs now work out if the file has changed
> > > and hence needs copying again? ie. applications using this will
> > > break other critical infrastructure in subtle ways.
> >
> > By using backup infrastructure that doesn't use cmtime.  Like btrfs
> > send/recv.  Or application level backups that know how to do
> > incrementals from metadata in giant database files, say, without
> > walking, comparing, and copying the entire thing.
>
> "Use magical thing that doesn't exist"? Really?
>
> e.g. you can't do incremental backups with tools like xfsdump if
> mtime is not being updated.  The last thing an admin wants when
> doing disaster recovery is to find out that the app started using
> O_NOMTIME as a result of the upgrade they did 6 months ago. Hence
> the last 6 months of production data isn't in the backups despite
> the backup procedure having been extensively tested and verified
> when it was first put in place.
>
> > > > The criteria for using O_NOMTIME is the same as for using O_NOATIME:
> > > > owning the file or having the CAP_FOWNER capability.  If we're not
> > > > comfortable allowing owners to prevent mtime/ctime updates then we
> > > > should add a tunable to allow O_NOMTIME.  Maybe a mount option?
> > >
> > > I dislike "turn off safety for performance" options because Joe
> > > SpeedRacer will always select performance over safety.
> >
> > Well, for ceph there's no safety concern.  They never use cmtime in
> > these files.
>
> Understood.
>
> > So are you suggesting not implementing this
>
> No.
>
> > Or are we talking about adding some speed bumps
> > that ceph can flip on that might give Joe Speedracer pause?
>
> Yes, but not just Joe Speedracer - if it can be turned on silently
> by apps then it's a great big landmine that most users and sysadmins
> will not know about until it is too late.

What about programs like tar that explicitly override mtime?  No admin
buy-in is required for that.  Admittedly, that doesn't affect ctime,
nor is it as likely to bite unexpectedly as a nomtime flag.

I think it would be reasonably safe if a mount option had to be set to
allow O_NOCMTIME or such.
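
A minimal sketch of the policy being discussed, under stated assumptions: the flag is honoured only for the file owner or a CAP_FOWNER holder (the O_NOATIME rule from the patch description), and in addition only when a mount option permits it (the suggestion above). All names are hypothetical; this is not the proposed kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs to the decision; none of these names come from the patch. */
struct nomtime_ctx {
    unsigned int file_uid;      /* owner of the file */
    unsigned int caller_uid;    /* uid of the process opening the file */
    bool has_cap_fowner;        /* caller holds CAP_FOWNER */
    bool mount_allows_nomtime;  /* hypothetical mount option is set */
};

/* Returns true when the open may skip mtime updates under this policy. */
static bool may_use_nomtime(const struct nomtime_ctx *ctx)
{
    assert(ctx != NULL);

    /* The admin has to opt in per mount before anything else is considered. */
    if (!ctx->mount_allows_nomtime)
        return false;

    /* Same criteria as O_NOATIME: owner or CAP_FOWNER. */
    return ctx->file_uid == ctx->caller_uid || ctx->has_cap_fowner;
}
```

Putting the mount-option test first means that no application can enable the behaviour silently, which is the concern raised about backups.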

--Andy

> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com


Re: Avoiding unnecessary jump relocations in gas?

2015-05-07 Thread Andy Lutomirski
On Thu, May 7, 2015 at 9:21 AM, H.J. Lu  wrote:
> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich  wrote:
> On 07.05.15 at 08:02,  wrote:
>>> AFAICT gas will produce relocations for jumps to global labels in the
>>> same file.  This doesn't seem directly harmful to me, except that, on
>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>
>>> This seems especially unfortunate, since even hidden and protected
>>> symbols have this problem.
>>>
>>> Given that many users don't want interposition support (especially the
>>> kernel and anyone using .hidden or .protected), it would be nice to
>>> have a command-line option to turn this off and probably also to turn
>>> it off by default for hidden and protected symbols.  Can gas do this?
>>
>> I've been running with the below changes (taken off of a bigger set
>> of changes, so the line numbers may look a little odd) for the last
>> couple of years. I never tried to submit this change because so far
>> I couldn't find the time to check whether this would have any
>> unwanted side effects on cases I don't normally use.
>>
>
> This is the patch I checked in.
>
> Thanks.
>
> --
> H.J.
> ---
> Branches to global non-weak symbols defined in the same segment with
> non-default visibility can be optimized the same way as branches to
> local symbols.

Would it make sense to also add a command line option along the lines
of gcc's -fno-semantic-interposition or some way to override the
default visibility?  AFAICS this patch helps but only if asm code gets
liberally sprinkled with .hidden or .protected directives.
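
For context on the size difference mentioned at the start of the thread, a small sketch: in 32-bit and 64-bit x86 code, a direct jmp with an 8-bit displacement (opcode EB) takes two bytes, while the 32-bit form (opcode E9) takes five. The helper below is hypothetical and only models that choice; `needs_reloc` stands for the case where the assembler has to leave the target to the linker and so cannot pick the short form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the encoding choice for a direct jmp.
 * `displacement` is measured from the end of the jump instruction.
 */
static unsigned int jmp_encoding_size(int64_t displacement, bool needs_reloc)
{
    assert(displacement >= INT32_MIN && displacement <= INT32_MAX);

    /* The short form is only usable when the distance is known and small. */
    if (!needs_reloc && displacement >= INT8_MIN && displacement <= INT8_MAX)
        return 2;   /* EB rel8 */

    return 5;       /* E9 rel32 */
}
```

A jump 10 bytes ahead costs two bytes when the target is resolved at assembly time and five when a relocation is emitted, which is the overhead being discussed.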

--Andy


[PATCH] NFS: report more appropriate block size for directories.

2015-05-07 Thread NeilBrown

In glibc 2.21 (and several previous versions), a call to opendir() will
result in a 32K (BUFSIZ*4) buffer being allocated and passed to
getdents.

However a call to fdopendir() results in an 'fstat' request to
determine block size and a matching buffer allocated for subsequent
use with getdents.  This will typically be 1M.

The first getdents call on an NFS directory will always use
READDIR_PLUS (or NFSv4 equivalent) if available.  Subsequent getdents
calls only use this more expensive version if some 'stat' requests are
made between the getdents calls.

For this reason it is good to keep at least that first getdents call
relatively short.  When fdopendir() and readdir() are used on a large
directory, it takes approximately 32 times as long to complete as
using "opendir".  Current versions of 'find' use fdopendir() and
demonstrate this slowness.

'stat' on a directory currently returns the 'wsize'.  This number has
no meaning on directories.
Actual READDIR requests are limited to ->dtsize, which itself is
capped at 4 pages, coincidentally the same as BUFSIZ*4.
So this is a meaningful number to use as the blocksize on directories,
and has the effect of making 'find' on large directories go a lot
faster.

Signed-off-by: NeilBrown 

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 96f2d55781fb..f8aebf59383f 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -678,6 +678,8 @@ int nfs_getattr(struct vfsmount *mnt, struct dentry 
*dentry, struct kstat *stat)
if (!err) {
generic_fillattr(inode, stat);
stat->ino = nfs_compat_user_ino64(NFS_FILEID(inode));
+   if (S_ISDIR(inode->i_mode))
+   stat->blksize = NFS_SERVER(inode)->dtsize;
}
 out:
trace_nfs_getattr_exit(inode, err);
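
The arithmetic behind the "approximately 32 times" figure in the description can be sketched as follows. The sizes are taken from the commit message (a 32K buffer for opendir(), a typical 1M block size seen by fdopendir()); the function names are hypothetical and the sketch does not reproduce glibc's actual allocation logic.

```c
#include <assert.h>
#include <stddef.h>

#define OPENDIR_BUF (32u * 1024u)   /* 32K, i.e. BUFSIZ*4 per the description */

/*
 * Hypothetical model of the buffer used for the first getdents call.
 * `st_blksize` is what fstat() reported for the directory, or 0 when the
 * stream was opened without consulting fstat().
 */
static size_t first_getdents_buf(size_t st_blksize)
{
    return st_blksize ? st_blksize : OPENDIR_BUF;
}

/* How many times larger one buffer is than another. */
static size_t buf_ratio(size_t big, size_t small)
{
    assert(small != 0);
    return big / small;
}
```

With a reported block size of 1M the first request is 1048576 / 32768 = 32 times larger than the opendir() default. Once stat reports a 32K block size for directories, as the description proposes, the two paths ask for the same amount.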




Re: [RFC v1 02/11] genirq: Move field 'node' from struct irq_data into struct irq_common_data

2015-05-07 Thread Yun Wu (Abel)
On 2015/5/8 10:29, Yun Wu (Abel) wrote:

> Hi Gerry,
> On 2015/5/4 11:15, Jiang Liu wrote:
> 
>> NUMA node information is per-irq instead of per-irqchip, so move it into
>> struct irq_common_data.
>>
>> Signed-off-by: Jiang Liu 
>> ---
>>  arch/sh/kernel/irq.c  |2 +-
>>  arch/x86/kernel/apic/vector.c |8 
>>  arch/x86/platform/uv/uv_irq.c |2 +-
>>  include/linux/irq.h   |   20 ++--
>>  kernel/irq/internals.h|5 +
>>  kernel/irq/irqdesc.c  |   10 ++
>>  kernel/irq/irqdomain.c|4 ++--
>>  kernel/irq/manage.c   |2 +-
>>  kernel/irq/proc.c |2 +-
>>  9 files changed, 35 insertions(+), 20 deletions(-)
>>
>> diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c
>> index eb10ff84015c..8dc677cc136b 100644
>> --- a/arch/sh/kernel/irq.c
>> +++ b/arch/sh/kernel/irq.c
>> @@ -227,7 +227,7 @@ void migrate_irqs(void)
>>  for_each_active_irq(irq) {
>>  struct irq_data *data = irq_get_irq_data(irq);
>>  
>> -if (data->node == cpu) {
>> +if (irq_data_get_node(data) == cpu) {
>>  unsigned int newcpu = cpumask_any_and(data->affinity,
>>cpu_online_mask);
>>  if (newcpu >= nr_cpu_ids) {
>> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
>> index 96ce5068a926..983bea2a09ce 100644
>> --- a/arch/x86/kernel/apic/vector.c
>> +++ b/arch/x86/kernel/apic/vector.c
>> @@ -345,7 +345,7 @@ static int x86_vector_alloc_irqs(struct irq_domain 
>> *domain, unsigned int virq,
>>  struct irq_alloc_info *info = arg;
>>  struct apic_chip_data *data;
>>  struct irq_data *irq_data;
>> -int i, err;
>> +int i, err, node;
>>  
>>  if (disable_apic)
>>  return -ENXIO;
>> @@ -357,12 +357,13 @@ static int x86_vector_alloc_irqs(struct irq_domain 
>> *domain, unsigned int virq,
>>  for (i = 0; i < nr_irqs; i++) {
>>  irq_data = irq_domain_get_irq_data(domain, virq + i);
>>  BUG_ON(!irq_data);
>> +node = irq_data_get_node(irq_data);
>>  #ifdef  CONFIG_X86_IO_APIC
>>  if (virq + i < nr_legacy_irqs() && legacy_irq_data[virq + i])
>>  data = legacy_irq_data[virq + i];
>>  else
>>  #endif
>> -data = alloc_apic_chip_data(irq_data->node);
>> +data = alloc_apic_chip_data(node);
>>  if (!data) {
>>  err = -ENOMEM;
>>  goto error;
>> @@ -371,8 +372,7 @@ static int x86_vector_alloc_irqs(struct irq_domain 
>> *domain, unsigned int virq,
>>  irq_data->chip = &lapic_controller;
>>  irq_data->chip_data = data;
>>  irq_data->hwirq = virq + i;
>> -err = assign_irq_vector_policy(virq, irq_data->node, data,
>> -   info);
>> +err = assign_irq_vector_policy(virq, node, data, info);
>>  if (err)
>>  goto error;
>>  }
>> diff --git a/arch/x86/platform/uv/uv_irq.c b/arch/x86/platform/uv/uv_irq.c
>> index cdf86cd3fd97..bc992b7b041f 100644
>> --- a/arch/x86/platform/uv/uv_irq.c
>> +++ b/arch/x86/platform/uv/uv_irq.c
>> @@ -89,7 +89,7 @@ static int uv_domain_alloc(struct irq_domain *domain, 
>> unsigned int virq,
>>  return -EINVAL;
>>  
>>  chip_data = kmalloc_node(sizeof(*chip_data), GFP_KERNEL,
>> - irq_data->node);
>> + irq_data_get_node(irq_data));
>>  if (!chip_data)
>>  return -ENOMEM;
>>  
>> diff --git a/include/linux/irq.h b/include/linux/irq.h
>> index 3b6e0def7f5c..3f999a0af713 100644
>> --- a/include/linux/irq.h
>> +++ b/include/linux/irq.h
>> @@ -129,9 +129,13 @@ struct irq_domain;
>>   * struct irq_common_data - per irq data shared by all irqchips
>>   * @state_use_accessors: status information for irq chip functions.
>>   *  Use accessor functions to deal with it
>> + * @node:   node index useful for balancing
>>   */
>>  struct irq_common_data {
>>  unsigned intstate_use_accessors;
>> +#ifdef CONFIG_SMP
> 
> Would CONFIG_NUMA be a little more appropriate?
> Or even let @node be always compiled?

Sorry for commenting before reading your next patch, in which you replaced
CONFIG_SMP with CONFIG_NUMA.
And putting @node under CONFIG_NUMA makes sense, because it's useless
in a non-NUMA scenario.
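
The accessor pattern under discussion can be sketched in plain C as below. This is a simplified userspace model, not the kernel definition: the struct is reduced to the two fields shown in the quoted hunk, the accessor name `common_data_get_node` is hypothetical, and returning 0 when CONFIG_NUMA is unset is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced model of the struct from the quoted hunk. */
struct irq_common_data {
    unsigned int state_use_accessors;
#ifdef CONFIG_NUMA
    unsigned int node;      /* only present on NUMA builds */
#endif
};

/*
 * Hypothetical accessor: callers never touch ->node directly, so the
 * field can be compiled out without changing any call site.
 */
static inline int common_data_get_node(const struct irq_common_data *d)
{
    assert(d != NULL);
#ifdef CONFIG_NUMA
    return (int)d->node;
#else
    return 0;               /* assumption: a single node when NUMA is off */
#endif
}
```

Because every user goes through the accessor, the choice of config symbol guarding the field stays local to the struct and this one function.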

Thanks,
Abel

> 
> 
>> +unsigned intnode;
>> +#endif
>>  };
>>  
>>  /**
>> @@ -139,7 +143,6 @@ struct irq_common_data {
>>   * @mask:   precomputed bitmask for accessing the chip registers
>>   * @irq:interrupt number
>>   * @hwirq:  hardware interrupt number, local to the interrupt domain
>> - * @node:   node index useful for balancing
>>   * @common: point to

Re: [PATCH] tools/thermal: tmon: fixed the 'make install' command

2015-05-07 Thread Zhang Rui
On Thu, 2015-05-07 at 13:31 -0700, Jacob Pan wrote:
> On Fri,  8 May 2015 03:39:04 +0930
> Anand Moon  wrote:
> 
> > To install tmon we issue "make install", which produces the error below.
> > 
> looks good, there is no config file for now.
> 
> Thanks for the fix.
> Acked-by: Jacob Pan 
> 
patch applied. Thanks!

BTW, next time you send a thermal-related change, please cc
linux...@vger.kernel.org, or else the change may get lost, because we usually
rely on https://patchwork.kernel.org/project/linux-pm/list/ to review
and scrub the patches. :)

thanks,
rui



> > root@odroidxu3:/usr/src/odroidxu3-4.y-testing/tools/thermal/tmon#
> > make install mkdir -p /usr/bin
> > install -m 755 -p "tmon" "/usr/bin/tmon"
> > mkdir -p /
> > install -m 644 -p "" "/"
> > install: cannot stat ‘’: No such file or directory
> > make: [install] Error 1 (ignored)
> > 
> > Signed-off-by: Anand Moon 
> > ---
> >  tools/thermal/tmon/Makefile | 8 
> >  1 file changed, 8 deletions(-)
> > 
> > diff --git a/tools/thermal/tmon/Makefile b/tools/thermal/tmon/Makefile
> > index 0788621..2e83dd3 100644
> > --- a/tools/thermal/tmon/Makefile
> > +++ b/tools/thermal/tmon/Makefile
> > @@ -12,10 +12,6 @@ TARGET=tmon
> >  INSTALL_PROGRAM=install -m 755 -p
> >  DEL_FILE=rm -f
> >  
> > -INSTALL_CONFIGFILE=install -m 644 -p
> > -CONFIG_FILE=
> > -CONFIG_PATH=
> > -
> >  # Static builds might require -ltinfo, for instance
> >  ifneq ($(findstring -static, $(LDFLAGS)),)
> >  STATIC := --static
> > @@ -38,13 +34,9 @@ valgrind: tmon
> >  install:
> > - mkdir -p $(INSTALL_ROOT)/$(BINDIR)
> > - $(INSTALL_PROGRAM) "$(TARGET)"
> > "$(INSTALL_ROOT)/$(BINDIR)/$(TARGET)"
> > -   - mkdir -p $(INSTALL_ROOT)/$(CONFIG_PATH)
> > -   - $(INSTALL_CONFIGFILE) "$(CONFIG_FILE)"
> > "$(INSTALL_ROOT)/$(CONFIG_PATH)" 
> >  uninstall:
> > $(DEL_FILE) "$(INSTALL_ROOT)/$(BINDIR)/$(TARGET)"
> > -   $(CONFIG_FILE) "$(CONFIG_PATH)"
> > -
> >  
> >  clean:
> > find . -name "*.o" | xargs $(DEL_FILE)
> 
> [Jacob Pan]




[PATCH v8 3/3] leds: Add ktd2692 flash LED driver

2015-05-07 Thread Ingi Kim
This patch adds a driver to support the ktd2692 flash LEDs.
The ktd2692 controls the flash current through the ExpressWire interface.

Signed-off-by: Ingi Kim 
Acked-by: Seung-Woo Kim 
Reviewed-by: Varka Bhadram 
---
 drivers/leds/Kconfig        |   9 +
 drivers/leds/Makefile       |   1 +
 drivers/leds/leds-ktd2692.c | 443
 3 files changed, 453 insertions(+)
 create mode 100644 drivers/leds/leds-ktd2692.c

diff --git a/drivers/leds/Kconfig b/drivers/leds/Kconfig
index 51059bb..bfbdbd1 100644
--- a/drivers/leds/Kconfig
+++ b/drivers/leds/Kconfig
@@ -505,6 +505,15 @@ config LEDS_MENF21BMC
  This driver can also be built as a module. If so the module
  will be called leds-menf21bmc.
 
+config LEDS_KTD2692
+   tristate "KTD2692 LED flash support"
+   depends on LEDS_CLASS_FLASH && GPIOLIB && OF
+   help
+ This option enables support for KTD2692 LED flash connected
+ through ExpressWire interface.
+
+ Say Y to enable this driver.
+
 comment "LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)"
 
 config LEDS_BLINKM
diff --git a/drivers/leds/Makefile b/drivers/leds/Makefile
index a739ae2..ed5ed79 100644
--- a/drivers/leds/Makefile
+++ b/drivers/leds/Makefile
@@ -59,6 +59,7 @@ obj-$(CONFIG_LEDS_BLINKM) += leds-blinkm.o
 obj-$(CONFIG_LEDS_SYSCON)  += leds-syscon.o
 obj-$(CONFIG_LEDS_VERSATILE)   += leds-versatile.o
 obj-$(CONFIG_LEDS_MENF21BMC)   += leds-menf21bmc.o
+obj-$(CONFIG_LEDS_KTD2692) += leds-ktd2692.o
 
 # LED SPI Drivers
 obj-$(CONFIG_LEDS_DAC124S085)  += leds-dac124s085.o
diff --git a/drivers/leds/leds-ktd2692.c b/drivers/leds/leds-ktd2692.c
new file mode 100644
index 000..9d878a4
--- /dev/null
+++ b/drivers/leds/leds-ktd2692.c
@@ -0,0 +1,443 @@
+/*
+ * LED driver : leds-ktd2692.c
+ *
+ * Copyright (C) 2015 Samsung Electronics
+ * Ingi Kim 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Values related to the movie mode */
+#define KTD2692_MOVIE_MODE_CURRENT_LEVELS  16
+#define KTD2692_MM_TO_FL_RATIO(x)  ((x) / 3)
+#define KTD2962_MM_MIN_CURR_THRESHOLD_SCALE 8
+
+/* Values related to the flash mode */
+#define KTD2692_FLASH_MODE_TIMEOUT_LEVELS  8
+#define KTD2692_FLASH_MODE_TIMEOUT_DISABLE 0
+#define KTD2692_FLASH_MODE_CURR_PERCENT(x) (((x) * 16) / 100)
+
+/* Macro for getting offset of flash timeout */
+#define GET_TIMEOUT_OFFSET(timeout, step)  ((timeout) / (step))
+
+/* Base register address */
+#define KTD2692_REG_LVP_BASE   0x00
+#define KTD2692_REG_FLASH_TIMEOUT_BASE 0x20
+#define KTD2692_REG_MM_MIN_CURR_THRESHOLD_BASE 0x40
+#define KTD2692_REG_MOVIE_CURRENT_BASE 0x60
+#define KTD2692_REG_FLASH_CURRENT_BASE 0x80
+#define KTD2692_REG_MODE_BASE  0xA0
+
+/* Set bit coding time for expresswire interface */
+#define KTD2692_TIME_RESET_US            700
+#define KTD2692_TIME_DATA_START_TIME_US  10
+#define KTD2692_TIME_HIGH_END_OF_DATA_US 350
+#define KTD2692_TIME_LOW_END_OF_DATA_US  10
+#define KTD2692_TIME_SHORT_BITSET_US     4
+#define KTD2692_TIME_LONG_BITSET_US      12
+
+/* KTD2692 default length of name */
+#define KTD2692_NAME_LENGTH 20
+
+enum ktd2692_bitset {
+   KTD2692_LOW = 0,
+   KTD2692_HIGH,
+};
+
+/* Movie / Flash Mode Control */
+enum ktd2692_led_mode {
+   KTD2692_MODE_DISABLE = 0,   /* default */
+   KTD2692_MODE_MOVIE,
+   KTD2692_MODE_FLASH,
+};
+
+struct ktd2692_led_config_data {
+   /* maximum LED current in movie mode */
+   u32 movie_max_microamp;
+   /* maximum LED current in flash mode */
+   u32 flash_max_microamp;
+   /* maximum flash timeout */
+   u32 flash_max_timeout;
+   /* max LED brightness level */
+   enum led_brightness max_brightness;
+};
+
+struct ktd2692_context {
+   /* Related LED Flash class device */
+   struct led_classdev_flash fled_cdev;
+
+   /* secures access to the device */
+   struct mutex lock;
+   struct regulator *regulator;
+   struct work_struct work_brightness_set;
+
+   struct gpio_desc *aux_gpio;
+   struct gpio_desc *ctrl_gpio;
+
+   enum ktd2692_led_mode mode;
+   enum led_brightness torch_brightness;
+};
+
+static struct ktd2692_context *fled_cdev_to_led(
+   struct led_classdev_flash *fled_cdev)
+{
+   return container_of(fled_cdev, struct ktd2692_context, fled_cdev);
+}
+
+static void ktd2692_expresswire_start(struct ktd2692_context *led)
+{
+   gpiod_direction_output(led->ctrl_gpio, KTD2692_HIGH);
+   udelay(KTD269

[PATCH v8 1/3] of: Add vendor prefix for Kinetic technologies

2015-05-07 Thread Ingi Kim
This patch adds a vendor prefix for Kinetic Technologies.

Signed-off-by: Ingi Kim 
Acked-by: Rob Herring 
Acked-by: Seung-Woo Kim 
Reviewed-by: Varka Bhadram 
---
 Documentation/devicetree/bindings/vendor-prefixes.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index fae26d0..90a4be1 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -100,6 +100,7 @@ isee   ISEE 2007 S.L.
 isil   Intersil
 karo   Ka-Ro electronics GmbH
 keymile Keymile GmbH
+kinetic Kinetic Technologies
 lacie  LaCie
 lantiq Lantiq Semiconductor
 lenovo Lenovo Group Ltd.
-- 
2.0.5



[PATCH v8 0/3] Add ktd2692 Flash LED driver using LED Flash class

2015-05-07 Thread Ingi Kim
This patch adds ktd2692 Flash LED driver with LED Flash class

Change in v8:
- Add led-max-microamp mandatory property for LEDs current
  based on Jacek's patch [1]
  patch [1] [PATCH v6] DT: leds: Improve description of
  flash LEDs related properties
- Fix brightness calculation by max-microamp properties

Change in v7:
- Add flash-max-microamp property for Flash LED
- Change gpio-legacy interface to gpio consumer interface

Change in v6 resend:
- Adjust indent using checkpatch.pl script with strict option

Change in v6:
- Change goto label to if-else
- Change DT binding style for LED device binding

Change in v5:
- Clean up the code
- Fix help message of Kconfig
- Fix issue related with regulator and mutex usage
- Remove tab spaces in bindings

Change in v4:
- Clean up the code
- Modify binding documentation of ktd2692

Change in v3:
- Clean up the code
- Add aux gpio pin to control Flash LED

Change in v2:
- Introduction of LED Flash class as Jacek's comment
- Supplement of binding documentation
- Rename gpio control pin and remove unused pin
- Add regulator for the Flash LED

Ingi Kim (3):
  of: Add vendor prefix for Kinetic technologies
  leds: ktd2692: add device tree bindings for ktd2692
  leds: Add ktd2692 flash LED driver

 .../devicetree/bindings/leds/leds-ktd2692.txt  |  50 +++
 .../devicetree/bindings/vendor-prefixes.txt    |   1 +
 drivers/leds/Kconfig                           |   9 +
 drivers/leds/Makefile                          |   1 +
 drivers/leds/leds-ktd2692.c                    | 443 +
 5 files changed, 504 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/leds/leds-ktd2692.txt
 create mode 100644 drivers/leds/leds-ktd2692.c

-- 
2.0.5



[PATCH v8 2/3] leds: ktd2692: add device tree bindings for ktd2692

2015-05-07 Thread Ingi Kim
This patch adds the device tree bindings for ktd2692 flash LEDs.
Add Optional properties of child node for Flash LED

Signed-off-by: Ingi Kim 
Acked-by: Seung-Woo Kim 
Reviewed-by: Varka Bhadram 
---
 .../devicetree/bindings/leds/leds-ktd2692.txt  | 50 ++
 1 file changed, 50 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/leds/leds-ktd2692.txt

diff --git a/Documentation/devicetree/bindings/leds/leds-ktd2692.txt b/Documentation/devicetree/bindings/leds/leds-ktd2692.txt
new file mode 100644
index 000..cf45492
--- /dev/null
+++ b/Documentation/devicetree/bindings/leds/leds-ktd2692.txt
@@ -0,0 +1,50 @@
+* Kinetic Technologies - KTD2692 Flash LED Driver
+
+KTD2692 is the ideal power solution for high-power flash LEDs.
+It uses ExpressWire single-wire programming for maximum flexibility.
+
+The ExpressWire interface through CTRL pin can control LED on/off and
+enable/disable the IC, Movie(max 1/3 of Flash current) / Flash mode current,
+Flash timeout, LVP(low voltage protection).
+
+Also, When the AUX pin is pulled high while CTRL pin is high,
+LED current will be ramped up to the flash-mode current level.
+
+Required properties:
+- compatible: "kinetic,ktd2692"
+- ctrl-gpio : GPIO pin used to control the CTRL pin
+- aux-gpio : GPIO pin used to control the AUX pin
+
+Optional properties:
+- vin-supply : "vin" LED supply (2.7V to 5.5V)
+  See Documentation/devicetree/bindings/regulator/regulator.txt
+
+A discrete LED element connected to the device must be represented by a child
+node - see Documentation/devicetree/bindings/leds/common.txt.
+
+Required properties for flash LED child nodes:
+  See Documentation/devicetree/bindings/leds/common.txt
+- led-max-microamp : Minimum Threshold for Timer protection
+  is defined internally (Maximum 300mA)
+- flash-max-microamp : Flash LED maximum current
+  Formula : I(mA) = 15000 / Rset
+- flash-max-timeout-us : Flash LED maximum timeout
+
+Optional properties for flash LED child nodes:
+- label : see Documentation/devicetree/bindings/leds/common.txt
+
+Example:
+
+ktd2692 {
+   compatible = "kinetic,ktd2692";
+   ctrl-gpio = <&gpc0 1 0>;
+   aux-gpio = <&gpc0 2 0>;
+   vin-supply = <&vbat>;
+
+   flash-led {
+   label = "ktd2692-flash";
+   led-max-microamp = <30>;
+   flash-max-microamp = <150>;
+   flash-max-timeout-us = <1835000>;
+   };
+};
-- 
2.0.5
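
The arithmetic stated in the binding text (I(mA) = 15000 / Rset, and movie
mode limited to 1/3 of the flash current) can be checked with a small
calculation. This is only a sketch, not driver code: the function names are
hypothetical, and the unit of Rset is not stated in the patch (kilo-ohms is
assumed here).

```c
/* Hypothetical helpers illustrating the binding's arithmetic only. */

/*
 * Flash current in mA from the binding's formula I(mA) = 15000 / Rset.
 * Assumption: rset is given in kilo-ohms; the patch does not state the unit.
 */
unsigned int ex_ktd2692_flash_ma(unsigned int rset)
{
	if (rset == 0)
		return 0; /* no valid resistor value, avoid dividing by zero */
	return 15000 / rset;
}

/* Movie mode is limited to at most 1/3 of the flash current. */
unsigned int ex_ktd2692_movie_max_ma(unsigned int flash_ma)
{
	return flash_ma / 3;
}
```

Under these assumptions an Rset of 10 gives a flash current of 1500 mA, and
one third of that is 500 mA.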



[PATCH] mtd: cmdlinepart: allow fill-up partition at any point

2015-05-07 Thread Ben Shelton
Currently, a fill-up partition (indicated by '-') must be the last
partition, and no other partitions can go after it.  Change the
cmdlinepart parsing code to allow a fill-up partition at any point.
This is useful, for example, if you want to reserve a partition at the
end of the flash where the bad block table will go.

Signed-off-by: Ben Shelton 
---
 drivers/mtd/cmdlinepart.c | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/drivers/mtd/cmdlinepart.c b/drivers/mtd/cmdlinepart.c
index c850300..2d0eda2 100644
--- a/drivers/mtd/cmdlinepart.c
+++ b/drivers/mtd/cmdlinepart.c
@@ -97,6 +97,7 @@ static struct mtd_partition * newpart(char *s,
  char **retptr,
  int *num_parts,
  int this_part,
+ int size_remaining_found,
  unsigned char **extra_mem_ptr,
  int extra_mem_size)
 {
@@ -110,9 +111,16 @@ static struct mtd_partition * newpart(char *s,
 
/* fetch the partition size */
if (*s == '-') {
+   if (size_remaining_found) {
+   printk(KERN_ERR ERRP
+  "more than one '-' partition specified\n");
+   return ERR_PTR(-EINVAL);
+   }
+
/* assign all remaining space to this partition */
size = SIZE_REMAINING;
s++;
+   size_remaining_found = 1;
} else {
size = memparse(s, &s);
if (size < PAGE_SIZE) {
@@ -169,13 +177,10 @@ static struct mtd_partition * newpart(char *s,
 
/* test if more partitions are following */
if (*s == ',') {
-   if (size == SIZE_REMAINING) {
-   printk(KERN_ERR ERRP "no partitions allowed after a fill-up partition\n");
-   return ERR_PTR(-EINVAL);
-   }
/* more partitions follow, parse them */
parts = newpart(s + 1, &s, num_parts, this_part + 1,
-   &extra_mem, extra_mem_size);
+   size_remaining_found, &extra_mem,
+   extra_mem_size);
if (IS_ERR(parts))
return parts;
} else {
@@ -252,6 +257,7 @@ static int mtdpart_setup_real(char *s)
&s, /* out: updated cmdline ptr */
&num_parts, /* out: number of parts */
0,  /* first partition */
+   0,  /* size_remaining not found */
(unsigned char**)&this_mtd, /* out: extra mem */
mtd_id_len + 1 + sizeof(*this_mtd) +
sizeof(void*)-1 /*alignment*/);
@@ -313,6 +319,7 @@ static int parse_cmdline_partitions(struct mtd_info *master,
int i, err;
struct cmdline_mtd_partition *part;
const char *mtd_id = master->name;
+   int sr_part_num = -1;
 
/* parse command line */
if (!cmdline_parsed) {
@@ -339,8 +346,10 @@ static int parse_cmdline_partitions(struct mtd_info *master,
else
offset = part->parts[i].offset;
 
-   if (part->parts[i].size == SIZE_REMAINING)
-   part->parts[i].size = master->size - offset;
+   if (part->parts[i].size == SIZE_REMAINING) {
+   sr_part_num = i;
+   continue;
+   }
 
if (offset + part->parts[i].size > master->size) {
printk(KERN_WARNING ERRP
@@ -361,6 +370,16 @@ static int parse_cmdline_partitions(struct mtd_info *master,
}
}
 
+   /* if a partition was marked as SIZE_REMAINING */
+   if (sr_part_num != -1) {
+   /* fix up the size of the SIZE_REMAINING partition */
+   part->parts[sr_part_num].size = master->size - offset;
+
+   /* fix up the offsets of the subsequent partitions */
+   for (i = (sr_part_num + 1); i < part->num_parts; i++)
+   part->parts[i].offset += part->parts[sr_part_num].size;
+   }
+
*pparts = kmemdup(part->parts, sizeof(*part->parts) * part->num_parts,
  GFP_KERNEL);
if (!*pparts)
-- 
2.4.0
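
The idea of a fill-up partition that is not the last one can be sketched in
plain userspace C. This is a simplified model and not the patch's code: all
names (ex_part, ex_layout, EX_SIZE_REMAINING) are hypothetical, and it
computes the layout in one pass over the sizes instead of fixing up offsets
after the loop as the patch does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for a '-' (fill-up) partition. */
#define EX_SIZE_REMAINING UINT64_MAX

struct ex_part {
	uint64_t size;   /* requested size, or EX_SIZE_REMAINING */
	uint64_t offset; /* filled in by ex_layout() */
};

/*
 * Lay partitions out back to back on a device of dev_size bytes.
 * At most one partition may be a fill-up; it receives whatever the
 * fixed-size partitions leave over. Returns 0 on success, -1 on error.
 */
int ex_layout(struct ex_part *parts, size_t n, uint64_t dev_size)
{
	uint64_t fixed_total = 0;
	uint64_t offset = 0;
	size_t i, fill = n; /* fill == n means "no fill-up partition" */

	for (i = 0; i < n; i++) {
		if (parts[i].size == EX_SIZE_REMAINING) {
			if (fill != n)
				return -1; /* more than one '-' partition */
			fill = i;
			continue;
		}
		if (parts[i].size > dev_size - fixed_total)
			return -1; /* fixed partitions exceed the device */
		fixed_total += parts[i].size;
	}

	/* The fill-up partition takes the space the others do not use. */
	if (fill != n)
		parts[fill].size = dev_size - fixed_total;

	/* Assign offsets now that every size is known. */
	for (i = 0; i < n; i++) {
		parts[i].offset = offset;
		offset += parts[i].size;
	}
	return 0;
}
```

For a 16 MiB device with a 1 MiB first partition, a fill-up partition in the
middle and a 256 KiB partition at the end, the last partition ends exactly at
the end of the device, which matches the bad-block-table use case in the
commit message.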



Re: [PATCH 01/13] KVM: MMU: fix for CR4.SMEP=1, CR0.WP=0?

2015-05-07 Thread Xiao Guangrong



On 04/30/2015 07:36 PM, Paolo Bonzini wrote:

> smep_andnot_wp is initialized in kvm_init_shadow_mmu and shadow pages
> should not be reused for different values of it.  Thus, it has to be
> added to the mask in kvm_mmu_pte_write.

Good catch!

Reviewed-by: Xiao Guangrong 


Re: [PATCH 2/2] watchdog: dw_wdt: keepalive the watchdog at write time

2015-05-07 Thread Jisheng Zhang
On Thu, 7 May 2015 15:09:24 -0700
Doug Anderson  wrote:

> If you've got code that does this in a tight loop
>   1. Open watchdog
>   2. Send 'expect close'
>   3. Close watchdog
> ...you'll eventually trigger a watchdog reset.  You can reproduce this
> by using daisydog (1) and running:
>   while true; do daisydog -c > /dev/null; done
> 
> The problem is that each time you write to the watchdog for 'expect
> close' it moves the timer .5 seconds out.  The timer thus never fires
> and never pats the watchdog for you.
> 
> 1: http://git.chromium.org/gitweb/?p=chromiumos/third_party/daisydog.git
> 
> Signed-off-by: Doug Anderson 
> ---
>  drivers/watchdog/dw_wdt.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/watchdog/dw_wdt.c b/drivers/watchdog/dw_wdt.c
> index 3fa2f19..ff5d734 100644
> --- a/drivers/watchdog/dw_wdt.c
> +++ b/drivers/watchdog/dw_wdt.c
> @@ -220,6 +220,7 @@ static ssize_t dw_wdt_write(struct file *filp, const char __user *buf,
>   }
>  
>   dw_wdt_set_next_heartbeat();
> + dw_wdt_keepalive();
>   mod_timer(&dw_wdt.timer, jiffies + WDT_TIMEOUT);
>  
>   return len;

Tested-by: Jisheng Zhang 
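
The failure mode described in the quoted commit message can be modelled in
userspace. This is a rough simulation under stated assumptions, not the
driver: time is counted in milliseconds, TIMER_PERIOD_MS stands in for
WDT_TIMEOUT (HZ / 2), the hardware timeout value is made up, and the
function name is hypothetical. It also ignores the driver's next-heartbeat
check.

```c
#include <stdbool.h>

#define TIMER_PERIOD_MS 500 /* stands in for WDT_TIMEOUT (HZ / 2) */

/*
 * Returns true if the simulated hardware watchdog expires within run_ms.
 * A write happens every write_interval_ms and always re-arms the software
 * timer; pat_on_write selects whether the write also pats the hardware.
 */
bool ex_watchdog_resets(long write_interval_ms, long hw_timeout_ms,
			long run_ms, bool pat_on_write)
{
	long last_pat = 0;
	long timer_expires = TIMER_PERIOD_MS;
	long next_write = write_interval_ms;

	for (long now = 1; now <= run_ms; now++) {
		if (now == next_write) {
			if (pat_on_write)
				last_pat = now; /* keepalive at write time */
			/* like mod_timer(): the expiry is pushed out again */
			timer_expires = now + TIMER_PERIOD_MS;
			next_write = now + write_interval_ms;
		}
		if (now >= timer_expires) {
			/* timer callback: pat the hardware and re-arm */
			last_pat = now;
			timer_expires = now + TIMER_PERIOD_MS;
		}
		if (now - last_pat > hw_timeout_ms)
			return true; /* hardware watchdog fired */
	}
	return false;
}
```

With writes arriving more often than the timer period and no pat at write
time, the timer never fires and the simulated watchdog expires; patting on
every write avoids that.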


Re: [PATCH RFC] vfs: add a O_NOMTIME flag

2015-05-07 Thread Dave Chinner
On Thu, May 07, 2015 at 12:53:46PM -0700, Andy Lutomirski wrote:
> On Thu, May 7, 2015 at 12:09 PM, Richard Weinberger
>  wrote:
> > On Thu, May 7, 2015 at 7:20 PM, Zach Brown  wrote:
> >> On Thu, May 07, 2015 at 10:26:17AM +1000, Dave Chinner wrote:
> >>> On Wed, May 06, 2015 at 03:00:12PM -0700, Zach Brown wrote:
> >>> > Add the O_NOMTIME flag which prevents mtime from being updated which can
> >>> > greatly reduce the IO overhead of writes to allocated and initialized
> >>> > regions of files.
> >>>
> >>> Hmmm. How do backup programs now work out if the file has changed
> >>> and hence needs copying again? ie. applications using this will
> >>> break other critical infrastructure in subtle ways.
> >>
> >> By using backup infrastructure that doesn't use cmtime.  Like btrfs
> >> send/recv.  Or application level backups that know how to do
> >> incrementals from metadata in giant database files, say, without
> >> walking, comparing, and copying the entire thing.
> >
> > But how can Joey random user know that some of his
> > applications are using O_NOMTIME and his KISS backup
> > program no longer functions as expected?
> >
> 
> Joey random user can't have a working KISS backup anyway, though,
> because we screw up mtime updates on mmap writes.  I have patches
> gathering dust that fix that, though.

They are close enough to be good for backup purposes. The mtime only
need change once per backup period - it doesn't need to be
millisecond accurate. Yes, I know you needed that changed for
different reasons (avoid variable page fault latency), but it
doesn't matter for once-a-day or even once-an-hour incremental
backup scans.

Besides, anyone who cares about accurate backups is doing a backup
from a snapshot so their data and metadata are consistent across the
entire backup. And that makes worries about mmap and mtime
completely irrelevant because a snapshot freezes the filesystem and
hence cleans all the mapped pages. Once the snapshot is taken
the next mmap write will trigger a page fault and so change the
mtime and it will be picked up in the next backup scan...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 1/2] watchdog: dw_wdt: Use a mutex, not a spinlock

2015-05-07 Thread Jisheng Zhang
On Thu, 7 May 2015 15:09:23 -0700
Doug Anderson  wrote:

> Right now the dw_wdt uses a spinlock to protect dw_wdt_open().  The
> problem is that while holding the spinlock we call:
> -> dw_wdt_set_top()
>-> dw_wdt_top_in_seconds()
>   -> clk_get_rate()
>  -> clk_prepare_lock()
> -> mutex_lock()
> 
> Locking a mutex while holding a spinlock is not allowed and leads to
> warnings like "BUG: spinlock wrong CPU on CPU#1", among other
> problems.
> 
> There's no reason to use a spinlock, so switch to a mutex.
> 
> Signed-off-by: Doug Anderson 
> ---
>  drivers/watchdog/dw_wdt.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/watchdog/dw_wdt.c b/drivers/watchdog/dw_wdt.c
> index d0bb949..3fa2f19 100644
> --- a/drivers/watchdog/dw_wdt.c
> +++ b/drivers/watchdog/dw_wdt.c
> @@ -30,12 +30,12 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -61,7 +61,7 @@ MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started "
>  #define WDT_TIMEOUT  (HZ / 2)
>  
>  static struct {
> - spinlock_t  lock;
> + struct mutex  lock;
>   void __iomem*regs;
>   struct clk  *clk;
>   unsigned long   in_use;
> @@ -177,7 +177,7 @@ static int dw_wdt_open(struct inode *inode, struct file *filp)
>   /* Make sure we don't get unloaded. */
>   __module_get(THIS_MODULE);
>  
> - spin_lock(&dw_wdt.lock);
> + mutex_lock(&dw_wdt.lock);
>   if (!dw_wdt_is_enabled()) {
>   /*
>* The watchdog is not currently enabled. Set the timeout to
> @@ -190,7 +190,7 @@ static int dw_wdt_open(struct inode *inode, struct file *filp)
>  
>   dw_wdt_set_next_heartbeat();
>  
> - spin_unlock(&dw_wdt.lock);
> + mutex_unlock(&dw_wdt.lock);
>  
>   return nonseekable_open(inode, filp);
>  }
> @@ -348,7 +348,7 @@ static int dw_wdt_drv_probe(struct platform_device *pdev)
>   if (ret)
>   return ret;
>  
> - spin_lock_init(&dw_wdt.lock);
> + mutex_init(&dw_wdt.lock);
>  
>   ret = misc_register(&dw_wdt_miscdev);
>   if (ret)

Tested on a Marvell BG2Q SoC; no issues found. So:

Tested-by: Jisheng Zhang 


Re: [PATCH RFC] vfs: add a O_NOMTIME flag

2015-05-07 Thread Dave Chinner
On Thu, May 07, 2015 at 10:20:53AM -0700, Zach Brown wrote:
> On Thu, May 07, 2015 at 10:26:17AM +1000, Dave Chinner wrote:
> > On Wed, May 06, 2015 at 03:00:12PM -0700, Zach Brown wrote:
> > > Add the O_NOMTIME flag which prevents mtime from being updated which can
> > > greatly reduce the IO overhead of writes to allocated and initialized
> > > regions of files.
> > 
> > Hmmm. How do backup programs now work out if the file has changed
> > and hence needs copying again? ie. applications using this will
> > break other critical infrastructure in subtle ways.
> 
> By using backup infrastructure that doesn't use cmtime.  Like btrfs
> send/recv.  Or application level backups that know how to do
> incrementals from metadata in giant database files, say, without
> walking, comparing, and copying the entire thing.

"Use magical thing that doesn't exist"? Really?

e.g. you can't do incremental backups with tools like xfsdump if
mtime is not being updated.  The last thing an admin wants when
doing disaster recovery is to find out that the app started using
O_NOMTIME as a result of the upgrade they did 6 months ago. Hence
the last 6 months of production data isn't in the backups despite
the backup procedure having been extensively tested and verified
when it was first put in place.
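
To make the dependency concrete, here is the kind of mtime comparison an
incremental backup scan relies on. It is a generic sketch using stat(2) and
does not describe how xfsdump itself is implemented; the function name is
hypothetical.

```c
#include <sys/stat.h>
#include <time.h>

/*
 * Returns 1 if path was modified after last_backup, 0 if it was not,
 * and -1 if the file cannot be examined.
 */
int ex_needs_backup(const char *path, time_t last_backup)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	/* A file whose mtime never advances is never selected here. */
	return st.st_mtime > last_backup ? 1 : 0;
}
```

A file written without mtime updates would keep returning 0 from a check
like this, so the scan would silently skip it.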

> > > The criteria for using O_NOMTIME is the same as for using O_NOATIME:
> > > owning the file or having the CAP_FOWNER capability.  If we're not
> > > comfortable allowing owners to prevent mtime/ctime updates then we
> > > should add a tunable to allow O_NOMTIME.  Maybe a mount option?
> > 
> > I dislike "turn off safety for performance" options because Joe
> > SpeedRacer will always select performance over safety.
> 
> Well, for ceph there's no safety concern.  They never use cmtime in
> these files.

Understood.

> So are you suggesting not implementing this

No.

> Or are we talking about adding some speed bumps
> that ceph can flip on that might give Joe Speedracer pause?

Yes, but not just Joe Speedracer - if it can be turned on silently
by apps then it's a great big landmine that most users and sysadmins
will not know about until it is too late.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


[PATCH] mmc: core: Check for timeout before checking mmc device state

2015-05-07 Thread Matt Bennett
On a system that has multiple devices on the mmc bus the host can
block on the mutex that protects access to the bus. Some operations
require the status of the device to be polled to see when the device
finishes executing the previous command that was sent to it (if
there is no busy detection in hardware). The current execution order
to check the status is:

LOOP
{
  1. Send command to device to retrieve the status (this can block).
  2. Check we haven't exceeded the timeout value. If we have then
 return an error.
  3. If the device is no longer in the program state then exit the
 loop and continue through the function.
}

If the send command blocks (and the timeout is exceeded) then the
function returns (and prints) an error even though the device has
likely left the programming state (due to the lengthy period of
time while the bus is blocked). By moving the timeout check before
retrieving the device status in the loop we better handle the case
where the mmc bus has been blocked but the device has left the
programming state.

Signed-off-by: Matt Bennett 
Cc: kuninori.morimoto...@renesas.com
Cc: jh80.ch...@samsung.com
Cc: sb...@codeaurora.org
Cc: johan.rudh...@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-...@vger.kernel.org
---
 drivers/mmc/card/block.c   | 22 +++---
 drivers/mmc/core/core.c| 20 ++--
 drivers/mmc/core/mmc_ops.c | 14 +++---
 3 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 2c25271..0abefde 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -747,6 +747,17 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
u32 status;
 
do {
+   /*
+* Timeout if the device never becomes ready for data and never
+* leaves the program state.
+*/
+   if (time_after(jiffies, timeout)) {
+   pr_err("%s: Card stuck in programming state! %s %s\n",
+   mmc_hostname(card->host),
+   req->rq_disk->disk_name, __func__);
+   return -ETIMEDOUT;
+   }
+
err = get_card_status(card, &status, 5);
if (err) {
pr_err("%s: error %d requesting status\n",
@@ -766,17 +777,6 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
break;
 
/*
-* Timeout if the device never becomes ready for data and never
-* leaves the program state.
-*/
-   if (time_after(jiffies, timeout)) {
-   pr_err("%s: Card stuck in programming state! %s %s\n",
-   mmc_hostname(card->host),
-   req->rq_disk->disk_name, __func__);
-   return -ETIMEDOUT;
-   }
-
-   /*
 * Some cards mishandle the status bits,
 * so make sure to check both the busy
 * indication and the card state.
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index c296bc0..6e56cb3 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -2047,6 +2047,16 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
 
timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
do {
+   /* Timeout if the device never becomes ready for data and
+* never leaves the program state.
+*/
+   if (time_after(jiffies, timeout)) {
+   pr_err("%s: Card stuck in programming state! %s\n",
+   mmc_hostname(card->host), __func__);
+   err =  -EIO;
+   goto out;
+   }
+
memset(&cmd, 0, sizeof(struct mmc_command));
cmd.opcode = MMC_SEND_STATUS;
cmd.arg = card->rca << 16;
@@ -2060,16 +2070,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
goto out;
}
 
-   /* Timeout if the device never becomes ready for data and
-* never leaves the program state.
-*/
-   if (time_after(jiffies, timeout)) {
-   pr_err("%s: Card stuck in programming state! %s\n",
-   mmc_hostname(card->host), __func__);
-   err =  -EIO;
-   goto out;
-   }
-
} while (!(cmd.resp[0] & R1_READY_FOR_DATA) ||
 (R1_CURRENT_STATE(cmd.resp[0]) == R1_STATE_PRG));
 out:
diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
index 0ea042d..b30ed91 100644
--- a/drivers/mmc/core/mmc_ops.c
+++ b/drivers/mmc/core/mmc_ops.c
@@ -526,6 +526,13 @@ int
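
The effect of moving the timeout check can be shown with a small userspace
simulation of the two loop orders from the commit message. This is a sketch
under stated assumptions, not the mmc code: the clock is simulated, and
ex_dev, ex_query_busy, ex_poll_old and ex_poll_new are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>

struct ex_dev {
	long now_ms;      /* simulated clock */
	long ready_at_ms; /* time at which the device leaves the program state */
	long query_ms;    /* how long one status query blocks (busy bus) */
};

/* Simulated status query: advances the clock, then reports busy or not. */
static bool ex_query_busy(struct ex_dev *d)
{
	d->now_ms += d->query_ms;
	return d->now_ms < d->ready_at_ms;
}

/* Old order: query the status, check the timeout, then check the state. */
int ex_poll_old(struct ex_dev *d, long timeout_ms)
{
	long deadline = d->now_ms + timeout_ms;
	bool busy;

	do {
		busy = ex_query_busy(d);
		if (d->now_ms > deadline)
			return -ETIMEDOUT;
	} while (busy);
	return 0;
}

/* New order: check the timeout, query the status, then check the state. */
int ex_poll_new(struct ex_dev *d, long timeout_ms)
{
	long deadline = d->now_ms + timeout_ms;
	bool busy;

	do {
		if (d->now_ms > deadline)
			return -ETIMEDOUT;
		busy = ex_query_busy(d);
	} while (busy);
	return 0;
}
```

When a single status query blocks for longer than the timeout but the device
has finished in the meantime, the old order reports a timeout while the new
order sees the finished state and succeeds. A device that never leaves the
program state still times out with the new order.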

Re: [RFC v1 01/11] genirq: Introduce struct irq_common_data to host shared irq data

2015-05-07 Thread Yun Wu (Abel)
On 2015/5/4 11:15, Jiang Liu wrote:

[...]
> diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
> index dd1109fb241e..3010e99abf3e 100644
> --- a/include/linux/irqdesc.h
> +++ b/include/linux/irqdesc.h
> @@ -47,6 +47,7 @@ struct pt_regs;
>   * @name: flow handler name for /proc/interrupts output
>   */
>  struct irq_desc {
> + struct irq_common_data  irq_common_data;

Hi Gerry,

Please update description as well. :)

Thanks,
Abel

>   struct irq_data irq_data;
>   unsigned int __percpu   *kstat_irqs;
>   irq_flow_handler_t  handle_irq;
> diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
> index df553b0af936..ed84299788b3 100644
> --- a/kernel/irq/internals.h
> +++ b/kernel/irq/internals.h
> @@ -170,27 +170,27 @@ irq_put_desc_unlock(struct irq_desc *desc, unsigned long flags)
>   */
>  static inline void irqd_set_move_pending(struct irq_data *d)
>  {
> - d->state_use_accessors |= IRQD_SETAFFINITY_PENDING;
> + __irqd_to_state(d) |= IRQD_SETAFFINITY_PENDING;
>  }
>  
>  static inline void irqd_clr_move_pending(struct irq_data *d)
>  {
> - d->state_use_accessors &= ~IRQD_SETAFFINITY_PENDING;
> + __irqd_to_state(d) &= ~IRQD_SETAFFINITY_PENDING;
>  }
>  
>  static inline void irqd_clear(struct irq_data *d, unsigned int mask)
>  {
> - d->state_use_accessors &= ~mask;
> + __irqd_to_state(d) &= ~mask;
>  }
>  
>  static inline void irqd_set(struct irq_data *d, unsigned int mask)
>  {
> - d->state_use_accessors |= mask;
> + __irqd_to_state(d) |= mask;
>  }
>  
>  static inline bool irqd_has_set(struct irq_data *d, unsigned int mask)
>  {
> - return d->state_use_accessors & mask;
> + return __irqd_to_state(d) & mask;
>  }
>  
>  static inline void kstat_incr_irqs_this_cpu(unsigned int irq, struct irq_desc *desc)
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 99793b9b6d23..eac1aac906ea 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -76,6 +76,7 @@ static void desc_set_defaults(unsigned int irq, struct irq_desc *desc, int node,
>  {
>   int cpu;
>  
> + desc->irq_data.common = &desc->irq_common_data;
>   desc->irq_data.irq = irq;
>   desc->irq_data.chip = &no_irq_chip;
>   desc->irq_data.chip_data = NULL;
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 7fac311057b8..3552b8750efd 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -834,6 +834,7 @@ static struct irq_data *irq_domain_insert_irq_data(struct irq_domain *domain,
>   if (irq_data) {
>   child->parent_data = irq_data;
>   irq_data->irq = child->irq;
> + irq_data->common = child->common;
>   irq_data->node = child->node;
>   irq_data->domain = domain;
>   }
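
For readers following the quoted hunks, the sharing pattern can be modelled
in a few lines of userspace C. This is a simplified sketch, not the kernel
definitions: all ex_ names are hypothetical, and it assumes that
__irqd_to_state(d) reaches the state word through d->common, which the
quoted hunks suggest but do not show.

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-IRQ data shared by every irqchip level (simplified). */
struct ex_irq_common_data {
	unsigned int state;
};

/* Per-chip data; each level of a hierarchy has its own instance. */
struct ex_irq_data {
	unsigned int irq;
	struct ex_irq_common_data *common;
	struct ex_irq_data *parent_data;
};

struct ex_irq_desc {
	struct ex_irq_common_data irq_common_data;
	struct ex_irq_data irq_data;
};

/* Point the descriptor's own irq_data at the shared block. */
void ex_desc_init(struct ex_irq_desc *desc, unsigned int irq)
{
	desc->irq_common_data.state = 0;
	desc->irq_data.irq = irq;
	desc->irq_data.common = &desc->irq_common_data;
	desc->irq_data.parent_data = NULL;
}

/* A parent level inherits the child's pointer to the shared block. */
void ex_insert_parent(struct ex_irq_data *child, struct ex_irq_data *parent)
{
	child->parent_data = parent;
	parent->irq = child->irq;
	parent->common = child->common;
	parent->parent_data = NULL;
}

void ex_irqd_set(struct ex_irq_data *d, unsigned int mask)
{
	d->common->state |= mask;
}

bool ex_irqd_has_set(const struct ex_irq_data *d, unsigned int mask)
{
	return (d->common->state & mask) != 0;
}
```

Because both levels point at the same shared block, a flag set through the
child's irq_data is visible through the parent's as well.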





Re: [RFC v1 02/11] genirq: Move field 'node' from struct irq_data into struct irq_common_data

2015-05-07 Thread Yun Wu (Abel)
Hi Gerry,
On 2015/5/4 11:15, Jiang Liu wrote:

> NUMA node information is per-irq instead of per-irqchip, so move it into
> struct irq_common_data.
> 
> Signed-off-by: Jiang Liu 
> ---
>  arch/sh/kernel/irq.c  |2 +-
>  arch/x86/kernel/apic/vector.c |8 
>  arch/x86/platform/uv/uv_irq.c |2 +-
>  include/linux/irq.h   |   20 ++--
>  kernel/irq/internals.h|5 +
>  kernel/irq/irqdesc.c  |   10 ++
>  kernel/irq/irqdomain.c|4 ++--
>  kernel/irq/manage.c   |2 +-
>  kernel/irq/proc.c |2 +-
>  9 files changed, 35 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c
> index eb10ff84015c..8dc677cc136b 100644
> --- a/arch/sh/kernel/irq.c
> +++ b/arch/sh/kernel/irq.c
> @@ -227,7 +227,7 @@ void migrate_irqs(void)
>   for_each_active_irq(irq) {
>   struct irq_data *data = irq_get_irq_data(irq);
>  
> - if (data->node == cpu) {
> + if (irq_data_get_node(data) == cpu) {
>   unsigned int newcpu = cpumask_any_and(data->affinity,
> cpu_online_mask);
>   if (newcpu >= nr_cpu_ids) {
> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index 96ce5068a926..983bea2a09ce 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -345,7 +345,7 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
>   struct irq_alloc_info *info = arg;
>   struct apic_chip_data *data;
>   struct irq_data *irq_data;
> - int i, err;
> + int i, err, node;
>  
>   if (disable_apic)
>   return -ENXIO;
> @@ -357,12 +357,13 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
>   for (i = 0; i < nr_irqs; i++) {
>   irq_data = irq_domain_get_irq_data(domain, virq + i);
>   BUG_ON(!irq_data);
> + node = irq_data_get_node(irq_data);
>  #ifdef   CONFIG_X86_IO_APIC
>   if (virq + i < nr_legacy_irqs() && legacy_irq_data[virq + i])
>   data = legacy_irq_data[virq + i];
>   else
>  #endif
> - data = alloc_apic_chip_data(irq_data->node);
> + data = alloc_apic_chip_data(node);
>   if (!data) {
>   err = -ENOMEM;
>   goto error;
> @@ -371,8 +372,7 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
>   irq_data->chip = &lapic_controller;
>   irq_data->chip_data = data;
>   irq_data->hwirq = virq + i;
> - err = assign_irq_vector_policy(virq, irq_data->node, data,
> -info);
> + err = assign_irq_vector_policy(virq, node, data, info);
>   if (err)
>   goto error;
>   }
> diff --git a/arch/x86/platform/uv/uv_irq.c b/arch/x86/platform/uv/uv_irq.c
> index cdf86cd3fd97..bc992b7b041f 100644
> --- a/arch/x86/platform/uv/uv_irq.c
> +++ b/arch/x86/platform/uv/uv_irq.c
> @@ -89,7 +89,7 @@ static int uv_domain_alloc(struct irq_domain *domain, unsigned int virq,
>   return -EINVAL;
>  
>   chip_data = kmalloc_node(sizeof(*chip_data), GFP_KERNEL,
> -  irq_data->node);
> +  irq_data_get_node(irq_data));
>   if (!chip_data)
>   return -ENOMEM;
>  
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 3b6e0def7f5c..3f999a0af713 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -129,9 +129,13 @@ struct irq_domain;
>   * struct irq_common_data - per irq data shared by all irqchips
>   * @state_use_accessors: status information for irq chip functions.
>   *   Use accessor functions to deal with it
> + * @node: node index useful for balancing
>   */
>  struct irq_common_data {
>   unsigned int state_use_accessors;
> +#ifdef CONFIG_SMP

Would CONFIG_NUMA be a more appropriate guard here?
Or could @node even be compiled in unconditionally?

Thanks,
Abel
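
For reference, a minimal standalone sketch of how an accessor such as
irq_data_get_node() can hide the config dependency from callers, so that
only one place changes if the guard becomes CONFIG_NUMA or is dropped.
The accessor's body is not part of the hunks quoted here, so the
implementation below is an assumption. The structs are trimmed stand-ins
for the kernel ones, and CONFIG_SMP is defined by hand only so the sketch
builds outside the kernel tree.

```c
#include <assert.h>

/* Assumption: in the kernel this symbol comes from Kconfig, not a #define. */
#define CONFIG_SMP 1

/* Trimmed stand-in for the struct in the patch, not the full definition. */
struct irq_common_data {
	unsigned int state_use_accessors;
#ifdef CONFIG_SMP
	unsigned int node;
#endif
};

/* Trimmed stand-in: only the fields this sketch needs. */
struct irq_data {
	unsigned int irq;
	struct irq_common_data *common;
};

/*
 * Assumed accessor body: callers never touch ->node directly, so the
 * #ifdef lives in exactly one place. Without the option, node 0 is
 * reported.
 */
static inline int irq_data_get_node(const struct irq_data *d)
{
#ifdef CONFIG_SMP
	return (int)d->common->node;
#else
	(void)d;
	return 0;
#endif
}

/* Small usage example: a caller reads the node through the accessor. */
static void irq_node_example(void)
{
	struct irq_common_data common = { .state_use_accessors = 0, .node = 1 };
	struct irq_data data = { .irq = 42, .common = &common };

	assert(irq_data_get_node(&data) == 1);
}
```

With an accessor of this shape, switching the guard to CONFIG_NUMA or
compiling the field in unconditionally would not require touching any of
the call sites converted in this patch.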

> + unsigned int node;
> +#endif
>  };
>  
>  /**
> @@ -139,7 +143,6 @@ struct irq_common_data {
>   * @mask: precomputed bitmask for accessing the chip registers
>   * @irq: interrupt number
>   * @hwirq: hardware interrupt number, local to the interrupt domain
> - * @node: node index useful for balancing
>   * @common: point to data shared by all irqchips
>   * @chip: low level interrupt hardware access
>   * @domain: Interrupt translation domain; responsible for mapping
> @@ -160,7 +163,6 @@ struct irq_data {
>   u32 mask;
>   unsigned int irq;
>   unsigned long   

Re: [PATCH 10/10] drivers/crypto/nx: add hardware 842 crypto comp alg

2015-05-07 Thread Herbert Xu
On Thu, May 07, 2015 at 11:06:06AM -0400, Dan Streetman wrote:
> 
> The crypto 842-nx has (significant) code in it to handle any alignment
> and length input buffers, to match them to what the driver requires.
> Would it be better to move that into the crypto code, so that any
> crypto compression hw driver can request buffers be specifically
> aligned/sized?  I did have to use a header on each compressed buffer
> that needed re-alignment or re-sizing, so maybe it's not appropriate
> for common crypto compression code.

Yes we could certainly move this logic into the crypto layer, as
we do for ciphers and hashes.

But as you say we could make the next guy who writes a comp driver
do this :)
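
A rough sketch of the kind of check that would move into common code:
decide whether a caller's buffer already meets a driver's alignment and
length requirements, and compute the padded length if it has to be
bounced. The struct and function names are hypothetical and are not part
of the existing crypto API; the header Dan mentions is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: constraints a compression driver might advertise. */
struct comp_buf_constraints {
	size_t alignment;	/* required start alignment, must be a power of two */
	size_t multiple;	/* buffer length must be a multiple of this, nonzero */
};

/* Return nonzero if buf/len already satisfy the driver's constraints. */
static int comp_buf_ok(const void *buf, size_t len,
		       const struct comp_buf_constraints *c)
{
	/* The mask test is only valid because alignment is a power of two. */
	return ((uintptr_t)buf & (c->alignment - 1)) == 0 &&
	       len % c->multiple == 0;
}

/* Round len up to the next multiple; used to size a bounce buffer. */
static size_t comp_round_up(size_t len, size_t multiple)
{
	return (len + multiple - 1) / multiple * multiple;
}

/* Small usage example with made-up constraint values. */
static void comp_buf_example(void)
{
	struct comp_buf_constraints c = { .alignment = 16, .multiple = 8 };

	/* Addresses are only inspected, never dereferenced. */
	assert(comp_buf_ok((const void *)(uintptr_t)64, 32, &c));
	assert(!comp_buf_ok((const void *)(uintptr_t)65, 32, &c));
	assert(comp_round_up(13, 8) == 16);
}
```

A buffer that fails comp_buf_ok() would be copied into an aligned bounce
buffer of comp_round_up() bytes before being handed to the hardware.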

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

