Bad DMA from Marvell 9230

2014-03-26 Thread Benjamin Herrenschmidt
Hi Folks !

Does this ring a bell?

I've been trying a 9230 on a power box here (a 9235 on the same machine
works fine) and it blows up with an IOMMU violation early during init.

From what I can tell the scenario is:

- So we still haven't issued any command per-se, all our DMA command
buffers etc... are all 0's at the point of the error.

 - The core libata calls the AHCI driver's ahci_hardreset() for each
port in a separate thread. They all call sata_link_hardreset().

 - This in turn calls sata_link_resume(), which writes to the SCR_CONTROL
register as follows:

	scontrol = (scontrol & 0x0f0) | 0x300;
	if ((rc = sata_scr_write(link, SCR_CONTROL, scontrol)))
	{
		printk(" -> sata_link_resume FAIL 2\n");
		return rc;
	}

	/*
	 * Some PHYs react badly if SStatus is pounded
	 * immediately after resuming.  Delay 200ms before
	 * debouncing.
	 */
	ata_msleep(link->ap, 200);

I get the interrupt from the IOMMU about 2ms after the write to
SCR_CONTROL.
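
For readers unfamiliar with the register, the write above can be decoded with a small sketch. It assumes the SControl field layout from the SATA/AHCI specifications (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); the macro and function names are made up for illustration and are not libata identifiers.

```c
#include <stdint.h>

/* SControl field layout, assumed from the SATA/AHCI specifications. */
#define SCR_DET_MASK 0x00fu	/* bits 3:0, device detection / reset request */
#define SCR_SPD_MASK 0x0f0u	/* bits 7:4, allowed interface speeds */
#define SCR_IPM_MASK 0xf00u	/* bits 11:8, allowed power-management transitions */

/* Hypothetical helper: the value the quoted code writes back to SControl. */
uint32_t resume_scontrol(uint32_t scontrol)
{
	/* Keep the speed limit, clear DET, set IPM to 3. */
	return (scontrol & SCR_SPD_MASK) | 0x300u;
}

/* Hypothetical field extractors, handy when printing register dumps. */
uint32_t scr_det(uint32_t v) { return v & SCR_DET_MASK; }
uint32_t scr_spd(uint32_t v) { return (v & SCR_SPD_MASK) >> 4; }
uint32_t scr_ipm(uint32_t v) { return (v & SCR_IPM_MASK) >> 8; }
```

Under that layout the write keeps the speed limit, drops any reset request (DET = 0) and sets IPM = 3, which disables transitions to both Partial and Slumber. Nothing in it asks the controller to fetch from memory, which fits the observation that the stray DMA read is unexpected.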

Now, pending misinterpretation of some bits on my side, it looks like
the bad DMA is a DMA *read* from address 0 (which we never map,
typically to catch driver bugs).

I went through a few theories with this one but so far none held. I
don't think it's a D2H FIS issue since the DMA pointers for that appear
to be set up properly, the memory mapped, etc...

I thought the chip might incorrectly/inadvertently try to (pre)fetch a
command. At that point all 32 command slots are all 0's, so if it
ignored the size it might try to fetch from command address 0.

So I added a loop to fill all 32 slots with a valid command address
in ahci_hardreset:

+   for (i = 0; i < 32; i++)
+   ahci_fill_cmd_slot(pp, i, 0);
rc = sata_link_hardreset(link, timing, deadline, &online,
 ahci_check_ready);

But that had basically no effect.

I've contacted Marvell, but I was wondering if anybody here had already
experienced something similar or has an idea of what else the chip
might be doing wrong so we can try to find a workaround ?

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Massive read only kvm guests when backing file was missing

2014-03-26 Thread Michael S. Tsirkin
On Wed, Mar 26, 2014 at 11:08:03PM -0300, Alejandro Comisario wrote:
> Hi List!
> Hope some one can help me, we had a big issue in our cloud the other
> day, a couple of our openstack regions ( +2000 kvm guests with qcow2 )
> went read only filesystem from the guest side because the backing
> files directory (the openstack _base directory) was compromised and
> the data was lost, when we realized the data was lost, it took us 5
> mins to restore the backup of the backing files, but by that time all
> the kvm guests received some kind of IO error from the hypervisor
> layer, and went read only on root filesystem.
> 
> My question would be, is there a way to hold the IO operations against
> the backing files ( i thought that would be 99% READ operations ) for
> a little longer ( im asking this because i dont quite understand what
> is the process and when it raises the error ) in a case the backing
> files are missing (no IO possible) but is recoverable within minutes ?
> 
> Any tip  on how to achieve this if possible, or information about how
> backing files works on kvm, will be amazing.
> Waiting for feedback!
> 
> kindest regards.
> Alejandro Comisario


I'm guessing this is what happened: guests timed out meanwhile.
You can increase the timeout within the guest:
echo 600 > /sys/block/sda/device/timeout
to timeout after 10 minutes.

If you have installed qemu guest agent on your system, you can do this
from the host. Unfortunately, by default its memory can be pushed out to swap,
and then on a disk error any access to it might fail :(
Maybe we should consider mlock on all its memory at least as an option.

You could pause your guests, restart them after the issue is resolved,
and we could I guess add functionality to pause VM on disk errors
automatically.
Stefan?


-- 
MST


Re: [PATCH] tick, broadcast: Prevent false alarm when force mask contains offline cpus

2014-03-26 Thread Srivatsa S. Bhat
On 03/27/2014 08:32 AM, Preeti U Murthy wrote:
> On 03/26/2014 04:51 PM, Srivatsa S. Bhat wrote:
>> On 03/26/2014 09:26 AM, Preeti U Murthy wrote:
>>> It's possible that the tick_broadcast_force_mask contains cpus which are not
>>> in cpu_online_mask when a broadcast tick occurs. This could happen under the
>>> following circumstance assuming CPU1 is among the CPUs waiting for 
>>> broadcast.
>>>
>>> CPU0                                    CPU1
>>>
>>> Run CPU_DOWN_PREPARE notifiers
>>>
>>> Start stop_machine                      Gets woken up by IPI to run
>>>                                         stop_machine, sets itself in
>>>                                         tick_broadcast_force_mask if the
>>>                                         time of broadcast interrupt is around
>>>                                         the same time as this IPI.
>>>
>>>                                         Start stop_machine
>>>                                           set_cpu_online(cpu1, false)
>>> End stop_machine                        End stop_machine
>>>
>>> Broadcast interrupt
>>>   Finds that cpu1 in
>>>   tick_broadcast_force_mask is offline
>>>   and triggers the WARN_ON in
>>>   tick_handle_oneshot_broadcast()
>>>
>>> Clears all broadcast masks
>>> in CPU_DEAD stage.
>>>
>>> This WARN_ON was added to capture scenarios where the broadcast mask, be it
>>> oneshot/pending/force_mask contain offline cpus whose tick devices have been
>>> removed. But here is a case where we trigger the warn on in a valid 
>>> scenario.
>>>
>>> One could argue that the scenario is invalid and ought to be warned against
>>> because ideally the broadcast masks need to be cleared of the cpus about to
>>> go offline before clearing them in the online_mask so that we don't hit these
>>> scenarios.
>>>
>>> This would mean clearing the masks in CPU_DOWN_PREPARE stage.
>>
>> Not necessarily. We could clear the mask in the CPU_DYING stage. That way,
>> offline CPUs will automatically get cleared from the force_mask and hence
>> the tick-broadcast code will not need to have a special case to deal with
>> this scenario. What do you think?
> 
> Ok I gave some thought to this. This will not work with the hrtimer mode
> of broadcast framework going in. This is the feature that was added for
> implementations of such archs which do not have an external clock device
> to wake them up in deep idle states when the local timers stop. They
> assign one of the CPUs as an agent to wake them up. When this designated
> CPU gets hotplugged out, we need to assign this duty to some other CPU.
> 
> The way this is being done now is in
> tick_shutdown_broadcast_oneshot_control() which is also responsible for
> clearing the broadcast masks. When the hrtimer mode of broadcast is
> active, then in addition to clearing masks in this function we make the
> CPU executing this function take on the task of waking up CPUs in deep
> idle state if the hotplugged CPU was doing this earlier.
> 
> Currently tick_shutdown_broadcast_oneshot_control() is being executed in
> the CPU_DEAD notification and this is guaranteed to run on a CPU *other
> than* the dying CPU. Hence we can safely do this.
> 
> However if we move this function underneath CPU_DYING notifier, this
> will turn out to be a disaster since IIUC the dying CPU is running this
> notifier and will end up re-assigning the duty of waking up CPUs to itself.
> 

Actually, my suggestion was to remove the dying CPU from the force_mask alone,
in the CPU_DYING notifier. The rest of the cleanup (removing it from the other
masks, moving the broadcast duty to someone else etc can still be done at
the CPU_DEAD stage). Also, note that the CPU which is set in force_mask is
definitely not the one doing the broadcast.

Basically, my reasoning was this:

If we look at how the 3 broadcast masks (oneshot, pending and force) are
set and cleared during idle entry/exit, we see this pattern:

oneshot_mask: This is set at BROADCAST_ENTER and cleared at EXIT.
pending_mask: This is set at tick_handle_oneshot_broadcast and cleared at
  EXIT.
force_mask:   This is set at EXIT and cleared at the next call to
  tick_handle_oneshot_broadcast. (Also, if the CPU is set in this
  mask, the CPU doesn't enter deep idle states in subsequent
  idle durations, and keeps polling instead, until it gets the
  broadcast interrupt).

What we can derive from this is that force_mask is the only mask that can
remain set across an idle ENTER/EXIT sequence. Both of the other 2 masks
can never remain set across a full idle ENTER/EXIT sequence. And a CPU going
offline certainly goes through EXIT if it had gone through ENTER, before
entering stop_machine().
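
The set/clear pattern described above can be checked with a toy model. This is only a sketch of the reasoning in this mail, not the kernel code: the struct, the function names and the `event_already_due` parameter (standing in for whatever condition makes EXIT set the force bit) are assumptions made for illustration.

```c
#include <stdbool.h>

/* Toy model: one bit per CPU in each of the three broadcast masks. */
struct bc_masks {
	unsigned long oneshot;
	unsigned long pending;
	unsigned long force;
};

/* BROADCAST_ENTER: the CPU asks to be woken by the broadcast device. */
void bc_enter(struct bc_masks *m, int cpu)
{
	m->oneshot |= 1UL << cpu;
}

/* BROADCAST_EXIT: oneshot and pending are cleared; force may get set. */
void bc_exit(struct bc_masks *m, int cpu, bool event_already_due)
{
	m->oneshot &= ~(1UL << cpu);
	m->pending &= ~(1UL << cpu);
	if (event_already_due)		/* assumption of this sketch */
		m->force |= 1UL << cpu;
}

/* Broadcast handler: marks sleeping CPUs pending and consumes force. */
void bc_handler(struct bc_masks *m)
{
	m->pending |= m->oneshot;
	m->force = 0;
}
```

In this model a CPU that has gone through ENTER and EXIT always has its oneshot and pending bits clear, while its force bit can stay set until the next handler run.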

That means, force_mask is the only odd one out here, which can remain set
when entering stop_machine() for CPU offline. So that's the only mask that
needs to be cleared separately. The other 2 masks take care of themselves
automatically. So, we can have a CPU_DYING callbac

[PATCH] phy/at8031: enable at8031 to work on interrupt mode

2014-03-26 Thread Zhao Qiang
The at8031 can work on polling mode and interrupt mode.
Add ack_interrupt and config intr funcs to enable
interrupt mode for it.

Signed-off-by: Zhao Qiang 
---
 drivers/net/phy/at803x.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index bc71947..d034ef5 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -27,6 +27,9 @@
 #define AT803X_MMD_ACCESS_CONTROL  0x0D
 #define AT803X_MMD_ACCESS_CONTROL_DATA 0x0E
 #define AT803X_FUNC_DATA   0x4003
+#define AT803X_INER			0x0012
+#define AT803X_INER_INIT		0xec00
+#define AT803X_INSR			0x0013
 #define AT803X_DEBUG_ADDR  0x1D
 #define AT803X_DEBUG_DATA  0x1E
 #define AT803X_DEBUG_SYSTEM_MODE_CTRL  0x05
@@ -191,6 +194,31 @@ static int at803x_config_init(struct phy_device *phydev)
return 0;
 }
 
+static int at803x_ack_interrupt(struct phy_device *phydev)
+{
+   int err;
+
+   err = phy_read(phydev, AT803X_INSR);
+
+   return (err < 0) ? err : 0;
+}
+
+static int at803x_config_intr(struct phy_device *phydev)
+{
+   int err;
+   int value;
+
+   value = phy_read(phydev, AT803X_INER);
+
+   if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
+   err = phy_write(phydev, AT803X_INER,
+   (value | AT803X_INER_INIT));
+   else
+   err = phy_write(phydev, AT803X_INER, value);
+
+   return err;
+}
+
 static struct phy_driver at803x_driver[] = {
 {
/* ATHEROS 8035 */
@@ -240,6 +268,8 @@ static struct phy_driver at803x_driver[] = {
.flags  = PHY_HAS_INTERRUPT,
.config_aneg= genphy_config_aneg,
.read_status= genphy_read_status,
+   .ack_interrupt  = &at803x_ack_interrupt,
+   .config_intr= &at803x_config_intr,
.driver = {
.owner = THIS_MODULE,
},
-- 
1.8.5




Re: [PATCH 1/1] mm: move FAULT_AROUND_ORDER to arch/

2014-03-26 Thread Madhavan Srinivasan
On Tuesday 25 March 2014 11:06 PM, Kirill A. Shutemov wrote:
> On Tue, Mar 25, 2014 at 12:20:15PM +0530, Madhavan Srinivasan wrote:
>> Kirill A. Shutemov with the commit 96bacfe542 introduced
>> vm_ops->map_pages() for mapping easy accessible pages around
>> fault address in hope to reduce number of minor page faults.
>> Based on his workload runs, suggested FAULT_AROUND_ORDER
>> (knob to control the numbers of pages to map) is 4.
>>
>> This patch moves the FAULT_AROUND_ORDER macro to arch/ for
>> architecture maintainers to decide on suitable FAULT_AROUND_ORDER
>> value based on performance data for that architecture.
>>
>> Signed-off-by: Madhavan Srinivasan 
>> ---
>>  arch/powerpc/include/asm/pgtable.h |6 ++
>>  arch/x86/include/asm/pgtable.h |5 +
>>  include/asm-generic/pgtable.h  |   10 ++
>>  mm/memory.c|2 --
>>  4 files changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>> index 3ebb188..9fcbd48 100644
>> --- a/arch/powerpc/include/asm/pgtable.h
>> +++ b/arch/powerpc/include/asm/pgtable.h
>> @@ -19,6 +19,12 @@ struct mm_struct;
>>  #endif
>>  
>>  /*
>> + * With a few real world workloads that were run,
>> + * the performance data showed that a value of 3 is more advantageous.
>> + */
>> +#define FAULT_AROUND_ORDER  3
>> +
>> +/*
>>   * We save the slot number & secondary bit in the second half of the
>>   * PTE page. We use the 8 bytes per each pte entry.
>>   */
>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>> index 938ef1d..8387a65 100644
>> --- a/arch/x86/include/asm/pgtable.h
>> +++ b/arch/x86/include/asm/pgtable.h
>> @@ -7,6 +7,11 @@
>>  #include 
>>  
>>  /*
>> + * Based on Kirill's test results, fault around order is set to 4
>> + */
>> +#define FAULT_AROUND_ORDER 4
>> +
>> +/*
>>   * Macro to mark a page protection value as UC-
>>   */
>>  #define pgprot_noncached(prot)  \
>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>> index 1ec08c1..62f7f07 100644
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -7,6 +7,16 @@
>>  #include 
>>  #include 
>>  
>> +
>> +/*
>> + * Fault around order is a control knob to decide the fault around pages.
>> + * Default value is set to 0UL (disabled), but the arch can override it as
>> + * desired.
>> + */
>> +#ifndef FAULT_AROUND_ORDER
>> +#define FAULT_AROUND_ORDER  0UL
>> +#endif
> 
> FAULT_AROUND_ORDER == 0 case should be handled separately in
> do_read_fault(): no reason to go to do_fault_around() if we are going to
> fault in only one page.
> 

OK, agreed. I am thinking of adding a FAULT_AROUND_ORDER check alongside the
map_pages check in do_read_fault(). Kindly share your thoughts.
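
As a rough illustration of the check being discussed, the sketch below models the decision in plain C. It assumes that the order is a power-of-two page count, as the "knob to control the numbers of pages to map" wording suggests; `fault_around_pages()` and `should_fault_around()` are hypothetical names, not functions from mm/memory.c.

```c
#include <stdbool.h>

/* Per-architecture knob; 0 means only the faulting page is mapped. */
#ifndef FAULT_AROUND_ORDER
#define FAULT_AROUND_ORDER 0UL
#endif

/* Number of pages a fault-around pass would try to map for a given order. */
unsigned long fault_around_pages(unsigned long order)
{
	return 1UL << order;
}

/*
 * Hypothetical predicate modelling the proposed check in do_read_fault():
 * only take the fault-around path when the VMA provides map_pages() and
 * the order asks for more than one page.
 */
bool should_fault_around(bool has_map_pages, unsigned long order)
{
	return has_map_pages && fault_around_pages(order) > 1;
}
```

With order 0 the predicate is false, so the single-page case would skip the fault-around path entirely, which is the behaviour Kirill asked for.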

With regards
Maddy



Re: [PATCH v2] random32: avoid attempt to late reseed if in the middle of seeding

2014-03-26 Thread Hannes Frederic Sowa
On Thu, Mar 27, 2014 at 02:01:35AM -0400, Sasha Levin wrote:
> Commit 4af712e8df ("random32: add prandom_reseed_late() and call when
> nonblocking pool becomes initialized") has added a late reseed stage
> that happens as soon as the nonblocking pool is marked as initialized.
> 
> This fails in the case that the nonblocking pool gets initialized
> during __prandom_reseed()'s call to get_random_bytes(). In that case
> we'd double back into __prandom_reseed() in an attempt to do a late
> reseed - deadlocking on 'lock' early on in the boot process.
> 
> Instead, just avoid even waiting to do a reseed if a reseed is already
> occurring.
> 
> Signed-off-by: Sasha Levin 

Thanks for fixing this!

Acked-by: Hannes Frederic Sowa 



Re: [git pull] vfs fixes

2014-03-26 Thread Sedat Dilek
On Wed, Mar 26, 2014 at 9:55 PM, Linus Torvalds
 wrote:
> On Wed, Mar 26, 2014 at 9:36 AM, Sedat Dilek  wrote:
>>
>> Looking at [1] you did not pull-in the new changes.
>> Are you waiting for a new pull-request?
>
> Yeah, with the top commit updated, I'd like to make sure I get the right pull.
>

AFAICS, it was a typo...

s/hlist_del_rcu()/hlist_del_init_rcu()

- Sedat -


[PATCH v2] random32: avoid attempt to late reseed if in the middle of seeding

2014-03-26 Thread Sasha Levin
Commit 4af712e8df ("random32: add prandom_reseed_late() and call when
nonblocking pool becomes initialized") has added a late reseed stage
that happens as soon as the nonblocking pool is marked as initialized.

This fails in the case that the nonblocking pool gets initialized
during __prandom_reseed()'s call to get_random_bytes(). In that case
we'd double back into __prandom_reseed() in an attempt to do a late
reseed - deadlocking on 'lock' early on in the boot process.

Instead, just avoid even waiting to do a reseed if a reseed is already
occurring.

Signed-off-by: Sasha Levin 
---
 lib/random32.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/random32.c b/lib/random32.c
index b33b23e..d67b6a7 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -245,8 +245,20 @@ static void __prandom_reseed(bool late)
static bool latch = false;
static DEFINE_SPINLOCK(lock);
 
+   /*
+* Asking for random bytes might result in bytes getting
+* moved into the nonblocking pool and thus marking it
+* as initialized. In this case we would double back into
+* this function and attempt to do a late reseed.
+* Ignore the pointless attempt to reseed again if we're
+* already waiting for bytes when the nonblocking pool
+* got initialized.
+*/
+
/* only allow initial seeding (late == false) once */
-   spin_lock_irqsave(&lock, flags);
+   if (!spin_trylock_irqsave(&lock, flags))
+   return; 
+
if (latch && !late)
goto out;
 
-- 
1.8.3.2
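
The idea behind the patch above, backing off instead of waiting when a reseed is already running, can be sketched in userspace C. This is not the kernel code: a C11 `atomic_flag` stands in for `spin_trylock_irqsave()`, and `reseed()` and `fetch_entropy()` are hypothetical names that model `__prandom_reseed()` and its call to `get_random_bytes()`.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag reseed_busy = ATOMIC_FLAG_INIT;
static int reseed_count;	/* how many reseeds actually completed */

/* Hypothetical stand-in for get_random_bytes(); may re-enter reseed(). */
static void fetch_entropy(bool trigger_late_reseed);

/* Returns true if this call performed a reseed, false if it backed off. */
bool reseed(bool trigger_late_reseed)
{
	/* "trylock": give up at once if a reseed is already in progress. */
	if (atomic_flag_test_and_set(&reseed_busy))
		return false;

	fetch_entropy(trigger_late_reseed);
	reseed_count++;

	atomic_flag_clear(&reseed_busy);
	return true;
}

static void fetch_entropy(bool trigger_late_reseed)
{
	/* Model the pool becoming initialized while bytes are being fetched. */
	if (trigger_late_reseed)
		(void)reseed(false);	/* nested call backs off, no deadlock */
}
```

With a blocking lock the nested call would wait forever on a lock its own caller holds; with the try-lock it returns immediately and the outer reseed completes.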



Re: Bug 71331 - mlock yields processor to lower priority process

2014-03-26 Thread Robert Hancock

On 21/03/14 08:50 AM, jimmie.da...@l-3com.com wrote:>
> 
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
>
> On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:
>
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
>
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
>
> -Mike
>
> Two options.
>
> #1. Return with a status value of EAGAIN.
>
> or
>
> #2.  Don't return until you can do it.
>
> If SCHED_FIFO is used, and mlock() is called, the intention of the user is
> very clear.  Run this task until it is completed or it blocks (and until a
> bit ago, mlock() did not block).


Returning EAGAIN is not something that the API definition from POSIX 
allows for, that is only for indicating a failure. If the memory that is 
being locked is not currently residing in RAM, then the memory will need 
to be swapped in before the call returns, which clearly cannot be done 
without blocking. Thus mlock can potentially block, which has not 
changed. Whether or not any kernel behavior has changed to cause this to 
happen in some cases where it didn't previously, the fact remains that 
this is allowed behavior.


Generally real-time applications should not be doing mlock calls during 
their real-time execution for that reason. The required memory regions 
should be locked during startup so that this kind of execution delay can 
be avoided at runtime.
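
A minimal sketch of that startup pattern, assuming a Linux system (where mlock() rounds the address down to a page boundary, so a malloc()ed buffer is acceptable). The helper names are invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical startup helper: allocate a working buffer, lock it into RAM,
 * and touch every byte so no page fault is left for the real-time phase.
 * Returns the buffer, or NULL if allocation or locking failed.
 */
void *alloc_locked_buffer(size_t len)
{
	void *buf = malloc(len);

	if (buf == NULL)
		return NULL;
	if (mlock(buf, len) != 0) {	/* may block; acceptable at startup */
		free(buf);
		return NULL;
	}
	memset(buf, 0, len);
	return buf;
}

/* Counterpart to alloc_locked_buffer(). */
void free_locked_buffer(void *buf, size_t len)
{
	if (buf != NULL) {
		munlock(buf, len);
		free(buf);
	}
}
```

The intended use is to call alloc_locked_buffer() for every region the task needs before it switches to SCHED_FIFO, and to treat a NULL return (for example because of RLIMIT_MEMLOCK) as a startup error rather than retrying later.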


>
> SCHED_FIFO users don't care about fairness.  They want the system to do what it is told.
>
> regards,
> Bud Davis



[PATCH] random32: assign to network folks in MAINTAINERS

2014-03-26 Thread Sasha Levin
lib/random32.c was split out of the network code and is de-facto
still maintained by the almighty net/ gods.

Make it a bit more official so that people who aren't aware of
that know where to send their patches.

Signed-off-by: Sasha Levin 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e1724d5..47fd188 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6091,6 +6091,7 @@ F:include/uapi/linux/net.h
 F: include/uapi/linux/netdevice.h
 F: tools/net/
 F: tools/testing/selftests/net/
+F: lib/random32.c
 
 NETWORKING [IPv4/IPv6]
 M: "David S. Miller" 
-- 
1.8.3.2



[PATCH V3 Resend] hrtimer: use ffs() to iterate over valid bits from active_bases

2014-03-26 Thread Viresh Kumar
Currently we are iterating over all possible (currently four) bits of
active_bases to see if corresponding clock bases are active. This is good enough
for cases where 3 or 4 bases are used but if only 1 or 2 are used then it makes
more sense to use ffs() to find the right bit directly.

Suggested-by: Thomas Gleixner 
Signed-off-by: Viresh Kumar 
---
V3->Resend: s/__ffs/ffs in commit log :(

V2->V3: Use ffs() instead of __ffs() and decrement 'i' later.

V1->V2: Instead of removing active_bases use __ffs() on it to make loop more
efficient.

 kernel/hrtimer.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index acfef5f..2aad8a7 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1265,6 +1265,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 {
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
ktime_t expires_next, now, entry_time, delta;
+   unsigned long active_bases = cpu_base->active_bases;
int i, retries = 0;
 
BUG_ON(!cpu_base->hres_active);
@@ -1284,15 +1285,11 @@ retry:
 */
cpu_base->expires_next.tv64 = KTIME_MAX;
 
-   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-   struct hrtimer_clock_base *base;
+   while ((i = ffs(active_bases))) {
+   struct hrtimer_clock_base *base = cpu_base->clock_base + --i;
struct timerqueue_node *node;
ktime_t basenow;
 
-   if (!(cpu_base->active_bases & (1 << i)))
-   continue;
-
-   base = cpu_base->clock_base + i;
basenow = ktime_add(now, base->offset);
 
while ((node = timerqueue_getnext(&base->active))) {
@@ -1327,6 +1324,8 @@ retry:
 
__run_hrtimer(timer, &basenow);
}
+
+   active_bases &= ~(1 << i);
}
 
/*
-- 
1.7.12.rc2.18.g61b472e
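
The loop shape used in the patch above can be tried in userspace with the POSIX ffs() from <strings.h>, which is 1-based and returns 0 for an empty mask. The sketch below is only an illustration: `collect_active_bases()` and `MAX_BASES` are made-up names, and the function records the visited indices instead of running timers.

```c
#include <strings.h>	/* ffs() */

#define MAX_BASES 4

/*
 * Visit each set bit of active_bases, lowest first, and record its 0-based
 * index in out[]. Returns the number of indices written.
 */
int collect_active_bases(unsigned int active_bases, int out[MAX_BASES])
{
	int i, n = 0;

	/* ffs() returns 0 for an empty mask, which ends the loop. */
	while ((i = ffs((int)active_bases)) != 0 && n < MAX_BASES) {
		--i;				/* convert to a 0-based index */
		out[n++] = i;
		active_bases &= ~(1u << i);	/* clear the bit just handled */
	}
	return n;
}
```

For a mask with a single bit set the body runs once, where the original for loop would have tested all four positions.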



Re: Re: [PATCH -tip v8 10/26] kprobes/x86: Allow probe on some kprobe preparation functions

2014-03-26 Thread Masami Hiramatsu
(2014/03/25 4:35), Steven Rostedt wrote:
> On Wed, 05 Mar 2014 20:59:53 +0900
> Masami Hiramatsu  wrote:
> 
>> There is no need to prohibit probing on the functions
>> used in preparation phase. Those are safely probed because
>> those are not invoked from breakpoint/fault/debug handlers,
>> there is no chance to cause recursive exceptions.
>>
>> Following functions are now removed from the kprobes blacklist.
>>  can_boost
>>  can_probe
>>  can_optimize
>>  is_IF_modifier
>>  __copy_instruction
>>  copy_optimized_instructions
>>  arch_copy_kprobe
>>  arch_prepare_kprobe
>>  arch_arm_kprobe
>>  arch_disarm_kprobe
>>  arch_remove_kprobe
> 
> Is there any possibility that the arm and disarm could cause issues if
> we have a probe in the middle of setting it?
> 
> I guess not, but I just wanted to ask, as your test only tested the
> start of function and not the middle of it.

OK, I've tested it with the attached script, which adds probes on every address
of the target function and runs a testcase (register/unregister other probes),
and found no problem. :)

Thank you,


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



#!/bin/bash
TARGETS=$1
FTRACE_KPROBE_EVENT=/sys/kernel/debug/tracing/kprobe_events
FTRACE_KPROBE_PROFILE=/sys/kernel/debug/tracing/kprobe_profile
FTRACE_EVENTS=/sys/kernel/debug/tracing/events

if [ ! -f "$TARGETS" ]; then
  echo "Usage: kprobe_test.sh "
  exit 1
fi
if [ `id -u` -ne 0 ]; then
  echo "Error: This program requires root privilege"
  exit 2
fi

function get_symbol_size() { #symbol
grep $1 -w -m1 -A1 /proc/kallsyms |\
 awk 'BEGIN{ s = 0 }; {p = strtonum("0x"substr($1,9,8)); if (s == 0) {s = p;} else { print p - s}}'
}

function setup_probes() { #symbol
size=`get_symbol_size $1`
if [ -z "$size" ]; then
  echo "No symbol $1 found"
  return 1
fi
i=0
err=0
while [ $i -lt $size ]; do
  (echo p $1+$i  >> $FTRACE_KPROBE_EVENT) &> /dev/null
  [ $? -eq 0 ] || err=$((err+1))
  i=$((i+1))
done
probed=`expr $size - $err`
echo "Setup $probed probes on $1"
[ $probed -eq 0 ] && return 1
return 0
}

function enable_probes() {
echo 1 > $FTRACE_EVENTS/kprobes/enable
}

function clear_probes() {
echo 0 > $FTRACE_EVENTS/kprobes/enable
echo > $FTRACE_KPROBE_EVENT
}

function run_test() {
echo p:test1 vfs_symlink >> $FTRACE_KPROBE_EVENT
echo p:test2 vfs_symlink+5 >> $FTRACE_KPROBE_EVENT
echo 1 > $FTRACE_EVENTS/kprobes/test1/enable
echo 1 > $FTRACE_EVENTS/kprobes/test2/enable
sleep 1
echo 0 > $FTRACE_EVENTS/kprobes/test1/enable
echo 0 > $FTRACE_EVENTS/kprobes/test2/enable
echo -:test1 >> $FTRACE_KPROBE_EVENT
echo -:test2 >> $FTRACE_KPROBE_EVENT
}

function save_profile() { # symbol
cat $FTRACE_KPROBE_PROFILE > ${1}.profile
}

function test_on() { # symbol
setup_probes $1
[ $? -ne 0 ] && return
enable_probes
echo "Probe Enabled"
run_test
echo "Test done on $1"
save_profile $1
clear_probes
}

clear_probes
cat $TARGETS | while read sym; do
  test_on $sym
done



[PATCH V3] hrtimer: use __ffs() to iterate over valid bits from active_bases

2014-03-26 Thread Viresh Kumar
Currently we are iterating over all possible (currently four) bits of
active_bases to see if corresponding clock bases are active. This is good enough
for cases where 3 or 4 bases are used but if only 1 or 2 are used then it makes
more sense to use __ffs() to find the right bit directly.

Suggested-by: Thomas Gleixner 
Signed-off-by: Viresh Kumar 
---
V2->V3: Use ffs() instead of __ffs() and decrement 'i' later.

V1->V2: Instead of removing active_bases use __ffs() on it to make loop more
efficient.

 kernel/hrtimer.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index acfef5f..2aad8a7 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1265,6 +1265,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 {
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
ktime_t expires_next, now, entry_time, delta;
+   unsigned long active_bases = cpu_base->active_bases;
int i, retries = 0;
 
BUG_ON(!cpu_base->hres_active);
@@ -1284,15 +1285,11 @@ retry:
 */
cpu_base->expires_next.tv64 = KTIME_MAX;
 
-   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-   struct hrtimer_clock_base *base;
+   while ((i = ffs(active_bases))) {
+   struct hrtimer_clock_base *base = cpu_base->clock_base + --i;
struct timerqueue_node *node;
ktime_t basenow;
 
-   if (!(cpu_base->active_bases & (1 << i)))
-   continue;
-
-   base = cpu_base->clock_base + i;
basenow = ktime_add(now, base->offset);
 
while ((node = timerqueue_getnext(&base->active))) {
@@ -1327,6 +1324,8 @@ retry:
 
__run_hrtimer(timer, &basenow);
}
+
+   active_bases &= ~(1 << i);
}
 
/*
-- 
1.7.12.rc2.18.g61b472e



Re: [PATCH V2 2/2] hrtimer: use __ffs() to iterate over valid bits from active_bases

2014-03-26 Thread Viresh Kumar
On 27 March 2014 11:10, Thomas Gleixner  wrote:
> What if this is a spurious interrupt and active_bases is 0?

Hmm.. haven't thought about that actually.. I thought it would be
guaranteed here that active_bases isn't zero.

Will fix it as the current code would end up in an infinite loop.
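
One way to make a 0-based helper safe is to test the mask itself rather than the returned index. The sketch below assumes GCC or Clang: `__builtin_ctzl()` stands in for the kernel's __ffs(), which, like the builtin, has an undefined result for a zero argument. The function names are hypothetical.

```c
/*
 * Model of a kernel-style __ffs(): 0-based index of the lowest set bit.
 * The result is undefined for word == 0, so callers must check first.
 */
static inline unsigned long model_ffs0(unsigned long word)
{
	return (unsigned long)__builtin_ctzl(word);
}

/* Visit every set bit of active_bases; returns the mask of visited bits. */
unsigned long visit_mask(unsigned long active_bases)
{
	unsigned long visited = 0;

	while (active_bases != 0) {		/* guard: never pass 0 down */
		unsigned long i = model_ffs0(active_bases);

		visited |= 1UL << i;
		active_bases &= active_bases - 1;	/* clear lowest set bit */
	}
	return visited;
}
```

Testing the mask handles both problem cases: an empty mask never reaches the helper, and bit 0 is processed even though its index is 0, which a loop condition on the returned index would treat as "stop".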


Re: [PATCH V2 2/2] hrtimer: use __ffs() to iterate over valid bits from active_bases

2014-03-26 Thread Thomas Gleixner
On Thu, 27 Mar 2014, Viresh Kumar wrote:

> Currently we are iterating over all possible (currently four) bits of
> active_bases to see if corresponding clock bases are active. This is good 
> enough
> for cases where 3 or 4 bases are used but if only 1 or 2 are used then it 
> makes
> more sense to use __ffs() to find the right bit directly.
> 
> Suggested-by: Thomas Gleixner 
> Signed-off-by: Viresh Kumar 
> ---
> V1->V2: Instead of removing active_bases use __ffs() on it to make loop more
> efficient.
> 
> I tried to use for_each_set_bit() first and then it looked overdone. And so 
> used
> a simple form, __ffs() with some code to clear bits.
> 
>  kernel/hrtimer.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index acfef5f..ea90228 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1265,6 +1265,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>  {
>   struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
>   ktime_t expires_next, now, entry_time, delta;
> + unsigned long active_bases = cpu_base->active_bases;
>   int i, retries = 0;
>  
>   BUG_ON(!cpu_base->hres_active);
> @@ -1284,15 +1285,11 @@ retry:
>*/
>   cpu_base->expires_next.tv64 = KTIME_MAX;
>  
> - for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
> - struct hrtimer_clock_base *base;
> + while ((i = __ffs(active_bases))) {

What if this is a spurious interrupt and active_bases is 0?

Thanks,

tglx


Re: [PATCH 1/2] hrtimer: don't add clock-base to active_bases if already present

2014-03-26 Thread Thomas Gleixner


On Thu, 27 Mar 2014, Viresh Kumar wrote:

> If active_bases already has entry for a particular clock type, then we don't
> need to rewrite it while queuing a hrtimer.
> 
> Signed-off-by: Viresh Kumar 
> ---
> Initially I thought of doing this but then thought better remove active_bases
> completely and so didn't sent this one. Now it might find some place for 
> itself
> :).
> 
>  kernel/hrtimer.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index da351ad..acfef5f 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -864,8 +864,9 @@ static int enqueue_hrtimer(struct hrtimer *timer,
>  {
>   debug_activate(timer);
>  
> + if (!timerqueue_getnext(&base->active))
> + base->cpu_base->active_bases |= 1 << base->index;
>   timerqueue_add(&base->active, &timer->node);
> - base->cpu_base->active_bases |= 1 << base->index;

The conditional is more expensive than actually doing the OR operation
at least on x86 as it results in a branch.

Thanks,

tglx
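
For reference, both variants leave the same mask behind; the unconditional OR
is idempotent, so the test only adds a branch. A minimal userspace sketch,
assuming the bit is cleared elsewhere when a queue drains (not modelled
here); struct toy_base and both helpers are hypothetical names:

```c
#include <assert.h>

/* Hypothetical stand-in for one clock base. */
struct toy_base {
	unsigned int index;	/* bit position in the mask */
	unsigned int queued;	/* number of timers currently queued */
};

/* Current code: always OR the bit in; no branch needed. */
static unsigned long enqueue_unconditional(unsigned long mask,
					   struct toy_base *b)
{
	assert(b->index < 8 * sizeof(unsigned long));
	b->queued++;
	return mask | (1UL << b->index);
}

/* Patched code: only set the bit when the queue was empty. */
static unsigned long enqueue_conditional(unsigned long mask,
					 struct toy_base *b)
{
	assert(b->index < 8 * sizeof(unsigned long));
	if (b->queued == 0)
		mask |= 1UL << b->index;
	b->queued++;
	return mask;
}
```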


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-26 Thread Mike Galbraith
On Thu, 2014-03-27 at 04:20 +, jimmie.da...@l-3com.com wrote: 


> The example code submitted into bugzilla (chase back on the thread a
> bit, there is a reference) shows the problem.
> 
> Two threads, TaskA (high priority) and TaskB (low priority).  Assigned
> to the same processor, explicitly for the guarantee that only one of
> them can execute at a time.

Your priority based serialization guarantee does not exist.  Tasks can
be and are put to sleep.  When that happens, a lower priority runnable
task will run.  Whether you like that fact or not, it remains a fact.

If you don't want your lower priority task to run, why do you wake it?

-Mike


[PATCH v2 1/2] mmc: rtsx: add R1-no-CRC mmc command type handle

2014-03-26 Thread micky_ching
From: Micky Ching 

Commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a introduced a cmd.flags
value that the Realtek PCI host driver does not handle, which makes MMC
card initialization fail. Handle the new cmd.flags value so that MMC
cards can be used again.

Signed-off-by: Micky Ching 
---
 drivers/mmc/host/rtsx_pci_sdmmc.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/mmc/host/rtsx_pci_sdmmc.c b/drivers/mmc/host/rtsx_pci_sdmmc.c
index 5fb994f..0d8904a 100644
--- a/drivers/mmc/host/rtsx_pci_sdmmc.c
+++ b/drivers/mmc/host/rtsx_pci_sdmmc.c
@@ -346,6 +346,9 @@ static void sd_send_cmd(struct realtek_pci_sdmmc *host, struct mmc_command *cmd)
 	case MMC_RSP_R1:
 		rsp_type = SD_RSP_TYPE_R1;
 		break;
+	case MMC_RSP_R1 & ~MMC_RSP_CRC:
+		rsp_type = SD_RSP_TYPE_R1 | SD_NO_CHECK_CRC7;
+		break;
 	case MMC_RSP_R1B:
 		rsp_type = SD_RSP_TYPE_R1b;
 		break;
-- 
1.7.9.5
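
A note on the new case label: MMC_RSP_R1 is a bitwise OR of response flags,
so masking out MMC_RSP_CRC yields a distinct switch value for an R1 response
without CRC checking. Below is a minimal userspace sketch. The MMC_RSP_*
values are meant to mirror include/linux/mmc/core.h and should be read as an
assumption; the SD_* encodings and the rsp_type_for() helper are illustrative
only and are not the driver's definitions.

```c
#include <assert.h>

/* Response flags, assumed to mirror include/linux/mmc/core.h. */
#define MMC_RSP_PRESENT	(1 << 0)
#define MMC_RSP_136	(1 << 1)
#define MMC_RSP_CRC	(1 << 2)
#define MMC_RSP_BUSY	(1 << 3)
#define MMC_RSP_OPCODE	(1 << 4)

#define MMC_RSP_R1	(MMC_RSP_PRESENT | MMC_RSP_CRC | MMC_RSP_OPCODE)
#define MMC_RSP_R1B	(MMC_RSP_R1 | MMC_RSP_BUSY)

/* Illustrative controller encodings; the real ones live in the rtsx headers. */
#define SD_RSP_TYPE_R1		0x01
#define SD_RSP_TYPE_R1b		0x09
#define SD_NO_CHECK_CRC7	0x04

/* Map a response-flag combination to a controller response type, or -1. */
static int rsp_type_for(unsigned int resp_flags)
{
	switch (resp_flags) {
	case MMC_RSP_R1:
		return SD_RSP_TYPE_R1;
	case MMC_RSP_R1 & ~MMC_RSP_CRC:	/* R1 without CRC check */
		return SD_RSP_TYPE_R1 | SD_NO_CHECK_CRC7;
	case MMC_RSP_R1B:
		return SD_RSP_TYPE_R1b;
	default:
		return -1;		/* unhandled combination */
	}
}
```

Without the middle case, a command flagged as R1 minus CRC falls through to
the default branch, which matches the initialization failure described in
the commit message.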



[PATCH v2 2/2] mmc: rtsx: modify error handle and remove smatch warnings

2014-03-26 Thread micky_ching
From: Micky Ching 

Dump registers using non-DMA register reads, which gives more accurate
results after a DMA transfer has failed.

Handle cmd/data timeouts more carefully: send a stop (CMD12) command before
finishing the request when a multi-block read/write times out.

Remove some static checker warnings.
on commit: 
drivers/mmc/host/rtsx_pci_sdmmc.c:194 sd_finish_request()
error: we previously assumed 'mrq' could be null (see line 158)
drivers/mmc/host/rtsx_pci_sdmmc.c:504 sd_get_rsp()
error: we previously assumed 'cmd' could be null (see line 434)
drivers/mmc/host/rtsx_pci_sdmmc.c:525 sd_pre_dma_transfer()
warn: we tested 'next' before and it was 'false'

Signed-off-by: Micky Ching 
---
 drivers/mmc/host/rtsx_pci_sdmmc.c |  119 -
 1 file changed, 65 insertions(+), 54 deletions(-)

diff --git a/drivers/mmc/host/rtsx_pci_sdmmc.c b/drivers/mmc/host/rtsx_pci_sdmmc.c
index 0d8904a..453e1d4 100644
--- a/drivers/mmc/host/rtsx_pci_sdmmc.c
+++ b/drivers/mmc/host/rtsx_pci_sdmmc.c
@@ -81,25 +81,24 @@ static inline void sd_clear_error(struct realtek_pci_sdmmc *host)
 }
 
 #ifdef DEBUG
+static inline void sd_print_reg(struct realtek_pci_sdmmc *host, u16 reg)
+{
+   u8 val = 0;
+
+   if (rtsx_pci_read_register(host->pcr, reg, &val) < 0)
+   dev_dbg(sdmmc_dev(host), "read 0x%04x failed\n", reg);
+   else
+   dev_dbg(sdmmc_dev(host), "0x%04X: 0x%02x\n", reg, val);
+}
+
 static void sd_print_debug_regs(struct realtek_pci_sdmmc *host)
 {
-   struct rtsx_pcr *pcr = host->pcr;
u16 i;
-   u8 *ptr;
-
-   /* Print SD host internal registers */
-   rtsx_pci_init_cmd(pcr);
-   for (i = 0xFDA0; i <= 0xFDAE; i++)
-   rtsx_pci_add_cmd(pcr, READ_REG_CMD, i, 0, 0);
-   for (i = 0xFD52; i <= 0xFD69; i++)
-   rtsx_pci_add_cmd(pcr, READ_REG_CMD, i, 0, 0);
-   rtsx_pci_send_cmd(pcr, 100);
 
-   ptr = rtsx_pci_get_cmd_data(pcr);
for (i = 0xFDA0; i <= 0xFDAE; i++)
-   dev_dbg(sdmmc_dev(host), "0x%04X: 0x%02x\n", i, *(ptr++));
+   sd_print_reg(host, i);
for (i = 0xFD52; i <= 0xFD69; i++)
-   dev_dbg(sdmmc_dev(host), "0x%04X: 0x%02x\n", i, *(ptr++));
+   sd_print_reg(host, i);
 }
 #else
 #define sd_print_debug_regs(host)
@@ -125,19 +124,27 @@ static void sd_request_timeout(unsigned long host_addr)
spin_lock_irqsave(&host->lock, flags);
 
if (!host->mrq) {
-   dev_err(sdmmc_dev(host), "error: no request exist\n");
-   goto out;
+   dev_err(sdmmc_dev(host), "error: request not exist\n");
+   spin_unlock_irqrestore(&host->lock, flags);
+   return;
}
 
-   if (host->cmd)
+   if (host->cmd && host->data)
+   dev_err(sdmmc_dev(host), "error: cmd and data conflict\n");
+
+   if (host->cmd) {
host->cmd->error = -ETIMEDOUT;
-   if (host->data)
-   host->data->error = -ETIMEDOUT;
+   dev_dbg(sdmmc_dev(host), "timeout for cmd %d\n",
+   host->cmd->opcode);
+   tasklet_schedule(&host->cmd_tasklet);
+   }
 
-   dev_dbg(sdmmc_dev(host), "timeout for request\n");
+   if (host->data) {
+   host->data->error = -ETIMEDOUT;
+   dev_dbg(sdmmc_dev(host), "timeout for data transfer\n");
+   tasklet_schedule(&host->data_tasklet);
+   }
 
-out:
-   tasklet_schedule(&host->finish_tasklet);
spin_unlock_irqrestore(&host->lock, flags);
 }
 
@@ -157,7 +164,8 @@ static void sd_finish_request(unsigned long host_addr)
mrq = host->mrq;
if (!mrq) {
dev_err(sdmmc_dev(host), "error: no request need finish\n");
-   goto out;
+   spin_unlock_irqrestore(&host->lock, flags);
+   return;
}
 
cmd = mrq->cmd;
@@ -167,11 +175,6 @@ static void sd_finish_request(unsigned long host_addr)
(mrq->stop && mrq->stop->error) ||
(cmd && cmd->error) || (data && data->error);
 
-   if (any_error) {
-   rtsx_pci_stop_cmd(pcr);
-   sd_clear_error(host);
-   }
-
if (data) {
if (any_error)
data->bytes_xfered = 0;
@@ -188,7 +191,6 @@ static void sd_finish_request(unsigned long host_addr)
host->cmd = NULL;
host->data = NULL;
 
-out:
spin_unlock_irqrestore(&host->lock, flags);
mutex_unlock(&pcr->pcr_mutex);
mmc_request_done(host->mmc, mrq);
@@ -373,8 +375,11 @@ static void sd_send_cmd(struct realtek_pci_sdmmc *host, struct mmc_command *cmd)
if (cmd->opcode == SD_SWITCH_VOLTAGE) {
err = rtsx_pci_write_register(pcr, SD_BUS_STAT,
0xFF, SD_CLK_TOGGLE_EN);
-   if (err < 0)
+   if (err < 0) {
+   rtsx_pci_write_register(pcr, SD_BUS_STAT,
+   SD_CLK_TOGGLE_EN | SD_CL

[PATCH v2 0/2] mmc: rtsx: add new cmd type handle and modify error handle

2014-03-26 Thread micky_ching
From: Micky Ching 

v2:
Fix checkpatch warning:
WARNING: Missing a blank line after declarations

v1:
Add handling for a new command type (R1 without CRC); without this
patch, MMC card initialization fails.

Handle request timeouts more carefully, which improves error
recovery. Print debug info using non-DMA mode, which gives more
accurate output after a DMA command has failed. Remove smatch warnings.

Micky Ching (2):
  mmc: rtsx: add R1-no-CRC mmc command type handle
  mmc: rtsx: modify error handle and remove smatch warnings

 drivers/mmc/host/rtsx_pci_sdmmc.c |  122 +
 1 file changed, 68 insertions(+), 54 deletions(-)

--
1.7.9.5


Re: [RFC PATCH 02/12] pci: host: pcie-dra7xx: add support for pcie-dra7xx controller

2014-03-26 Thread Kishon Vijay Abraham I


On Thursday 27 March 2014 09:13 AM, Jingoo Han wrote:
> On Wednesday, March 26, 2014 10:58 PM, Kishon Vijay Abraham I wrote:
>>
>> Added support for pcie controller in dra7xx. This driver re-uses
>> the designware core code that is already present in kernel.
>>
>> Signed-off-by: Kishon Vijay Abraham I 
> 
> Hi Kishon,
> Long time no see! I added trivial comments.

yeah, these were in my TODO for a long time. Sorry for it though.
> 
>> ---
>>  Documentation/devicetree/bindings/pci/ti-pci.txt |   35 ++
>>  drivers/pci/host/Kconfig |   10 +
>>  drivers/pci/host/Makefile|1 +
>>  drivers/pci/host/pcie-dra7xx.c   |  411 ++
> 
> How about using 'pci-' prefix?
> As it was discussed earlier, 'pci-' prefix is more proper.
> 
>>  4 files changed, 457 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/pci/ti-pci.txt
>>  create mode 100644 drivers/pci/host/pcie-dra7xx.c
> 
> [.]
> 
>> --- /dev/null
>> +++ b/drivers/pci/host/pcie-dra7xx.c
> 
> [.]
> 
>> +#define PCIECTRL_TI_CONF_IRQSTATUS_MAIN 0x0024
>> +#define PCIECTRL_TI_CONF_IRQENABLE_SET_MAIN 0x0028
> 
> I don't think that it's good to add vendor names such as TI
> to SFR names.
> 
> How about adding 'DRA7XX' or just removing 'TI'?
> 
> 1. PCIECTRL_DRA7XX_CONF_IRQSTATUS_MAIN

ok.
> 
> 2. PCIECTRL_CONF_IRQSTATUS_MAIN
> 
> [.]
> 
>> +enum dra7xx_pcie_device_type {
>> +DRA7XX_PCIE_UNKNOWN_TYPE,
>> +DRA7XX_PCIE_EP_TYPE,
>> +DRA7XX_PCIE_LEG_EP_TYPE,
>> +DRA7XX_PCIE_RC_TYPE,
>> +};
> 
> This driver can support only RC mode, so, these enum can be removed.
> 
> [.]
> 
>> +of_property_read_u32(node, "ti,device-type", &device_type);
>> +switch (device_type) {
>> +case DRA7XX_PCIE_RC_TYPE:
>> +dra7xx_pcie_writel(dra7xx->base,
>> +PCIECTRL_TI_CONF_DEVICE_TYPE, DEVICE_TYPE_RC);
>> +break;
>> +case DRA7XX_PCIE_EP_TYPE:
>> +dra7xx_pcie_writel(dra7xx->base,
>> +PCIECTRL_TI_CONF_DEVICE_TYPE, DEVICE_TYPE_EP);
>> +break;
>> +case DRA7XX_PCIE_LEG_EP_TYPE:
>> +dra7xx_pcie_writel(dra7xx->base,
>> +PCIECTRL_TI_CONF_DEVICE_TYPE, DEVICE_TYPE_LEG_EP);
>> +break;
>> +default:
>> +dev_dbg(dev, "UNKNOWN device type %d\n", device_type);
>> +}
> 
> Thus, this switch can be removed.

sure.

Thanks
Kishon


[PATCH 2/2] regmap: mmio: Add support for 1/2/8 bytes wide register address.

2014-03-26 Thread Xiubo Li
The regmap core and regmap-mmio already support 1/2/8-byte wide values,
so add support for 1/2/8-byte wide register addresses as well.

Signed-off-by: Xiubo Li 
---
 drivers/base/regmap/regmap-mmio.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/base/regmap/regmap-mmio.c b/drivers/base/regmap/regmap-mmio.c
index 4f1efce..ed080a4 100644
--- a/drivers/base/regmap/regmap-mmio.c
+++ b/drivers/base/regmap/regmap-mmio.c
@@ -26,18 +26,30 @@
 
 struct regmap_mmio_context {
void __iomem *regs;
+   unsigned reg_bytes;
unsigned val_bytes;
+   unsigned pad_bytes;
struct clk *clk;
 };
 
 static inline void regmap_mmio_regsize_check(size_t reg_size)
 {
-   BUG_ON(reg_size != 4);
+   switch (reg_size) {
+   case 1:
+   case 2:
+   case 4:
+#ifdef CONFIG_64BIT
+   case 8:
+#endif
+   break;
+   default:
+   BUG();
+   }
 }
 
 static inline void regmap_mmio_count_check(size_t count)
 {
-   BUG_ON(count < 4);
+   BUG_ON(count % 2 != 0);
 }
 
 static int regmap_mmio_gather_write(void *context,
@@ -91,9 +103,13 @@ static int regmap_mmio_gather_write(void *context,
 
 static int regmap_mmio_write(void *context, const void *data, size_t count)
 {
+   struct regmap_mmio_context *ctx = context;
+   u32 offset = ctx->reg_bytes + ctx->pad_bytes;
+
regmap_mmio_count_check(count);
 
-   return regmap_mmio_gather_write(context, data, 4, data + 4, count - 4);
+   return regmap_mmio_gather_write(context, data, ctx->reg_bytes,
+   data + offset, count - offset);
 }
 
 static int regmap_mmio_read(void *context,
@@ -219,6 +235,8 @@ static struct regmap_mmio_context *regmap_mmio_gen_context(struct device *dev,
 
ctx->regs = regs;
ctx->val_bytes = config->val_bits / 8;
+   ctx->reg_bytes = config->reg_bits / 8;
+   ctx->pad_bytes = config->pad_bits / 8;
ctx->clk = ERR_PTR(-ENODEV);
 
if (clk_id == NULL)
-- 
1.8.4
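
The buffer layout that regmap_mmio_write() relies on after this change can be
sketched in userspace: the register address occupies the first reg_bytes,
followed by pad_bytes of padding, and the values follow. This is only an
illustration of the offset arithmetic; struct split and split_write_buffer()
are hypothetical names, not part of regmap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of one write buffer: [reg][pad][val ...]. */
struct split {
	const unsigned char *reg;
	size_t reg_size;
	const unsigned char *val;
	size_t val_size;
};

/*
 * Split a count-byte buffer into its register and value parts.
 * Returns 0 on success, -1 if the buffer is too short to hold
 * the register address and padding.
 */
static int split_write_buffer(const void *data, size_t count,
			      size_t reg_bytes, size_t pad_bytes,
			      struct split *out)
{
	size_t offset = reg_bytes + pad_bytes;

	assert(data && out);
	if (count < offset)
		return -1;

	out->reg = data;
	out->reg_size = reg_bytes;
	out->val = (const unsigned char *)data + offset;
	out->val_size = count - offset;
	return 0;
}
```

Unlike the patch, the sketch rejects buffers shorter than reg_bytes +
pad_bytes, so that count - offset cannot wrap around.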




[PATCH 1/2] regmap: mmio: add regmap_mmio_{regsize, count}_check.

2014-03-26 Thread Xiubo Li
Signed-off-by: Xiubo Li 
---
 drivers/base/regmap/regmap-mmio.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/base/regmap/regmap-mmio.c b/drivers/base/regmap/regmap-mmio.c
index 81f9775..4f1efce 100644
--- a/drivers/base/regmap/regmap-mmio.c
+++ b/drivers/base/regmap/regmap-mmio.c
@@ -30,6 +30,16 @@ struct regmap_mmio_context {
struct clk *clk;
 };
 
+static inline void regmap_mmio_regsize_check(size_t reg_size)
+{
+   BUG_ON(reg_size != 4);
+}
+
+static inline void regmap_mmio_count_check(size_t count)
+{
+   BUG_ON(count < 4);
+}
+
 static int regmap_mmio_gather_write(void *context,
const void *reg, size_t reg_size,
const void *val, size_t val_size)
@@ -38,7 +48,7 @@ static int regmap_mmio_gather_write(void *context,
u32 offset;
int ret;
 
-   BUG_ON(reg_size != 4);
+   regmap_mmio_regsize_check(reg_size);
 
if (!IS_ERR(ctx->clk)) {
ret = clk_enable(ctx->clk);
@@ -81,7 +91,7 @@ static int regmap_mmio_gather_write(void *context,
 
 static int regmap_mmio_write(void *context, const void *data, size_t count)
 {
-   BUG_ON(count < 4);
+   regmap_mmio_count_check(count);
 
return regmap_mmio_gather_write(context, data, 4, data + 4, count - 4);
 }
@@ -94,7 +104,7 @@ static int regmap_mmio_read(void *context,
u32 offset;
int ret;
 
-   BUG_ON(reg_size != 4);
+   regmap_mmio_regsize_check(reg_size);
 
if (!IS_ERR(ctx->clk)) {
ret = clk_enable(ctx->clk);
-- 
1.8.4




[PATCH 1/2] hrtimer: don't add clock-base to active_bases if already present

2014-03-26 Thread Viresh Kumar
If active_bases already has an entry for a particular clock type, then we don't
need to rewrite it while queuing a hrtimer.

Signed-off-by: Viresh Kumar 
---
Initially I thought of doing this, but then thought it better to remove
active_bases completely, and so didn't send this one. Now it might find some
place for itself :).

 kernel/hrtimer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index da351ad..acfef5f 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -864,8 +864,9 @@ static int enqueue_hrtimer(struct hrtimer *timer,
 {
 	debug_activate(timer);
 
+	if (!timerqueue_getnext(&base->active))
+		base->cpu_base->active_bases |= 1 << base->index;
 	timerqueue_add(&base->active, &timer->node);
-	base->cpu_base->active_bases |= 1 << base->index;
 
 	/*
 	 * HRTIMER_STATE_ENQUEUED is or'ed to the current state to preserve the
-- 
1.7.12.rc2.18.g61b472e



[PATCH V2 2/2] hrtimer: use __ffs() to iterate over valid bits from active_bases

2014-03-26 Thread Viresh Kumar
Currently we are iterating over all possible (currently four) bits of
active_bases to see if corresponding clock bases are active. This is good enough
for cases where 3 or 4 bases are used but if only 1 or 2 are used then it makes
more sense to use __ffs() to find the right bit directly.

Suggested-by: Thomas Gleixner 
Signed-off-by: Viresh Kumar 
---
V1->V2: Instead of removing active_bases use __ffs() on it to make loop more
efficient.

I tried to use for_each_set_bit() first and then it looked overdone. And so used
a simple form, __ffs() with some code to clear bits.

 kernel/hrtimer.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index acfef5f..ea90228 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1265,6 +1265,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 {
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
ktime_t expires_next, now, entry_time, delta;
+   unsigned long active_bases = cpu_base->active_bases;
int i, retries = 0;
 
BUG_ON(!cpu_base->hres_active);
@@ -1284,15 +1285,11 @@ retry:
 */
cpu_base->expires_next.tv64 = KTIME_MAX;
 
-   for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-   struct hrtimer_clock_base *base;
+   while ((i = __ffs(active_bases))) {
+   struct hrtimer_clock_base *base = cpu_base->clock_base + i;
struct timerqueue_node *node;
ktime_t basenow;
 
-   if (!(cpu_base->active_bases & (1 << i)))
-   continue;
-
-   base = cpu_base->clock_base + i;
basenow = ktime_add(now, base->offset);
 
while ((node = timerqueue_getnext(&base->active))) {
@@ -1327,6 +1324,8 @@ retry:
 
__run_hrtimer(timer, &basenow);
}
+
+   active_bases &= ~(1 << i);
}
 
/*
-- 
1.7.12.rc2.18.g61b472e



linux-next: manual merge of the staging tree with the net-next tree

2014-03-26 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the staging tree got a conflict in
drivers/staging/wlags49_h2/wl_netdev.c between commit 8d4ade284a41
("wlags49_h2: Call dev_kfree/consume_skb_any instead of dev_kfree_skb")
from the net-next tree and commit fed3ffd8f7ef ("staging: wlags49_h2:
reindent wl_netdev.c") (and maybe others) from the staging tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au

diff --cc drivers/staging/wlags49_h2/wl_netdev.c
index 69bc0a01ae14,77e4be21e44b..
--- a/drivers/staging/wlags49_h2/wl_netdev.c
+++ b/drivers/staging/wlags49_h2/wl_netdev.c
@@@ -626,103 -605,105 +605,105 @@@ void wl_tx_timeout(struct net_device *d
   *  1 on error
   *
   
**/
- int wl_send( struct wl_private *lp )
+ int wl_send(struct wl_private *lp)
  {
  
- int status;
- DESC_STRCT  *desc;
- WVLAN_LFRAME*txF = NULL;
- struct list_head*element;
- int len;
+   int status;
+   DESC_STRCT *desc;
+   WVLAN_LFRAME *txF = NULL;
+   struct list_head *element;
+   int len;
  
/**/
  
- if( lp == NULL ) {
- DBG_ERROR( DbgInfo, "Private adapter struct is NULL\n" );
- return FALSE;
- }
- if( lp->dev == NULL ) {
- DBG_ERROR( DbgInfo, "net_device struct in wl_private is NULL\n" );
- return FALSE;
- }
- 
- /* Check for the availability of FIDs; if none are available, don't take any
-    frames off the txQ */
- if( lp->hcfCtx.IFB_RscInd == 0 ) {
- return FALSE;
- }
- 
- /* Reclaim the TxQ Elements and place them back on the free queue */
- if( !list_empty( &( lp->txQ[0] ))) {
- element = lp->txQ[0].next;
- 
- txF = (WVLAN_LFRAME * )list_entry( element, WVLAN_LFRAME, node );
- if( txF != NULL ) {
- lp->txF.skb  = txF->frame.skb;
- lp->txF.port = txF->frame.port;
- 
- txF->frame.skb  = NULL;
- txF->frame.port = 0;
- 
- list_del( &( txF->node ));
- list_add( element, &( lp->txFree ));
- 
- lp->txQ_count--;
- 
- if( lp->txQ_count < TX_Q_LOW_WATER_MARK ) {
- if( lp->netif_queue_on == FALSE ) {
- DBG_TX( DbgInfo, "Kickstarting Q: %d\n", lp->txQ_count );
- netif_wake_queue( lp->dev );
- WL_WDS_NETIF_WAKE_QUEUE( lp );
- lp->netif_queue_on = TRUE;
- }
- }
- }
- }
- 
- if( lp->txF.skb == NULL ) {
- return FALSE;
- }
- 
- /* If the device has resources (FIDs) available, then Tx the packet */
- /* Format the TxRequest and send it to the adapter */
- len = lp->txF.skb->len < ETH_ZLEN ? ETH_ZLEN : lp->txF.skb->len;
- 
- desc= &( lp->desc_tx );
- desc->buf_addr  = lp->txF.skb->data;
- desc->BUF_CNT   = len;
- desc->next_desc_addr= NULL;
- 
- status = hcf_send_msg( &( lp->hcfCtx ), desc, lp->txF.port );
- 
- if( status == HCF_SUCCESS ) {
- lp->dev->trans_start = jiffies;
- 
- DBG_TX( DbgInfo, "Transmit...\n" );
- 
- if( lp->txF.port == HCF_PORT_0 ) {
- lp->stats.tx_packets++;
- lp->stats.tx_bytes += lp->txF.skb->len;
- }
+   if (lp == NULL) {
+   DBG_ERROR(DbgInfo, "Private adapter struct is NULL\n");
+   return FALSE;
+   }
+   if (lp->dev == NULL) {
+   DBG_ERROR(DbgInfo, "net_device struct in wl_private is NULL\n");
+   return FALSE;
+   }
+ 
+   /*
+* Check for the availability of FIDs; if none are available,
+* don't take any frames off the txQ
+*/
+   if (lp->hcfCtx.IFB_RscInd == 0)
+   return FALSE;
+ 
+   /* Reclaim the TxQ Elements and place them back on the free queue */
+   if (!list_empty(&(lp->txQ[0]))) {
+   element = lp->txQ[0].next;
+ 
+   txF = (WVLAN_LFRAME *) list_entry(element, WVLAN_LFRAME, node);
+   if (txF != NULL) {
+   lp->txF.skb = txF->frame.skb;
+   lp->txF.port = txF->frame.port;
+ 
+   txF->frame.skb = NULL;
+   txF->frame.port = 0;
+ 
+   list_del(&(txF->node));
+   list_add(element, &(lp->txFree));
+ 
+   lp->txQ_count--;
+ 
+   if (lp->txQ_count < TX_Q_LOW_WATER_MARK) {
+   if (lp->netif_queue_on == FALSE) {
+   DBG_TX(DbgInfo, "Kickstarting Q: %d\n",
+  lp->txQ_

[PATCH 4/5] tracing: Add hash trigger to Documentation

2014-03-26 Thread Tom Zanussi
Add documentation and usage examples for 'hash' triggers.

Signed-off-by: Tom Zanussi 
---
 Documentation/trace/events.txt | 81 ++
 1 file changed, 81 insertions(+)

diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt
index c94435d..aed77bc 100644
--- a/Documentation/trace/events.txt
+++ b/Documentation/trace/events.txt
@@ -494,3 +494,84 @@ The following commands are supported:
 
   Note that there can be only one traceon or traceoff trigger per
   triggering event.
+
+- hash
+
+  This command updates a hash table with a key composed of one or more
+  trace event format fields and a set of values consisting of one or
+  more running totals of either field values or single counts.
+
+  For example, the following trigger hashes all kmalloc events using
+  'call_site' as the hash key.  For each entry, it keeps a running
+  count of event hits ('hitcount', which is optional - counts are
+  always tallied and displayed in the output), and running sums of
+  bytes_alloc, and bytes_req:
+
+  # echo 'hash:call_site:hitcount,bytes_alloc,bytes_req' > \
+  /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+  The following uses the stacktrace at the call_site as a hash key
+  instead of just the straight call_site:
+
+  # echo 'hash:stacktrace:bytes_alloc,bytes_req' > \
+  /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+  The following uses the combination of call_site and pid as a
+  composite hash key, effectively implementing a per-pid nested hash
+  by call_site:
+
+  # echo 'hash:call_site,common_pid:bytes_alloc,bytes_req' > \
+  /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+  To keep a per-pid count of the number of bytes asked for in file
+  reads:
+
+  # echo 'hash:common_pid:count' > \
+  /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
+
+  To keep a per-pid, per-file count of the number of bytes asked for
+  in file reads:
+
+  # echo 'hash:common_pid,fd:count' > \
+  /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
+
+  To keep a per-pid, per-file count of the number of bytes actually
+  gotten in file reads (but only if the return value wasn't negative):
+
+  # echo 'hash:common_pid,fd:ret if ret > 0' > \
+  /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
+
+  The format is:
+
+  hash:<key>,<key>:<val>,<val>,<val>[:sort_keys] if filter > event/trigger
+
+  More formally,
+
+  # echo hash:key(s):value(s)[:sort_keys()][ if filter] > event/trigger
+
+  To remove the above commands:
+
+  # echo '!hash:call_site:1,bytes_alloc,bytes_req' > \
+  /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+
+  Note that there can be any number of hash triggers per triggering
+  event.
+
+  A '-' operator is available for taking differences between numeric
+  fields.
+
+  Sorting:
+
+The default sort key is 'hitcount' which is always available.
+Appending ':sort=val1,val2' will sort the output using val1 as the
+primary key and val2 as the secondary.
+
+  Modifiers:
+
+Various fields can have a .<modifier> appended to them, which will
+modify how they're displayed:
+
+  .hex  - display a numeric value as hex
+  .sym  - display an address as a symbol if possible
+  .syscall  - map a number representing syscall id to its syscall name
+  .execname - map a number representing a pid to its process name
-- 
1.8.3.1
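
The aggregation that the documentation describes for
'hash:call_site:hitcount,bytes_alloc,bytes_req' can be pictured with a small
userspace sketch: each event is looked up by its key, and the running totals
of that entry are updated. This is not the implementation from patch 5/5; the
table layout, the multiplicative hash, and all names below are assumptions
made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64	/* 2^6 slots, arbitrary for the sketch */

/* One aggregated entry: key plus running totals. */
struct hash_entry {
	uint64_t call_site;	/* key; 0 marks an unused slot */
	uint64_t hitcount;
	uint64_t bytes_req;
	uint64_t bytes_alloc;
};

struct hash_table {
	struct hash_entry slots[TABLE_SIZE];
};

/* Find the entry for key, claiming an empty slot if needed; NULL if full. */
static struct hash_entry *hash_lookup(struct hash_table *t, uint64_t key)
{
	/* Multiplicative hash; the top 6 bits select one of 64 slots. */
	size_t start = (size_t)((key * 0x9E3779B97F4A7C15ULL) >> 58);

	for (size_t probe = 0; probe < TABLE_SIZE; probe++) {
		struct hash_entry *e = &t->slots[(start + probe) % TABLE_SIZE];

		if (e->call_site == key)
			return e;
		if (e->call_site == 0) {
			e->call_site = key;
			return e;
		}
	}
	return NULL;
}

/* Account one kmalloc-like event under its call_site key. */
static int hash_event(struct hash_table *t, uint64_t call_site,
		      uint64_t bytes_req, uint64_t bytes_alloc)
{
	struct hash_entry *e;

	assert(call_site != 0);	/* 0 is reserved for empty slots */
	e = hash_lookup(t, call_site);
	if (!e)
		return -1;
	e->hitcount++;
	e->bytes_req += bytes_req;
	e->bytes_alloc += bytes_alloc;
	return 0;
}
```

A composite key such as 'call_site,common_pid' would only change what is
hashed and compared; the per-entry sums stay the same.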



Re: [RFC II] Splitting scheduler into two halves

2014-03-26 Thread Mike Galbraith
On Thu, 2014-03-27 at 02:37 +0800, Yuyang du wrote: 
> Hi all,
> 
> This is continued after the first RFC about splitting the scheduler. Still
> work-in-progress, and call for feedback.
> 
> The question addressed here is how load balance should be changed. And I think
> the question then goes to how to *reuse* common code as much as possible and
> meanwhile be able to serve various objectives.
> 
> So these are the basic semantics needed in current load balance:

I'll probably regret it, but I'm gonna speak my mind.  I think this two
halves concept is fundamentally broken. 

> 1. [ At balance point ] on this_cpu push task on that_cpu to [ third_cpu ]

Load balancing is a necessary part of the fastpath as well as slow path,
you can't just define balance point, and have that mean a point at which
we can separate core functionality from peripheral.  For example, rt
class has push/pull at schedule time, fair class select_idle_sibling()
at wakeup, both in the fastpath, to minimize latency.  It is all load
balancing, is push pull, fastpath does exactly the same things as slow
path, for the exact same reason, only resource investment varies.

I don't think you can separate the scheduler into two halves like this,
load balancing is an integral part and fundamental consequence of being
a multi-queue scheduler.  Scheduling and balancing are not two halves
that make a whole, and can thus be separated, they are one.

-Mike



[PATCH 2/5] tracing: Add event record param to trigger_ops.func()

2014-03-26 Thread Tom Zanussi
Some triggers may need access to the trace event, so pass it in.  Also
fix up the existing trigger funcs and their callers.

Signed-off-by: Tom Zanussi 
---
 include/linux/ftrace_event.h|  7 ---
 kernel/trace/trace.h|  6 --
 kernel/trace/trace_events_trigger.c | 35 ++-
 3 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 4cdb3a1..5961964 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -368,7 +368,8 @@ extern int call_filter_check_discard(struct ftrace_event_call *call, void *rec,
 extern enum event_trigger_type event_triggers_call(struct ftrace_event_file *file,
   void *rec);
 extern void event_triggers_post_call(struct ftrace_event_file *file,
-enum event_trigger_type tt);
+enum event_trigger_type tt,
+void *rec);
 
 /**
  * ftrace_trigger_soft_disabled - do triggers and test if soft disabled
@@ -451,7 +452,7 @@ event_trigger_unlock_commit(struct ftrace_event_file *file,
trace_buffer_unlock_commit(buffer, event, irq_flags, pc);
 
if (tt)
-   event_triggers_post_call(file, tt);
+   event_triggers_post_call(file, tt, entry);
 }
 
 /**
@@ -484,7 +485,7 @@ event_trigger_unlock_commit_regs(struct ftrace_event_file *file,
irq_flags, pc, regs);
 
if (tt)
-   event_triggers_post_call(file, tt);
+   event_triggers_post_call(file, tt, entry);
 }
 
 enum {
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 26c55ff..9032cf3 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1087,7 +1087,8 @@ struct event_trigger_data {
  * @func: The trigger 'probe' function called when the triggering
  * event occurs.  The data passed into this callback is the data
  * that was supplied to the event_command @reg() function that
- * registered the trigger (see struct event_command).
+ * registered the trigger (see struct event_command) along with
+ * the trace record, rec.
  *
  * @init: An optional initialization function called for the trigger
  * when the trigger is registered (via the event_command reg()
@@ -1112,7 +1113,8 @@ struct event_trigger_data {
  * (see trace_event_triggers.c).
  */
 struct event_trigger_ops {
-   void(*func)(struct event_trigger_data *data);
+   void(*func)(struct event_trigger_data *data,
+   void *rec);
int (*init)(struct event_trigger_ops *ops,
struct event_trigger_data *data);
void(*free)(struct event_trigger_ops *ops,
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 8efbb69..323846e 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -74,7 +74,7 @@ event_triggers_call(struct ftrace_event_file *file, void *rec)
 
list_for_each_entry_rcu(data, &file->triggers, list) {
if (!rec) {
-   data->ops->func(data);
+   data->ops->func(data, rec);
continue;
}
filter = rcu_dereference(data->filter);
@@ -84,7 +84,7 @@ event_triggers_call(struct ftrace_event_file *file, void *rec)
tt |= data->cmd_ops->trigger_type;
continue;
}
-   data->ops->func(data);
+   data->ops->func(data, rec);
}
return tt;
 }
@@ -104,13 +104,14 @@ EXPORT_SYMBOL_GPL(event_triggers_call);
  */
 void
 event_triggers_post_call(struct ftrace_event_file *file,
-enum event_trigger_type tt)
+enum event_trigger_type tt,
+void *rec)
 {
struct event_trigger_data *data;
 
list_for_each_entry_rcu(data, &file->triggers, list) {
if (data->cmd_ops->trigger_type & tt)
-   data->ops->func(data);
+   data->ops->func(data, rec);
}
 }
 EXPORT_SYMBOL_GPL(event_triggers_post_call);
@@ -751,7 +752,7 @@ static int set_trigger_filter(char *filter_str,
 }
 
 static void
-traceon_trigger(struct event_trigger_data *data)
+traceon_trigger(struct event_trigger_data *data, void *rec)
 {
if (tracing_is_on())
return;
@@ -760,7 +761,7 @@ traceon_trigger(struct event_trigger_data *data)
 }
 
 static void
-traceon_count_trigger(struct event_trigger_data *data)
+traceon_count_trigger(struct event_trigger_data *data, void *rec)
 {
if (tracing_is_on())
return;
@@ -775,7 +776,7 @@ traceon_count

[PATCH 0/5] tracing: Hash triggers

2014-03-26 Thread Tom Zanussi
Hi Steve,

This is my current code for the hash triggers mentioned in the other
thread.

I've been using it for a project here, and as such it works fine for
me, but it's nowhere near anything like a mergeable state; I'm only
sending/posting it because I didn't realize until today that you were
presenting on triggers at Collab Summit, and if as mentioned you're
thinking of adding a bullet or two for it wrt future/3.16 work, it
might be useful to have the code to play around with too...

Tom

The following changes since commit f217c44ebd41ce7369d2df07622b2839479183b0:

  Merge tag 'trace-fixes-v3.14-rc7-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace (2014-03-26 09:09:18 -0700)

are available in the git repository at:


  git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hashtriggers-v0
  http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hashtriggers-v0

Tom Zanussi (5):
  tracing: Make ftrace_event_field checking functions available
  tracing: Add event record param to trigger_ops.func()
  tracing: Add get_syscall_name()
  tracing: Add hash trigger to Documentation
  tracing: Add 'hash' event trigger command

 Documentation/trace/events.txt      |   81 ++
 include/linux/ftrace_event.h        |    8 +-
 kernel/trace/trace.h                |   27 +-
 kernel/trace/trace_events_filter.c  |   15 +-
 kernel/trace/trace_events_trigger.c | 1439 ++-
 kernel/trace/trace_syscalls.c       |   11 +
 6 files changed, 1546 insertions(+), 35 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/5] tracing: Add get_syscall_name()

2014-03-26 Thread Tom Zanussi
Add a utility function to grab the syscall name from the syscall
metadata, given a syscall id.

Signed-off-by: Tom Zanussi 
---
 kernel/trace/trace.h  |  9 +
 kernel/trace/trace_syscalls.c | 11 +++
 2 files changed, 20 insertions(+)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 9032cf3..457fb4f 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1277,4 +1277,13 @@ int perf_ftrace_event_register(struct ftrace_event_call *call,
 #define perf_ftrace_event_register NULL
 #endif
 
+#ifdef CONFIG_FTRACE_SYSCALLS
+const char *get_syscall_name(int syscall);
+#else
+static inline const char *get_syscall_name(int syscall)
+{
+   return NULL;
+}
+#endif /* CONFIG_FTRACE_SYSCALLS */
+
 #endif /* _LINUX_KERNEL_TRACE_H */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 759d5e0..1abb3396 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -106,6 +106,17 @@ static struct syscall_metadata *syscall_nr_to_meta(int nr)
return syscalls_metadata[nr];
 }
 
+const char *get_syscall_name(int syscall)
+{
+   struct syscall_metadata *entry;
+
+   entry = syscall_nr_to_meta(syscall);
+   if (!entry)
+   return NULL;
+
+   return entry->name;
+}
+
 static enum print_line_t
 print_syscall_enter(struct trace_iterator *iter, int flags,
struct trace_event *event)
-- 
1.8.3.1
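
As a rough userspace illustration of the lookup this patch adds (not the
kernel code itself): a syscall number is mapped to its metadata entry, and
the name is returned only if an entry exists. The demo_* names, the tiny
table, and the explicit range check are assumptions made so the sketch
compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct syscall_metadata; only the name is modelled. */
struct demo_syscall_meta {
	const char *name;
};

static const struct demo_syscall_meta demo_read  = { "sys_read" };
static const struct demo_syscall_meta demo_write = { "sys_write" };

/* Hypothetical table; the NULL slot models a syscall number without metadata. */
static const struct demo_syscall_meta *const demo_metadata[] = {
	&demo_read, &demo_write, NULL,
};

#define DEMO_NR_SYSCALLS (sizeof(demo_metadata) / sizeof(demo_metadata[0]))

/* Assumed behaviour: out-of-range numbers yield NULL rather than a bad read. */
static const struct demo_syscall_meta *demo_nr_to_meta(int nr)
{
	if (nr < 0 || (size_t)nr >= DEMO_NR_SYSCALLS)
		return NULL;
	return demo_metadata[nr];
}

/* Same shape as get_syscall_name(): NULL when there is no metadata entry. */
const char *demo_get_syscall_name(int syscall)
{
	const struct demo_syscall_meta *entry = demo_nr_to_meta(syscall);

	return entry ? entry->name : NULL;
}
```

Callers have to treat NULL as "name unknown", which is also what the
!CONFIG_FTRACE_SYSCALLS stub in the patch always returns.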



[PATCH 1/5] tracing: Make ftrace_event_field checking functions available

2014-03-26 Thread Tom Zanussi
Make is_string_field() and is_function_field() accessible outside of
trace_events_filter.c for other users of ftrace_event_fields.

Signed-off-by: Tom Zanussi 
---
 kernel/trace/trace.h   | 12 
 kernel/trace/trace_events_filter.c | 12 
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 02b592f..26c55ff 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1012,6 +1012,18 @@ struct filter_pred {
unsigned short  right;
 };
 
+static inline bool is_string_field(struct ftrace_event_field *field)
+{
+   return field->filter_type == FILTER_DYN_STRING ||
+  field->filter_type == FILTER_STATIC_STRING ||
+  field->filter_type == FILTER_PTR_STRING;
+}
+
+static inline bool is_function_field(struct ftrace_event_field *field)
+{
+   return field->filter_type == FILTER_TRACE_FN;
+}
+
 extern enum regex_type
 filter_parse_regex(char *buff, int len, char **search, int *not);
 extern void print_event_filter(struct ftrace_event_file *file,
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 8a86319..60a8e3f 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -947,18 +947,6 @@ int filter_assign_type(const char *type)
return FILTER_OTHER;
 }
 
-static bool is_function_field(struct ftrace_event_field *field)
-{
-   return field->filter_type == FILTER_TRACE_FN;
-}
-
-static bool is_string_field(struct ftrace_event_field *field)
-{
-   return field->filter_type == FILTER_DYN_STRING ||
-  field->filter_type == FILTER_STATIC_STRING ||
-  field->filter_type == FILTER_PTR_STRING;
-}
-
 static int is_legal_op(struct ftrace_event_field *field, int op)
 {
if (is_string_field(field) &&
-- 
1.8.3.1



Re: [PATCH v2 02/03]: hwrng: create filler thread

2014-03-26 Thread H. Peter Anvin
On 03/26/2014 06:11 PM, Andy Lutomirski wrote:
> 
> TBH I'm highly skeptical of this kind of entropy estimation.
> /dev/random is IMO just silly, since you need to have very
> conservative entropy estimates for the concept to really work, and
> that ends up being hideously slow.

In the absence of a hardware entropy source, it is, but for long-lived
keys, delay is better than bad key generation.

A major reason for entropy estimation is to control the amount of
backpressure.  If you don't have backpressure, you only have generation
pressure, and you can't put your system to sleep when the hwrng keeps
outputting data.  Worse, if your entropy source is inexhaustible, you
might end up spending all your CPU time processing its output.

> Also, in the /dev/random sense,
> most hardware RNGs have no entropy at all, since they're likely to be
> FIPS-approved DRBGs that don't have a real non-deterministic source.

Such a device has no business being a Linux hwrng device.  We already
have a PRNG (DRBG) in the kernel, the *only* purpose for a hwrng device
is to be an entropy source.

> For the kernel's RNG to be secure, I think it should have the property
> that it still works if you rescale all the entropy estimates by any
> constant that's decently close to 1.

That is correct.

> If entropy estimates are systematically too low, then a naive
> implementation results in an excessively long window during early
> bootup in which /dev/urandom is completely insecure.

Eh?  What mechanism would make /dev/urandom any less secure due to
entropy underestimation?  The whole *point* is that we should
systematically underestimate entropy -- and we do: according to research
papers which have analyzed the state of things, we do so by orders of
magnitude, which is the only possible way to do it for non-hwrng sources.

> If entropy estimates are systematically too high, then a naive
> implementation fails to do a catastrophic reseed, and the RNG can be
> brute-forced.

This again is unacceptable.  We really should not overestimate.

> So I think that the core code should do something along the lines of
> using progressively larger reseeds.  Since I think that /dev/random is
> silly, this means that we only really care about the extent to which
> "entropy" measures entropy conditioned on whatever an attacker can
> actually compute.  Since this could vary widely between devices (e.g.
> if your TPM is malicious), I think that the best we can do is to
> collect ~256 bits from everything available, shove it all in to the
> core together, and repeat.  For all I know, the core code already does
> this.
> 
> The upshot is that the actual rescaling factor should barely matter.
> 50% is probably fine.  So is 100% and 25%.  10% is probably asking for
> trouble during early boot if all you have is a TPM.

I don't see why small factors should be a problem at all (except that it
discourages /dev/random usage.)  Keep in mind we still add the entropy
-- we just don't credit its existence.

TPMs, in particular, should almost certainly be massively derated based
on what little we know about TPM.

As a concrete example: RDRAND is a hardware entropy source that is
architecturally allowed to be diluted by a DRBG up to 512 times.  As far
as I know, no shipping piece of hardware is anywhere near 512 in this
respect.  rngd currently does 512:1 data reduction, but injecting the raw
output at 1/512 credit ought to give a much better result in terms of
entropy.

-hpa
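
The derating arithmetic discussed above can be sketched as follows. This is
only an illustration of "add all the data, credit a fraction of it"; the
function name and the choice to round down are assumptions, not code from
rngd or the kernel.

```c
#include <stddef.h>

/*
 * Entropy bits to credit for nbytes of raw input from a source derated by
 * `derate` (512 means: credit 1 bit per 512 input bits). All nbytes would
 * still be mixed into the pool; only the credit is scaled. The division
 * rounds down so the credit is never an overestimate.
 */
size_t demo_credit_bits(size_t nbytes, unsigned int derate)
{
	if (derate == 0)
		return 0;	/* assumed policy: no ratio given, credit nothing */
	return (nbytes * 8) / derate;
}
```

With a 1/512 credit, 64 raw bytes earn a single bit of credit, and 16 KiB of
raw output is needed before 256 bits are credited.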



Re: [RFC 1/1] fs/reiserfs/journal.c: Remove obsolete __GFP_NOFAIL

2014-03-26 Thread David Rientjes
On Wed, 26 Mar 2014, ty...@mit.edu wrote:

> But that's another new user of GFP_NOFAIL (and one added three years
> after David tried to declare There Shalt Be No New Users of
> GFP_NOFAIL), and sure, we could probably patch around that by having
> places where there's no other alternative to keep a preallocated batch
> of pages and/or allocated structures at each code site.  But that's as
> bad as looping at each code site; in fact, wouldn't it be better if
> the allocator wants to avoid looping, to have a separate batch of
> pages (ala GFP_ATOMIC) which is reserved just for GFP_NOFAIL
> allocations when all else fails?
> 

I didn't declare nobody should be adding __GFP_NOFAIL three years ago, 
rather three months ago I proposed a patch to fix __GFP_NOFAIL for 
GFP_ATOMIC allocations you're talking about above since, guess what, 
GFP_ATOMIC | __GFP_NOFAIL today easily returns NULL.  I tried fixing that 
failable-__GFP_NOFAIL problem with 
http://marc.info/?l=linux-kernel&m=138662620812698 but Andrew requested a 
WARN_ON_ONCE() instead since nobody is currently doing that and we agreed 
to warn against new users.

So we should either return to my earlier patch to actually make 
__GFP_NOFAIL not fail, or improve (but not remove) the checkpatch warning 
for these failable cases.  I couldn't care less if we add 5,000 new 
__GFP_NOFAIL users tomorrow, I just care that it does what is expected if 
people are going to be adding them.


RE: [alsa-devel] [PATCH] ASoC: Add support for multi register mux

2014-03-26 Thread Songhee Baek
> -----Original Message-----
> From: alsa-devel-boun...@alsa-project.org [mailto:alsa-devel-bounces@alsa-
> project.org] On Behalf Of Mark Brown
> Sent: Wednesday, March 26, 2014 6:09 PM
> To: Lars-Peter Clausen
> Cc: Songhee Baek; Arun Shamanna Lakshmi; alsa-de...@alsa-project.org;
> swar...@wwwdotorg.org; ti...@suse.de; lgirdw...@gmail.com; linux-
> ker...@vger.kernel.org
> Subject: Re: [alsa-devel] [PATCH] ASoC: Add support for multi register mux
> 
> On Wed, Mar 26, 2014 at 08:38:47PM +0100, Lars-Peter Clausen wrote:
> > On 03/26/2014 01:02 AM, Arun Shamanna Lakshmi wrote:
> 
> > The way you describe this it seems to me that a value array for this
> > kind of mux would look like.
> 
> > 0x, 0x, 0x0001
> > 0x, 0x, 0x0002
> > 0x, 0x, 0x0003
> > 0x, 0x, 0x0004
> > 0x, 0x, 0x0008
> 
> > That seems to be extremely tedious. If the MUX uses a one hot encoding
> > how about storing the index of the bit in the values array and use (1
> > << value) when writing the value to the register?
> 
> Or hide it behind utility macros at any rate; I've got this horrible feeling
> that as soon as we have this people will notice that they have more standard
> enums that are splatted over multiple registers (I think from memory I've
> seen them but they got fudged).
> 
> > [...]
> > >  /* enumerated kcontrol */
> > >  struct soc_enum {
> 
> > There doesn't actually seem to be any code that is shared between normal
> > enums and wide enums. This patch doubles the size of the soc_enum struct,
> > how about having a separate struct for wide enums?
> 
> Or if they are going to share the same struct then they shouldn't be
> duplicating the code and instead keying off num_regs (which was my first
> thought earlier on when I saw the separate functions).  We definitely
> shouldn't be sharing the data without also sharing the code I think.
> 
> > >-  int reg;
> > >+  int reg[SOC_ENUM_MAX_REGS];
> > >   unsigned char shift_l;
> > >   unsigned char shift_r;
> > >   unsigned int items;
> > >-  unsigned int mask;
> > >+  unsigned int mask[SOC_ENUM_MAX_REGS];
> 
> > If you make mask and reg pointers instead of arrays this should be
> > much more flexible and not be limited to 3 registers.
> 
> Right, which pushes towards not sharing.  Though with an arrayified mask the
> specification by shift syntax would get to be slightly obscure (is it
> relative to the enums or the registers?) so perhaps we don't want to do that
> at all if we've got specification by shift.  If we do that then we could get
> away with a variable length array at the end of the struct though I think
> that may be painful for the static declarations.  Someone would need to look
> to see what works.

Making a separate soc_enum_wide is better than reusing soc_enum for this use
case. If we add a separate soc_enum_wide, we need to update the following
functions:
  dapm_connect_mux
  soc_dapm_mux_update_power
  snd_soc_dapm_mux_update_power
These functions use the texts field of the soc_enum struct, so I think we can
pass the texts pointer instead of the soc_enum struct pointer. Is it OK to do
this?
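
Lars-Peter's suggestion quoted above (store the bit index and write
1 << value) could look roughly like the sketch below for a one-hot mux
spread over several 32-bit registers. The demo_* names, the 32-bit register
width, and the convention that bit 0 lives in register 0 are assumptions
for illustration, not part of the posted patch.

```c
#include <stdint.h>

#define DEMO_REG_BITS 32u	/* assumed register width */

struct demo_mux_write {
	unsigned int reg_index;	/* which of the mux's registers holds the bit */
	uint32_t     value;	/* one-hot value to write to that register */
};

/*
 * Turn a stored bit index into the register to touch and the value to
 * write there. A driver would also have to clear the mux field in the
 * other registers so that only one bit is set overall.
 */
struct demo_mux_write demo_one_hot(unsigned int bit)
{
	struct demo_mux_write w;

	w.reg_index = bit / DEMO_REG_BITS;
	w.value = (uint32_t)1 << (bit % DEMO_REG_BITS);
	return w;
}
```

The values table then holds small integers (0, 1, 2, ... 95 for three
registers) instead of one full-width word per register per entry.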



Re: [PATCH -mm 1/4] sl[au]b: do not charge large allocations to memcg

2014-03-26 Thread Greg Thelen

On Wed, Mar 26 2014, Vladimir Davydov  wrote:

> We don't track any random page allocation, so we shouldn't track kmalloc
> that falls back to the page allocator.

This seems like a change which will lead to confusing (and arguably
improper) kernel behavior.  I prefer the behavior prior to this patch.

Before this change both of the following allocations are charged to
memcg (assuming kmem accounting is enabled):
 a = kmalloc(KMALLOC_MAX_CACHE_SIZE, GFP_KERNEL)
 b = kmalloc(KMALLOC_MAX_CACHE_SIZE + 1, GFP_KERNEL)

After this change only 'a' is charged; 'b' goes directly to page
allocator which no longer does accounting.
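
The size cliff described above can be modelled with a small sketch. It
assumes kmem accounting is enabled, uses a made-up threshold in place of
KMALLOC_MAX_CACHE_SIZE, and the demo_* names are hypothetical; it only
mirrors the behavior described in this mail, not the allocator code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold standing in for KMALLOC_MAX_CACHE_SIZE. */
#define DEMO_KMALLOC_MAX_CACHE_SIZE ((size_t)8192)

enum demo_alloc_path { DEMO_SLAB_CACHE, DEMO_PAGE_ALLOCATOR };

/* Requests above the threshold bypass the slab caches entirely. */
enum demo_alloc_path demo_kmalloc_path(size_t size)
{
	return size <= DEMO_KMALLOC_MAX_CACHE_SIZE ? DEMO_SLAB_CACHE
						    : DEMO_PAGE_ALLOCATOR;
}

/*
 * Is the allocation charged to the memcg? Slab-cache allocations always
 * are; page-allocator fallbacks are charged only without the patch.
 */
bool demo_charged(size_t size, bool patched)
{
	if (demo_kmalloc_path(size) == DEMO_SLAB_CACHE)
		return true;
	return !patched;
}
```

One extra byte in the request flips the result from charged to uncharged
once the patch is applied, which is the surprise being objected to here.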

> Signed-off-by: Vladimir Davydov 
> Cc: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: Glauber Costa 
> Cc: Christoph Lameter 
> Cc: Pekka Enberg 
> ---
>  include/linux/slab.h |2 +-
>  mm/memcontrol.c  |   27 +--
>  mm/slub.c|4 ++--
>  3 files changed, 4 insertions(+), 29 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 3dd389aa91c7..8a928ff71d93 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -363,7 +363,7 @@ kmalloc_order(size_t size, gfp_t flags, unsigned int order)
>  {
>   void *ret;
>  
> - flags |= (__GFP_COMP | __GFP_KMEMCG);
> + flags |= __GFP_COMP;
>   ret = (void *) __get_free_pages(flags, order);
>   kmemleak_alloc(ret, size, 1, flags);
>   return ret;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b4b6aef562fa..81a162d01d4d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3528,35 +3528,10 @@ __memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **_memcg, int order)
>  
>   *_memcg = NULL;
>  
> - /*
> -  * Disabling accounting is only relevant for some specific memcg
> -  * internal allocations. Therefore we would initially not have such
> -  * check here, since direct calls to the page allocator that are marked
> -  * with GFP_KMEMCG only happen outside memcg core. We are mostly
> -  * concerned with cache allocations, and by having this test at
> -  * memcg_kmem_get_cache, we are already able to relay the allocation to
> -  * the root cache and bypass the memcg cache altogether.
> -  *
> -  * There is one exception, though: the SLUB allocator does not create
> -  * large order caches, but rather service large kmallocs directly from
> -  * the page allocator. Therefore, the following sequence when backed by
> -  * the SLUB allocator:
> -  *
> -  *  memcg_stop_kmem_account();
> -  *  kmalloc()
> -  *  memcg_resume_kmem_account();
> -  *
> -  * would effectively ignore the fact that we should skip accounting,
> -  * since it will drive us directly to this function without passing
> -  * through the cache selector memcg_kmem_get_cache. Such large
> -  * allocations are extremely rare but can happen, for instance, for the
> -  * cache arrays. We bring this test here.
> -  */
> - if (!current->mm || current->memcg_kmem_skip_account)
> + if (!current->mm)
>   return true;
>  
>   memcg = get_mem_cgroup_from_mm(current->mm);
> -
>   if (!memcg_can_account_kmem(memcg)) {
>   css_put(&memcg->css);
>   return true;
> diff --git a/mm/slub.c b/mm/slub.c
> index 5e234f1f8853..c2e58a787443 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3325,7 +3325,7 @@ static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
>   struct page *page;
>   void *ptr = NULL;
>  
> - flags |= __GFP_COMP | __GFP_NOTRACK | __GFP_KMEMCG;
> + flags |= __GFP_COMP | __GFP_NOTRACK;
>   page = alloc_pages_node(node, flags, get_order(size));
>   if (page)
>   ptr = page_address(page);
> @@ -3395,7 +3395,7 @@ void kfree(const void *x)
>   if (unlikely(!PageSlab(page))) {
>   BUG_ON(!PageCompound(page));
>   kfree_hook(x);
> - __free_memcg_kmem_pages(page, compound_order(page));
> + __free_pages(page, compound_order(page));
>   return;
>   }
>   slab_free(page->slab_cache, page, object, _RET_IP_);
> -- 
> 1.7.10.4



Re: [PATCH 11/14] hrtimer: remove active_bases field from struct hrtimer_cpu_base

2014-03-26 Thread Viresh Kumar
On 26 March 2014 22:58, Thomas Gleixner  wrote:
> Instead of removing it we should actually use ffs and avoid the whole
> looping. That was the intention in the first place, but I never wrote
> the patch...

I thought about that, but using ffs for a field of which only 4 bits
are useful didn't look too convincing to me :)

But it will probably make things slightly better in case this routine is
heavily used.
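
For reference, walking only the set bits of a small mask with ffs() can be
sketched in userspace as below. This uses the POSIX ffs() from <strings.h>;
the demo_* names and the 4-base limit are assumptions for illustration and
this is not the hrtimer code.

```c
#include <strings.h>	/* POSIX ffs(): 1-based index of the lowest set bit */

#define DEMO_MAX_BASES 4

/*
 * Store the indices of the active bases in out[], lowest first, and
 * return how many there are. Each iteration handles one set bit, so an
 * empty mask costs no iterations at all.
 */
int demo_collect_active(unsigned int active_bases, int out[DEMO_MAX_BASES])
{
	int n = 0;

	active_bases &= (1u << DEMO_MAX_BASES) - 1;	/* ignore unused bits */
	while (active_bases) {
		int base = ffs((int)active_bases) - 1;

		out[n++] = base;
		active_bases &= ~(1u << base);		/* clear the bit just handled */
	}
	return n;
}
```

With only four possible bases the saving over a plain loop is small, which
matches the doubt expressed above; the gain is mostly in the common case
where one or no base is active.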


Re: [RFC 4/4] cpuset: Add cpusets.quiesce option

2014-03-26 Thread Viresh Kumar
On 27 March 2014 08:17, Li Zefan  wrote:
> This doesn't look like a complete solution, because newer timers/workqueues
> can still run in those CPUs.

The initial idea was to disable load balancing between CPUs and then do this,
so that new timers and workqueues from other CPUs would never get
queued on this CPU.

But I think we can just modify get_nohz_timer_target() to make sure of this
for timers.

> Seems like the proposal discussed is to support setting
> cpu affinity for workqueues through sysfs. If so, we can migrate workqueues
> when affinity is set, so we don't need this cpuset.quiesce ?

That was another thread just for workqueues, but this one is about migrating
everything else as well. There will probably be more additions apart from
timers/hrtimers/wqs in future. So, for us it is still required :)


RE: Bug 71331 - mlock yields processor to lower priority process

2014-03-26 Thread jimmie.davis


-----Original Message-----
From: Andy Lutomirski [mailto:l...@amacapital.net] 
Sent: Wednesday, March 26, 2014 7:40 PM
To: Davis, Bud @ SSG - Link; umgwanakikb...@gmail.com
Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: Re: Bug 71331 - mlock yields processor to lower priority process

On 03/21/2014 07:50 AM, jimmie.da...@l-3com.com wrote:
> 
> 
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
> kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
> 
> On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:
> 
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
> 
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
> 
> -Mike
> 
> Two options.
> 
> #1. Return with a status value of EAGAIN.
> 
> or 
> 
> #2.  Don't return until you can do it.
> 
> If SCHED_FIFO is used, and mlock() is called, the intention of the user is
> very clear.  Run this task until it is completed or it blocks (and until a
> bit ago, mlock() did not block).
> 
> SCHED_FIFO users don't care about fairness.  They want the system to do what
> it is told.

I use mlock in real-time processes, but I do it in a separate thread.

Seriously, though, what do you expect the kernel to do?  When you call
mlock on a page that isn't present, the kernel will *read* that page.
mlock will, therefore, block until the IO finishes.

Some time around 3.9, the behavior changed a little bit: IIRC mlock used
to hold mmap_sem while sleeping.  Or maybe just mmap with MCL_FUTURE did
that.  In any case, the mlock code is less lock-happy than it was.  Is
it possible that you have two threads, and the non-mlock-calling thread
got blocked behind mlock, so it looked better?

--Andy

===


Andy,

The example code submitted to bugzilla (chase back on the thread a bit, there
is a reference) shows the problem.

Two threads, TaskA (high priority) and TaskB (low priority), are assigned to
the same processor, explicitly for the guarantee that only one of them can
execute at a time.  TaskA becomes eligible to run.  As part of its processing
(whose normal end is a call to sem_wait()), it calls mlock().  TaskA then
blocks, and TaskB begins running.  But wait, the system is designed so that
TaskA will run until it is done (thus SCHED_FIFO and TaskB having the lower
priority).  TaskA, a higher priority task, is suspended and TaskB starts
running.  And in the code that led me on this endeavor :) {consisting of a lot
of Ada threads}, the result was a segfault due to data half-processed by
TaskA.

This is what I call 'blocking': the thread is no longer running and the
scheduler puts someone else on the processor.  I don't mean 'takes a long time
until it returns'.  Taking a long time is fine; the system design relies on
priority-based scheduling and CPU affinity to ensure ordered access to
application data.

mlock() now blocks.  I don't care how long mlock() takes; what I care about is
the lower priority process pre-empting me.  Only a limited number of syscalls
block; those that do are documented and usually have a way to obtain blocking
or non-blocking behavior.

Can I change the system to deal with mlock() being a blocking syscall?  Yes,
but this is a situation where working code that meets the API has stopped
working.

Thanks for looking at it.

Regards,
Bud Davis








RE: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread li.xi...@freescale.com
> > So let's just ignore the clearance of these bits in isr().
> >
> > +
> > SAI Transmit Control Register (I2S1_TCSR) : 32 : R/W : _h
> 
> I'm talking about FWF and FRF bits, not TCSR as a register.
> 
> > -
> >
> > I have checked in the Vybrid and LS1 SoC datasheets, and they are all the
> > Same as above, and nothing else.
> >
> > Have I missed ?
> 
> What i.MX IC team told me is SAI ignores what we do to FWF and FRF, so you
> don't need to worry about it at all unless Vybrid makes them writable, in
> which case we may also need to clear these bits and confirm with Vybrid IC
> team if they're also W1C.
> 

Well, if so, that's fine.

Thanks,


Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread Nicolin Chen
On Thu, Mar 27, 2014 at 12:06:53PM +0800, Xiubo Li-B47053 wrote:
> > > > > > > > +   if (xcsr & FSL_SAI_CSR_FWF)
> > > > > > > > > +   dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> > > > > > > > +
> > > > > > > > +   if (xcsr & FSL_SAI_CSR_FRF)
> > > > > > > > > +   dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> > > > > > > > +
> > > > > > >
> > > > > > > While are these ones really needed to clear manually ?
> > > > > >
> > > > > > > The reference manual doesn't mention about the requirement. So SAI
> > > > > > > should do the self-clearance.
> > > > > >
> > > > > > Yes, I do think we should let it do the self-clearance, and shouldn't
> > > > > > interfere of them...
> > > >
> > > > SAI is supposed to ignore the interference, isn't it?
> > > >
> > >
> > > Maybe, but I'm not very sure.
> > > And these bits are all writable and readable.
> > 
> > Double-confirmed? Because FWF and FRF should be read-only bits.
> > 
> 
> So let's just ignore the clearance of these bits in isr().
> 
> +
> SAI Transmit Control Register (I2S1_TCSR) : 32 : R/W : _h

I'm talking about FWF and FRF bits, not TCSR as a register.

> -
> 
> I have checked in the Vybrid and LS1 SoC datasheets, and they are all the
> Same as above, and nothing else.
> 
> Have I missed ?

What i.MX IC team told me is SAI ignores what we do to FWF and FRF, so you
don't need to worry about it at all unless Vybrid makes them writable, in
which case we may also need to clear these bits and confirm with Vybrid IC
team if they're also W1C.
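
The read-then-write-back pattern in the ISR relies on write-1-to-clear (W1C)
semantics, which the following toy model tries to make concrete. The bit
positions and demo_* names are made up for illustration (not taken from any
datasheet), and the split into W1C flags versus read-only flags follows the
claim in this mail rather than verified hardware behavior.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_CSR_WSF (1u << 4)	/* assumed W1C */
#define DEMO_CSR_SEF (1u << 3)	/* assumed W1C */
#define DEMO_CSR_FEF (1u << 2)	/* assumed W1C */
#define DEMO_CSR_FWF (1u << 1)	/* assumed read-only status */
#define DEMO_CSR_FRF (1u << 0)	/* assumed read-only status */

#define DEMO_CSR_W1C_MASK (DEMO_CSR_WSF | DEMO_CSR_SEF | DEMO_CSR_FEF)
#define DEMO_CSR_RO_MASK  (DEMO_CSR_FWF | DEMO_CSR_FRF)

/*
 * Model of a register write: W1C flags are cleared where a 1 is written,
 * read-only flags ignore the write, all other bits take the written value.
 * Returns the register contents after the write.
 */
uint32_t demo_csr_write(uint32_t current, uint32_t written)
{
	uint32_t status = current & (DEMO_CSR_W1C_MASK | DEMO_CSR_RO_MASK);
	uint32_t plain  = written & ~(DEMO_CSR_W1C_MASK | DEMO_CSR_RO_MASK);

	status &= ~(written & DEMO_CSR_W1C_MASK);
	return plain | status;
}
```

Under this model, writing back the value just read clears the pending W1C
flags and leaves FWF/FRF untouched, so the write-back is harmless for the
read-only bits. If a given SoC made FWF/FRF writable, the model would no
longer apply.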




RE: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread li.xi...@freescale.com
> > > > > > > + if (xcsr & FSL_SAI_CSR_FWF)
> > > > > > > + dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> > > > > > > +
> > > > > > > + if (xcsr & FSL_SAI_CSR_FRF)
> > > > > > > > + dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> > > > > > > +
> > > > > >
> > > > > > While are these ones really needed to clear manually ?
> > > > >
> > > > > > The reference manual doesn't mention about the requirement. So SAI
> > > > > > should do the self-clearance.
> > > > >
> > > > > Yes, I do think we should let it do the self-clearance, and shouldn't
> > > > > interfere of them...
> > >
> > > SAI is supposed to ignore the interference, isn't it?
> > >
> >
> > Maybe, but I'm not very sure.
> > And these bits are all writable and readable.
> 
> Double-confirmed? Because FWF and FRF should be read-only bits.
> 

So let's just ignore the clearance of these bits in isr().

+
SAI Transmit Control Register (I2S1_TCSR) : 32 : R/W : _h
-

I have checked in the Vybrid and LS1 SoC datasheets, and they are all the
Same as above, and nothing else.

Have I missed ?



Re: [PATCH 00/12] drm/nouveau: support for GK20A, cont'd

2014-03-26 Thread Alexandre Courbot
On Wed, Mar 26, 2014 at 7:33 PM, Lucas Stach  wrote:
>> > It does so by doing the necessary manual cache flushes/invalidates on
>> > buffer access, so costs some performance. To avoid this you really want
>> > to get writecombined mappings into the kernel<->userspace interface.
>> > Simply mapping the pushbuf as WC/US has brought a 7% performance
>> > increase in OpenArena when I last tested this. This test was done with
>> > only one PCIe lane, so the perf increase may be even better with a more
>> > adequate interconnect.
>>
>> Interestingly if I allow writecombined mappings in the kernel I get
>> faults when attempting to read the mapped area:
>>
> This is most likely because your handling of those buffers produces
> conflicting mappings (if my understanding of what you are doing is
> right).
>
> At first you allocate memory from CMA without changing the pgprot flags.
> This yields pages which are mapped uncached or cached (when moveable
> pages are purged from CMA to make space for your buffer) into the
> kernel's linear space.
>
> Later you regard this memory as iomem (it isn't!) and let TTM remap
> those pages into the vmalloc area with pgprot set to writecombined.
>
> I don't know exactly why this is causing havoc, but having two
> conflicting virtual mappings of the same physical memory is documented
> to at least produce undefined behavior on ARMv7.

IIUC this is not exactly what happens with GK20A, so let me explain
how VRAM is currently accessed to make sure we are in sync.

VRAM pages are allocated by nvea_ram_get(), which allocates chunks of
contiguous memory using dma_alloc_from_contiguous(). At that time I
don't think the pages are mapped anywhere for the CPU to see (contrary
to dma_alloc_coherent() for instance). Nouveau will then map the
memory into the GPU context's address space, but it is only when
nouveau_ttm_io_mem_reserve() is called that a BAR mapping is created,
making the memory accessible to the CPU through the BAR window (which
I consider as I/O memory).

The area of the BAR window pointing to the VRAM is then mapped to the
kernel (using ioremap_wc() or ioremap_nocache()) or user-space (where
ttm_io_prot() is called to get the pgprot_t to use). It is when this
mapping is writecombined that I get the faults.

So as far as I can tell, only at most one CPU mapping exists at any
time for VRAM memory, which goes through the BAR to access the actual
physical memory. It would probably be faster and more logical to map
the RAM directly so the CPU can address it, but going through the BAR
reduces CPU/GPU synchronization issues and there are a few cases where
we would need to map through the BAR anyway (e.g. tiled memory to be
made linear for the CPU).

I don't know if that helps in understanding what the issue might be - I
just wanted to make sure we are talking about the same thing. :)

Thanks,
Alex.


Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread Nicolin Chen
On Thu, Mar 27, 2014 at 11:41:02AM +0800, Xiubo Li-B47053 wrote:
> 
> > Subject: Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag
> > 
> > On Thu, Mar 27, 2014 at 10:53:50AM +0800, Xiubo Li-B47053 wrote:
> > > > On Thu, Mar 27, 2014 at 10:13:48AM +0800, Xiubo Li-B47053 wrote:
> > > > > > +   regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > > > > > +   regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
> > > > > > +
> > > > > > +   if (xcsr & FSL_SAI_CSR_WSF)
> > > > > > +   dev_dbg(dev, "isr: Start of Tx word detected\n");
> > > > > > +
> > > > > > +   if (xcsr & FSL_SAI_CSR_SEF)
> > > > > > +   dev_dbg(dev, "isr: Tx Frame sync error detected\n");
> > > > > > +
> > > > > > +   if (xcsr & FSL_SAI_CSR_FEF)
> > > > > > +   dev_dbg(dev, "isr: Transmit underrun detected\n");
> > > > > > +
> > > > >
> > > > > Actually, the above three isrs should to write a logic 1 to this field
> > > > > to clear this flag.
> > > > >
> > > > >
> > > > > > +   if (xcsr & FSL_SAI_CSR_FWF)
> > > > > > +   dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> > > > > > +
> > > > > > +   if (xcsr & FSL_SAI_CSR_FRF)
> > > > > > > +   dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> > > > > > +
> > > > >
> > > > > While are these ones really needed to clear manually ?
> > > >
> > > > The reference manual doesn't mention about the requirement. So SAI should do
> > > > the self-clearance.
> > >
> > > Yes, I do think we should let it do the self-clearance, and shouldn't interfere
> > > of them...
> > 
> > SAI is supposed to ignore the interference, isn't it?
> > 
> 
> Maybe, but I'm not very sure.
> And these bits are all writable and readable.

Have you double-checked? FWF and FRF should be read-only bits.
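
For readers following the write-back question, here is a minimal user-space sketch of one way to acknowledge only the write-1-to-clear flags while leaving FWF and FRF alone. The bit positions, the macro names and the csr_ack_value() helper are assumptions made for illustration; they are not taken from the patch or from the reference manual.

```c
#include <stdint.h>

/* Bit positions are illustrative assumptions, not taken from the thread. */
#define CSR_WSF (1u << 20) /* start of word, write 1 to clear */
#define CSR_SEF (1u << 19) /* frame sync error, write 1 to clear */
#define CSR_FEF (1u << 18) /* FIFO error (underrun), write 1 to clear */
#define CSR_FWF (1u << 17) /* FIFO empty, treated here as read-only */
#define CSR_FRF (1u << 16) /* FIFO watermark, treated here as read-only */

#define CSR_W1C_MASK  (CSR_WSF | CSR_SEF | CSR_FEF)
#define CSR_FLAG_MASK (CSR_W1C_MASK | CSR_FWF | CSR_FRF)

/*
 * Compute the value to write back after reading the status register.
 * Non-flag bits (enables, configuration) are preserved, pending W1C flags
 * are written as 1 to acknowledge them, and the read-only flags are
 * written as 0 so the hardware keeps managing them.
 */
uint32_t csr_ack_value(uint32_t xcsr)
{
    uint32_t config = xcsr & ~CSR_FLAG_MASK;
    uint32_t pending = xcsr & CSR_W1C_MASK;

    return config | pending;
}
```

With this shape the interrupt handler would write csr_ack_value(xcsr) instead of writing xcsr back unchanged, so the question of whether the SAI ignores writes to FWF and FRF no longer matters.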




Re: [RFC PATCH 02/12] pci: host: pcie-dra7xx: add support for pcie-dra7xx controller

2014-03-26 Thread Jingoo Han
On Wednesday, March 26, 2014 10:58 PM, Kishon Vijay Abraham I wrote:
> 
> Added support for pcie controller in dra7xx. This driver re-uses
> the designware core code that is already present in kernel.
> 
> Signed-off-by: Kishon Vijay Abraham I 

Hi Kishon,
Long time no see! I added trivial comments.

> ---
>  Documentation/devicetree/bindings/pci/ti-pci.txt |   35 ++
>  drivers/pci/host/Kconfig |   10 +
>  drivers/pci/host/Makefile|1 +
>  drivers/pci/host/pcie-dra7xx.c   |  411 ++

How about using the 'pci-' prefix?
As discussed earlier, the 'pci-' prefix is more appropriate.

>  4 files changed, 457 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/pci/ti-pci.txt
>  create mode 100644 drivers/pci/host/pcie-dra7xx.c

[.]

> --- /dev/null
> +++ b/drivers/pci/host/pcie-dra7xx.c

[.]

> +#define  PCIECTRL_TI_CONF_IRQSTATUS_MAIN 0x0024
> +#define  PCIECTRL_TI_CONF_IRQENABLE_SET_MAIN 0x0028

I don't think that it's good to add vendor names such as TI
to SFR names.

How about adding 'DRA7XX' or just removing 'TI'?

1. PCIECTRL_DRA7XX_CONF_IRQSTATUS_MAIN

2. PCIECTRL_CONF_IRQSTATUS_MAIN

[.]

> +enum dra7xx_pcie_device_type {
> + DRA7XX_PCIE_UNKNOWN_TYPE,
> + DRA7XX_PCIE_EP_TYPE,
> + DRA7XX_PCIE_LEG_EP_TYPE,
> + DRA7XX_PCIE_RC_TYPE,
> +};

This driver supports only RC mode, so this enum can be removed.

[.]

> + of_property_read_u32(node, "ti,device-type", &device_type);
> + switch (device_type) {
> + case DRA7XX_PCIE_RC_TYPE:
> + dra7xx_pcie_writel(dra7xx->base,
> + PCIECTRL_TI_CONF_DEVICE_TYPE, DEVICE_TYPE_RC);
> + break;
> + case DRA7XX_PCIE_EP_TYPE:
> + dra7xx_pcie_writel(dra7xx->base,
> + PCIECTRL_TI_CONF_DEVICE_TYPE, DEVICE_TYPE_EP);
> + break;
> + case DRA7XX_PCIE_LEG_EP_TYPE:
> + dra7xx_pcie_writel(dra7xx->base,
> + PCIECTRL_TI_CONF_DEVICE_TYPE, DEVICE_TYPE_LEG_EP);
> + break;
> + default:
> + dev_dbg(dev, "UNKNOWN device type %d\n", device_type);
> + }

Thus, this switch can be removed.
Others look good.
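
To illustrate the suggestion, here is a rough user-space sketch of what the device-type setup could shrink to once only RC mode is handled. The register offset, the DEVICE_TYPE_RC value, the renamed macro and the array-backed dra7xx_pcie_writel() stand-in are all assumptions for illustration, not the driver's real definitions.

```c
#include <stdint.h>

/* Hypothetical offset and value, chosen only for illustration. */
#define PCIECTRL_DRA7XX_CONF_DEVICE_TYPE 0x0100
#define DEVICE_TYPE_RC                   0x4

/* Stand-in for the driver's MMIO accessor: writes into a plain array. */
void dra7xx_pcie_writel(uint32_t *base, uint32_t offset, uint32_t value)
{
    base[offset / sizeof(uint32_t)] = value;
}

/* With only RC mode supported, no enum, DT property or switch is needed. */
void dra7xx_pcie_set_rc_mode(uint32_t *base)
{
    dra7xx_pcie_writel(base, PCIECTRL_DRA7XX_CONF_DEVICE_TYPE,
                       DEVICE_TYPE_RC);
}
```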

Best regards,
Jingoo Han



RE: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread li.xi...@freescale.com

> Subject: Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag
> 
> On Thu, Mar 27, 2014 at 10:53:50AM +0800, Xiubo Li-B47053 wrote:
> > > On Thu, Mar 27, 2014 at 10:13:48AM +0800, Xiubo Li-B47053 wrote:
> > > > > + regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > > > > + regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
> > > > > +
> > > > > + if (xcsr & FSL_SAI_CSR_WSF)
> > > > > + dev_dbg(dev, "isr: Start of Tx word detected\n");
> > > > > +
> > > > > + if (xcsr & FSL_SAI_CSR_SEF)
> > > > > + dev_dbg(dev, "isr: Tx Frame sync error detected\n");
> > > > > +
> > > > > + if (xcsr & FSL_SAI_CSR_FEF)
> > > > > + dev_dbg(dev, "isr: Transmit underrun detected\n");
> > > > > +
> > > >
> > > > Actually, the above three isrs should to write a logic 1 to this field
> > > > to clear this flag.
> > > >
> > > >
> > > > > + if (xcsr & FSL_SAI_CSR_FWF)
> > > > > + dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> > > > > +
> > > > > + if (xcsr & FSL_SAI_CSR_FRF)
> > > > > > + dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> > > > > +
> > > >
> > > > While are these ones really needed to clear manually ?
> > >
> > > > The reference manual doesn't mention about the requirement. So SAI should do
> > > > the self-clearance.
> >
> > > Yes, I do think we should let it do the self-clearance, and shouldn't interfere
> > > of them...
> 
> SAI is supposed to ignore the interference, isn't it?
> 

Maybe, but I'm not very sure.
And these bits are all writable and readable.





Re: [PATCH 10/11] Revert "serial: omap: unlock the port lock"

2014-03-26 Thread Felipe Balbi
Hi,

On Wed, Mar 26, 2014 at 10:27:13PM -0400, Peter Hurley wrote:
> On 03/26/2014 10:10 PM, Felipe Balbi wrote:
> >Hi,
> >
> >On Wed, Mar 26, 2014 at 08:39:11PM -0400, Peter Hurley wrote:
> >>On 03/25/2014 02:28 PM, Tony Lindgren wrote:
> >>>* Felipe Balbi  [140320 12:39]:
> This reverts commit 0324a821029e1f54e7a7f8fed48693cfce42dc0e.
> 
> That commit tried to fix a deadlock problem when using
> hci_ldisc, but it turns out the bug was in hci_ldisc
> all along where it was calling ->write() from within
> ->write_wakeup() callback.
> 
> The problem is that ->write_wakeup() was called with
> port lock held and ->write() tried to grab the same
> port lock.
> >>>
> >>>Should this and the next patch be earlier in the series
> >>>as a fix for the v3.15-rc cycle? Should they be cc: stable
> >>>as well?
> >>
> >>Well, right now the other fix has had _zero_ testing
> >>so not really a -stable candidate just yet.
> >
> >how can you even say that ?
> 
> I misunderstood when you wrote:
> 
> On 03/20/2014 02:11 PM, Felipe Balbi wrote:
> > here's a build-tested only patch which is waiting for testing from other
> > colleagues who've got a platform to reproduce the problem:
> 
> and then the version I reviewed had no Tested-by: tags.

I wouldn't add that tag myself, but Murali (in Cc) did help testing
together with other colleagues.

> >How else would we have found the issue to start with ?
> 
> Bug report?

touché :-)

-- 
balbi




Re: [PATCH 09/11] bluetooth: hci_ldisc: fix deadlock condition

2014-03-26 Thread Felipe Balbi
Hi,

On Wed, Mar 26, 2014 at 10:20:15PM -0400, Peter Hurley wrote:
> >>You may want to build on top of this patch split handling;
> >>I noticed some of the protocol drivers are calling
> >>hci_uart_tx_wakeup() from work functions already (so don't
> >>need to schedule another work...)
> >
> >I don't think that should be part of $subject, though.
> 
> I don't understand what you mean here.

It seemed, at first, like you were suggesting that I redo this patch and
modify the protocol drivers to avoid two workqueues. But now that I read
it again, you _did_ write "on top of this patch".
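
As background for this exchange: the deadlock being fixed arises when ->write_wakeup() runs with the port lock held and calls ->write(), which needs the same lock. A minimal user-space model of deferring the write to a work item might look like the sketch below. The struct, the function names and the boolean "lock" are hypothetical stand-ins, not the hci_ldisc or serial core code.

```c
#include <stdbool.h>

struct port {
    bool locked;          /* models a non-recursive port lock */
    bool tx_work_pending; /* models a scheduled work item */
    int bytes_written;
};

/* Returns false where a real spinlock would deadlock (lock already held). */
bool port_trylock(struct port *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

void port_unlock(struct port *p)
{
    p->locked = false;
}

/* ->write(): needs the port lock, like the serial driver's write path. */
bool port_write(struct port *p, int len)
{
    if (!port_trylock(p))
        return false;
    p->bytes_written += len;
    port_unlock(p);
    return true;
}

/* ->write_wakeup(): called with the lock held, so it only defers. */
void ldisc_write_wakeup(struct port *p)
{
    p->tx_work_pending = true;
}

/* Deferred work: runs later, outside the port lock, and does the write. */
bool ldisc_tx_work(struct port *p, int len)
{
    if (!p->tx_work_pending)
        return false;
    p->tx_work_pending = false;
    return port_write(p, len);
}
```

A protocol driver that already calls the wakeup path from its own work function is outside the lock to begin with, which is why a second deferral would be redundant there.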

-- 
balbi




Re: Thoughts on credential switching

2014-03-26 Thread Jeff Layton
On Wed, 26 Mar 2014 20:05:16 -0700
Andy Lutomirski  wrote:

> On Wed, Mar 26, 2014 at 7:48 PM, Jeff Layton 
> wrote:
> > On Wed, 26 Mar 2014 17:23:24 -0700
> > Andy Lutomirski  wrote:
> >
> >> Hi various people who care about user-space NFS servers and/or
> >> security-relevant APIs.
> >>
> >> I propose the following set of new syscalls:
> >>
> >> int credfd_create(unsigned int flags): returns a new credfd that
> >> corresponds to current's creds.
> >>
> >> int credfd_activate(int fd, unsigned int flags): Change current's
> >> creds to match the creds stored in fd.  To be clear, this changes
> >> both the "subjective" and "objective" (aka real_cred and cred)
> >> because there aren't any real semantics for what happens when
> >> userspace code runs with real_cred != cred.
> >>
> >> Rules:
> >>
> >>  - credfd_activate fails (-EINVAL) if fd is not a credfd.
> >>  - credfd_activate fails (-EPERM) if the fd's userns doesn't match
> >> current's userns.  credfd_activate is not intended to be a
> >> substitute for setns.
> >>  - credfd_activate will fail (-EPERM) if LSM does not allow the
> >> switch.  This probably needs to be a new selinux action --
> >> dyntransition is too restrictive.
> >>
> >>
> >> Optional:
> >>  - credfd_create always sets cloexec, because the alternative is
> >> silly.
> >>  - credfd_activate fails (-EINVAL) if dumpable.  This is because we
> >> don't want a privileged daemon to be ptraced while impersonating
> >> someone else.
> >>  - optional: both credfd_create and credfd_activate fail if
> >> !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
> >>
> >> The first question: does this solve Ganesha's problem?
> >>
> >> The second question: is this safe?  I can see two major concerns.
> >> The bigger concern is that having these syscalls available will
> >> allow users to exploit things that were previously secure.  For
> >> example, maybe some configuration assumes that a task running as
> >> uid==1 can't switch to uid==2, even with uid 2's consent.  Similar
> >> issues happen with capabilities.  If CAP_SYS_ADMIN is not
> >> required, then this is no longer really true.
> >>
> >> Alternatively, something running as uid == 0 with heavy capability
> >> restrictions in a mount namespace (but not a uid namespace) could
> >> pass a credfd out of the namespace.  This could break things like
> >> Docker pretty badly.  CAP_SYS_ADMIN guards against this to some
> >> extent.  But I think that Docker is already totally screwed if a
> >> Docker root task can receive an O_DIRECTORY or O_PATH fd out of
> >> the container, so it's not entirely clear that the situation is
> >> any worse, even without requiring CAP_SYS_ADMIN.
> >>
> >> The second concern is that it may be difficult to use this
> >> correctly. There's a reason that real_cred and cred exist, but
> >> it's not really well set up for being used.
> >>
> >> As a simple way to stay safe, Ganesha could only use credfds that
> >> have real_uid == 0.
> >>
> >> --Andy
> >
> >
> > I still don't quite grok why having this special credfd_create call
> > buys you anything over simply doing what Al had originally
> > suggested -- switch creds using all of the different syscalls and
> > then simply caching that in a "normal" fd:
> >
> > fd = open("/dev/null", O_PATH...);
> >
> > ...it seems to me that the credfd_activate call will still need to
> > do the same permission checking that all of the individual set*id()
> > calls require (and all of the other stuff like changing selinux
> > contexts, etc).
> >
> > IOW, this fd is just a "handle" for passing around a struct cred,
> > but I don't see why having access to that handle would allow you to
> > do something you couldn't already do anyway.
> >
> > Am I missing something obvious here?
> 
> Not really.  I think I didn't adequately explain a piece of this.
> 
> I think that what you're suggesting is for an fd to encode a set of
> credentials but not to grant permission to use those credentials.  So
> switch_creds(fd) is more or less the same thing as switch_creds(ruid,
> euid, suid, rgid, egid, sgid, groups, mac label, ...).  switch_creds
> needs to verify that the caller can dyntransition to the label, set
> all the ids, etc., but it avoids allocating anything and running RCU
> callbacks.
> 
> The trouble with this is that the verification needed is complicated
> and expensive.  And I think that my proposal is potentially more
> useful.
> 

Is it really though? My understanding of the problem was that it was
the syscall (context switching) overhead + having to do a bunch of RCU
critical stuff that was the problem. If we can do all of this in the
context of a single RCU critical section, isn't that still a win?

As to the complicated part...maybe but it doesn't seem like it would
have to be. We could simply return -EINVAL or something if the old
struct cred doesn't have fields that match the ones we're replacing and
that we don't expect to see changed.

> A credfd is like a struct cred, bu

Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread Nicolin Chen
On Thu, Mar 27, 2014 at 10:53:50AM +0800, Xiubo Li-B47053 wrote:
> > On Thu, Mar 27, 2014 at 10:13:48AM +0800, Xiubo Li-B47053 wrote:
> > > > +   regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > > > +   regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
> > > > +
> > > > +   if (xcsr & FSL_SAI_CSR_WSF)
> > > > +   dev_dbg(dev, "isr: Start of Tx word detected\n");
> > > > +
> > > > +   if (xcsr & FSL_SAI_CSR_SEF)
> > > > +   dev_dbg(dev, "isr: Tx Frame sync error detected\n");
> > > > +
> > > > +   if (xcsr & FSL_SAI_CSR_FEF)
> > > > +   dev_dbg(dev, "isr: Transmit underrun detected\n");
> > > > +
> > >
> > > Actually, the above three isrs should to write a logic 1 to this field
> > > to clear this flag.
> > >
> > >
> > > > +   if (xcsr & FSL_SAI_CSR_FWF)
> > > > +   dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> > > > +
> > > > +   if (xcsr & FSL_SAI_CSR_FRF)
> > > > > +   dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> > > > +
> > >
> > > While are these ones really needed to clear manually ?
> > 
> > The reference manual doesn't mention about the requirement. So SAI should do
> > the self-clearance.
> 
> Yes, I do think we should let it do the self-clearance, and shouldn't interfere
> of them...

SAI is supposed to ignore the interference, isn't it?




Re: [PATCH] tick, broadcast: Prevent false alarm when force mask contains offline cpus

2014-03-26 Thread Preeti U Murthy
On 03/26/2014 04:51 PM, Srivatsa S. Bhat wrote:
> On 03/26/2014 09:26 AM, Preeti U Murthy wrote:
>> It's possible that the tick_broadcast_force_mask contains cpus which are not
>> in cpu_online_mask when a broadcast tick occurs. This could happen under the
>> following circumstance assuming CPU1 is among the CPUs waiting for broadcast.
>>
>> CPU0                            CPU1
>>
>> Run CPU_DOWN_PREPARE notifiers
>>
>> Start stop_machine              Gets woken up by IPI to run
>>                                 stop_machine, sets itself in
>>                                 tick_broadcast_force_mask if the
>>                                 time of broadcast interrupt is around
>>                                 the same time as this IPI.
>>
>>                                 Start stop_machine
>>                                   set_cpu_online(cpu1, false)
>> End stop_machine                End stop_machine
>>
>> Broadcast interrupt
>>   Finds that cpu1 in
>>   tick_broadcast_force_mask is offline
>>   and triggers the WARN_ON in
>>   tick_handle_oneshot_broadcast()
>>
>> Clears all broadcast masks
>> in CPU_DEAD stage.
>>
>> This WARN_ON was added to capture scenarios where the broadcast mask, be it
>> oneshot/pending/force_mask contain offline cpus whose tick devices have been
>> removed. But here is a case where we trigger the warn on in a valid scenario.
>>
>> One could argue that the scenario is invalid and ought to be warned against
>> because ideally the broadcast masks need to be cleared of the cpus about to
>> go offline before clearing them in the online_mask so that we don't hit these
>> scenarios.
>>
>> This would mean clearing the masks in CPU_DOWN_PREPARE stage.
> 
> Not necessarily. We could clear the mask in the CPU_DYING stage. That way,
> offline CPUs will automatically get cleared from the force_mask and hence
> the tick-broadcast code will not need to have a special case to deal with
> this scenario. What do you think?

Ok, I gave some thought to this. This will not work with the hrtimer mode
of the broadcast framework that is going in. That feature was added for
architectures which do not have an external clock device to wake CPUs up
from deep idle states when their local timers stop. They assign one of the
CPUs as an agent to wake the others up. When this designated CPU gets
hotplugged out, we need to assign this duty to some other CPU.

The way this is being done now is in
tick_shutdown_broadcast_oneshot_control() which is also responsible for
clearing the broadcast masks. When the hrtimer mode of broadcast is
active, then in addition to clearing masks in this function we make the
CPU executing this function take on the task of waking up CPUs in deep
idle state if the hotplugged CPU was doing this earlier.

Currently tick_shutdown_broadcast_oneshot_control() is being executed in
the CPU_DEAD notification and this is guaranteed to run on a CPU *other
than* the dying CPU. Hence we can safely do this.

However if we move this function underneath CPU_DYING notifier, this
will turn out to be a disaster since IIUC the dying CPU is running this
notifier and will end up re-assigning the duty of waking up CPUs to itself.

Does this make sense?
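
The argument above can be modelled with a small user-space sketch. It assumes, as described in this mail, that the CPU_DYING notifier runs on the outgoing CPU and the CPU_DEAD notifier runs on some other CPU. The enum and the function names are hypothetical and only model the reasoning; they are not kernel code.

```c
#include <stdbool.h>

enum hotplug_stage { STAGE_CPU_DYING, STAGE_CPU_DEAD };

/* Which CPU executes the notifier for the given stage. */
int notifier_cpu(enum hotplug_stage stage, int dying_cpu, int other_cpu)
{
    return stage == STAGE_CPU_DYING ? dying_cpu : other_cpu;
}

/*
 * Model of the handover: if the outgoing CPU was the broadcast agent, the
 * CPU running the notifier takes over. Returns the new agent.
 */
int handover_broadcast_duty(int agent_cpu, enum hotplug_stage stage,
                            int dying_cpu, int other_cpu)
{
    if (agent_cpu != dying_cpu)
        return agent_cpu; /* the agent stays online, nothing to hand over */
    return notifier_cpu(stage, dying_cpu, other_cpu);
}

/* The handover is only useful if the new agent is not the outgoing CPU. */
bool handover_is_valid(int new_agent, int dying_cpu)
{
    return new_agent != dying_cpu;
}
```

In this model a handover done at the CPU_DEAD stage lands on a surviving CPU, while the same handover done at the CPU_DYING stage hands the duty straight back to the CPU that is going away.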

Regards
Preeti U Murthy
> 
> Regards,
> Srivatsa S. Bhat
> 
>> ---
>>
>>  kernel/time/tick-broadcast.c |7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
>> index 63c7b2d..30b8731 100644
>> --- a/kernel/time/tick-broadcast.c
>> +++ b/kernel/time/tick-broadcast.c
>> @@ -606,7 +606,12 @@ again:
>>   */
>>  cpumask_clear_cpu(smp_processor_id(), tick_broadcast_pending_mask);
>>
>> -/* Take care of enforced broadcast requests */
>> +/* Take care of enforced broadcast requests. We could have offline
>> + * cpus in the tick_broadcast_force_mask. Thats ok, we got the interrupt
>> + * before we could clear the mask.
>> + */
>> +cpumask_and(tick_broadcast_force_mask,
>> +tick_broadcast_force_mask, cpu_online_mask);
>>  cpumask_or(tmpmask, tmpmask, tick_broadcast_force_mask);
>>  cpumask_clear(tick_broadcast_force_mask);
>>
>>
> 
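
For reference, the effect of the hunk quoted above can be modelled in user space with CPU masks represented as 64-bit words. The struct and the fold_force_mask() helper are hypothetical; the comments name the kernel cpumask calls from the patch that each line stands in for.

```c
#include <stdint.h>

/* CPU masks modelled as 64-bit words; bit n set means CPU n is in the mask. */
struct broadcast_masks {
    uint64_t force_mask;
    uint64_t online_mask;
};

/*
 * Drop offline CPUs from the force mask, fold the remainder into tmpmask,
 * then clear the force mask. Returns the updated tmpmask.
 */
uint64_t fold_force_mask(struct broadcast_masks *m, uint64_t tmpmask)
{
    m->force_mask &= m->online_mask; /* cpumask_and() */
    tmpmask |= m->force_mask;        /* cpumask_or() */
    m->force_mask = 0;               /* cpumask_clear() */

    return tmpmask;
}
```

With CPUs 1 and 2 in the force mask and only CPUs 0 and 2 online, CPU 1 is filtered out before the masks are combined, so the later online check never sees it.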



Re: Thoughts on credential switching

2014-03-26 Thread Andy Lutomirski
On Wed, Mar 26, 2014 at 7:48 PM, Jeff Layton  wrote:
> On Wed, 26 Mar 2014 17:23:24 -0700
> Andy Lutomirski  wrote:
>
>> Hi various people who care about user-space NFS servers and/or
>> security-relevant APIs.
>>
>> I propose the following set of new syscalls:
>>
>> int credfd_create(unsigned int flags): returns a new credfd that
>> corresponds to current's creds.
>>
>> int credfd_activate(int fd, unsigned int flags): Change current's
>> creds to match the creds stored in fd.  To be clear, this changes both
>> the "subjective" and "objective" (aka real_cred and cred) because
>> there aren't any real semantics for what happens when userspace code
>> runs with real_cred != cred.
>>
>> Rules:
>>
>>  - credfd_activate fails (-EINVAL) if fd is not a credfd.
>>  - credfd_activate fails (-EPERM) if the fd's userns doesn't match
>> current's userns.  credfd_activate is not intended to be a substitute
>> for setns.
>>  - credfd_activate will fail (-EPERM) if LSM does not allow the
>> switch.  This probably needs to be a new selinux action --
>> dyntransition is too restrictive.
>>
>>
>> Optional:
>>  - credfd_create always sets cloexec, because the alternative is
>> silly.
>>  - credfd_activate fails (-EINVAL) if dumpable.  This is because we
>> don't want a privileged daemon to be ptraced while impersonating
>> someone else.
>>  - optional: both credfd_create and credfd_activate fail if
>> !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
>>
>> The first question: does this solve Ganesha's problem?
>>
>> The second question: is this safe?  I can see two major concerns.  The
>> bigger concern is that having these syscalls available will allow
>> users to exploit things that were previously secure.  For example,
>> maybe some configuration assumes that a task running as uid==1 can't
>> switch to uid==2, even with uid 2's consent.  Similar issues happen
>> with capabilities.  If CAP_SYS_ADMIN is not required, then this is no
>> longer really true.
>>
>> Alternatively, something running as uid == 0 with heavy capability
>> restrictions in a mount namespace (but not a uid namespace) could pass
>> a credfd out of the namespace.  This could break things like Docker
>> pretty badly.  CAP_SYS_ADMIN guards against this to some extent.  But
>> I think that Docker is already totally screwed if a Docker root task
>> can receive an O_DIRECTORY or O_PATH fd out of the container, so it's
>> not entirely clear that the situation is any worse, even without
>> requiring CAP_SYS_ADMIN.
>>
>> The second concern is that it may be difficult to use this correctly.
>> There's a reason that real_cred and cred exist, but it's not really
>> well set up for being used.
>>
>> As a simple way to stay safe, Ganesha could only use credfds that have
>> real_uid == 0.
>>
>> --Andy
>
>
> I still don't quite grok why having this special credfd_create call
> buys you anything over simply doing what Al had originally suggested --
> switch creds using all of the different syscalls and then simply caching
> that in a "normal" fd:
>
> fd = open("/dev/null", O_PATH...);
>
> ...it seems to me that the credfd_activate call will still need to do
> the same permission checking that all of the individual set*id() calls
> require (and all of the other stuff like changing selinux contexts,
> etc).
>
> IOW, this fd is just a "handle" for passing around a struct cred, but I
> don't see why having access to that handle would allow you to do
> something you couldn't already do anyway.
>
> Am I missing something obvious here?

Not really.  I think I didn't adequately explain a piece of this.

I think that what you're suggesting is for an fd to encode a set of
credentials but not to grant permission to use those credentials.  So
switch_creds(fd) is more or less the same thing as switch_creds(ruid,
euid, suid, rgid, egid, sgid, groups, mac label, ...).  switch_creds
needs to verify that the caller can dyntransition to the label, set
all the ids, etc., but it avoids allocating anything and running RCU
callbacks.

The trouble with this is that the verification needed is complicated
and expensive.  And I think that my proposal is potentially more
useful.

A credfd is like a struct cred, but possession of a credfd carries the
permission to use those credentials.  So, for example, credfd_activate
to switch to a given uid might work even if setresuid to that uid
would be disallowed.  But, for this to be secure, the act of giving
someone a credfd needs to be explicit.  Programs implicitly send other
programs their credentials by means of f_cred all the time, and they
don't expect to allow the receiver to impersonate them.
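
A minimal user-space model of the proposed checks might look like the following. Since credfd is only a proposal, every type, field and function name here is hypothetical, and the LSM decision is reduced to a boolean. The point is only the order of the checks and the errors listed under "Rules" in the original proposal.

```c
#include <errno.h>
#include <stdbool.h>

/* All types and fields here are hypothetical; credfd is only a proposal. */
struct cred_model {
    int uid;
    int userns_id;
};

struct fd_model {
    bool is_credfd;
    struct cred_model cred;
};

struct task_model {
    struct cred_model cred;
    bool lsm_allows_switch; /* stands in for an LSM hook decision */
};

/* Returns 0 on success or a negative errno, following the proposed rules. */
int credfd_activate_model(struct task_model *task, const struct fd_model *fd)
{
    if (!fd->is_credfd)
        return -EINVAL; /* not a credfd */
    if (fd->cred.userns_id != task->cred.userns_id)
        return -EPERM;  /* not a substitute for setns */
    if (!task->lsm_allows_switch)
        return -EPERM;  /* LSM veto */

    /* Possession of the credfd is the permission: no set*id() checks. */
    task->cred = fd->cred;
    return 0;
}
```

Note that the model performs no uid comparison at all, which is exactly the difference from a switch_creds() that re-verifies every id.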

credfd has other uses.  A file server, for example, could actually
delegate creation of the credfds to a separate process, and that
process could validate that the request is for a credfd that the file
server really should be able to obtain.  This would enable that
process to make sure that the user in question has actually
authenticated itsel

Re: [KVM] BUG: unable to handle kernel NULL pointer dereference at 00000000000002b0

2014-03-26 Thread Fengguang Wu
On Wed, Mar 26, 2014 at 04:46:48PM +0100, Paolo Bonzini wrote:
> Il 26/03/2014 15:57, Fengguang Wu ha scritto:
>  >
>  >git://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
>  >commit 93c4adc7afedf9b0ec190066d45b6d67db5270da ("KVM: x86: handle missing MPX in nested virtualization")
> >>>
> >>> Ouch.  Out of curiosity is this on Skylake prototypes, or is it also
> >>> visible on some released silicon?
> >Paolo, the problem shows up in a Sandybridge-EX and an Ivybridge-EX.
> 
> What does that mean in terms of commercial names?  I tested on Sandy
> Bridge Xeon E5.

Sorry I don't know the exact commercial names listed in

http://en.wikipedia.org/wiki/Sandy_Bridge
http://en.wikipedia.org/wiki/Ivy_Bridge_%28microarchitecture%29

Thanks,
Fengguang


RE: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread li.xi...@freescale.com
> On Thu, Mar 27, 2014 at 10:13:48AM +0800, Xiubo Li-B47053 wrote:
> > > + regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > > + regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
> > > +
> > > + if (xcsr & FSL_SAI_CSR_WSF)
> > > + dev_dbg(dev, "isr: Start of Tx word detected\n");
> > > +
> > > + if (xcsr & FSL_SAI_CSR_SEF)
> > > + dev_dbg(dev, "isr: Tx Frame sync error detected\n");
> > > +
> > > + if (xcsr & FSL_SAI_CSR_FEF)
> > > + dev_dbg(dev, "isr: Transmit underrun detected\n");
> > > +
> >
> > Actually, the above three isrs should to write a logic 1 to this field
> > to clear this flag.
> >
> >
> > > + if (xcsr & FSL_SAI_CSR_FWF)
> > > + dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> > > +
> > > + if (xcsr & FSL_SAI_CSR_FRF)
> > > + dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> > > +
> >
> > While are these ones really needed to clear manually ?
> 
> The reference manual doesn't mention about the requirement. So SAI should do
> the self-clearance.

Yes, I do think we should let it do the self-clearance, and we shouldn't
interfere with them...






Re: [RFC 4/4] cpuset: Add cpusets.quiesce option

2014-03-26 Thread Li Zefan
On 2014/3/20 21:49, Viresh Kumar wrote:
> For networking applications platforms need to provide one CPU per each user
> space data plane thread. These CPUs should not be interrupted by kernel at all
> unless userspace has requested for some syscalls. Currently, there are
> background kernel activities that are running on almost every CPU, like:
> timers/hrtimers/watchdogs/etc, and these are required to be migrated to other
> CPUs.
> 
> To achieve that, this patch adds another option to cpusets, i.e. 'quiesce'.
> Writing '1' on this file would migrate these unbound/unpinned timers/workqueues
> away from the CPUs of the cpuset in question. Writing '0' has no effect and this
> file can't be read from userspace as we aren't maintaining a state here.
> 

This doesn't look like a complete solution, because newer timers/workqueues can
still run on those CPUs. It seems the proposal under discussion is to support
setting CPU affinity for workqueues through sysfs. If so, we can migrate
workqueues when the affinity is set, so we wouldn't need this cpuset.quiesce?
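
The gap can be shown with a toy user-space model of a one-shot quiesce. The timer_set structure and its helpers are hypothetical and only model the behaviour described in the patch: unpinned timers present at the time of the write are moved, and nothing prevents a later timer from being queued on the same CPU.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TIMERS 8

struct timer_model {
    int cpu;     /* CPU the timer is currently queued on */
    bool pinned; /* pinned timers must stay where they are */
};

struct timer_set {
    struct timer_model timers[MAX_TIMERS];
    size_t count;
};

/* Queue a timer; returns false if the fixed-size set is full. */
bool timer_set_add(struct timer_set *set, int cpu, bool pinned)
{
    if (set->count >= MAX_TIMERS)
        return false;
    set->timers[set->count].cpu = cpu;
    set->timers[set->count].pinned = pinned;
    set->count++;
    return true;
}

/* One-shot quiesce: move unpinned timers from cpu to target. */
size_t quiesce_cpu(struct timer_set *set, int cpu, int target)
{
    size_t moved = 0;

    for (size_t i = 0; i < set->count; i++) {
        if (set->timers[i].cpu == cpu && !set->timers[i].pinned) {
            set->timers[i].cpu = target;
            moved++;
        }
    }
    return moved;
}

/* Count unpinned timers still queued on cpu. */
size_t unpinned_on_cpu(const struct timer_set *set, int cpu)
{
    size_t n = 0;

    for (size_t i = 0; i < set->count; i++)
        if (set->timers[i].cpu == cpu && !set->timers[i].pinned)
            n++;
    return n;
}
```

After quiesce_cpu() the count for that CPU drops to zero, but a single timer_set_add() on the same CPU brings it back, which is the incompleteness pointed out above.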

> Currently, only timers are migrated. This would be followed by other kernel
> infrastructure later.
> 
> Suggested-by: Peter Zijlstra 
> Signed-off-by: Viresh Kumar 



Re: Thoughts on credential switching

2014-03-26 Thread Jeff Layton
On Wed, 26 Mar 2014 17:23:24 -0700
Andy Lutomirski  wrote:

> Hi various people who care about user-space NFS servers and/or
> security-relevant APIs.
> 
> I propose the following set of new syscalls:
> 
> int credfd_create(unsigned int flags): returns a new credfd that
> corresponds to current's creds.
> 
> int credfd_activate(int fd, unsigned int flags): Change current's
> creds to match the creds stored in fd.  To be clear, this changes both
> the "subjective" and "objective" (aka real_cred and cred) because
> there aren't any real semantics for what happens when userspace code
> runs with real_cred != cred.
> 
> Rules:
> 
>  - credfd_activate fails (-EINVAL) if fd is not a credfd.
>  - credfd_activate fails (-EPERM) if the fd's userns doesn't match
> current's userns.  credfd_activate is not intended to be a substitute
> for setns.
>  - credfd_activate will fail (-EPERM) if LSM does not allow the
> switch.  This probably needs to be a new selinux action --
> dyntransition is too restrictive.
> 
> 
> Optional:
>  - credfd_create always sets cloexec, because the alternative is
> silly.
>  - credfd_activate fails (-EINVAL) if dumpable.  This is because we
> don't want a privileged daemon to be ptraced while impersonating
> someone else.
>  - optional: both credfd_create and credfd_activate fail if
> !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
> 
> The first question: does this solve Ganesha's problem?
> 
> The second question: is this safe?  I can see two major concerns.  The
> bigger concern is that having these syscalls available will allow
> users to exploit things that were previously secure.  For example,
> maybe some configuration assumes that a task running as uid==1 can't
> switch to uid==2, even with uid 2's consent.  Similar issues happen
> with capabilities.  If CAP_SYS_ADMIN is not required, then this is no
> longer really true.
> 
> Alternatively, something running as uid == 0 with heavy capability
> restrictions in a mount namespace (but not a uid namespace) could pass
> a credfd out of the namespace.  This could break things like Docker
> pretty badly.  CAP_SYS_ADMIN guards against this to some extent.  But
> I think that Docker is already totally screwed if a Docker root task
> can receive an O_DIRECTORY or O_PATH fd out of the container, so it's
> not entirely clear that the situation is any worse, even without
> requiring CAP_SYS_ADMIN.
> 
> The second concern is that it may be difficult to use this correctly.
> There's a reason that real_cred and cred exist, but it's not really
> well set up for being used.
> 
> As a simple way to stay safe, Ganesha could only use credfds that have
> real_uid == 0.
> 
> --Andy


I still don't quite grok why having this special credfd_create call
buys you anything over simply doing what Al had originally suggested --
switch creds using all of the different syscalls and then simply caching
that in a "normal" fd:

fd = open("/dev/null", O_PATH...);

...it seems to me that the credfd_activate call will still need to do
the same permission checking that all of the individual set*id() calls
require (and all of the other stuff like changing selinux contexts,
etc).

IOW, this fd is just a "handle" for passing around a struct cred, but I
don't see why having access to that handle would allow you to do
something you couldn't already do anyway.

Am I missing something obvious here?

-- 
Jeff Layton 


Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread Nicolin Chen
On Thu, Mar 27, 2014 at 10:13:48AM +0800, Xiubo Li-B47053 wrote:
> > +   regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > +   regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
> > +
> > +   if (xcsr & FSL_SAI_CSR_WSF)
> > +   dev_dbg(dev, "isr: Start of Tx word detected\n");
> > +
> > +   if (xcsr & FSL_SAI_CSR_SEF)
> > +   dev_dbg(dev, "isr: Tx Frame sync error detected\n");
> > +
> > +   if (xcsr & FSL_SAI_CSR_FEF)
> > +   dev_dbg(dev, "isr: Transmit underrun detected\n");
> > +
> 
> Actually, for the above three flags the ISR should write a logic 1 to
> the field to clear the flag.
> 
> 
> > +   if (xcsr & FSL_SAI_CSR_FWF)
> > +   dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> > +
> > +   if (xcsr & FSL_SAI_CSR_FRF)
> > +   dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> > +
> 
> Do these ones really need to be cleared manually?

The reference manual doesn't mention any such requirement, so the SAI
should clear these flags by itself.
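
The distinction above between write-1-to-clear flags and self-clearing
flags can be sketched in plain userspace C. This is only an illustration
under stated assumptions: the DEMO_ names, the bit positions and the
simulated register are hypothetical and are not taken from the driver or
from the reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, chosen for this sketch only. */
#define DEMO_CSR_WSF (1u << 20) /* word start flag, write-1-to-clear */
#define DEMO_CSR_SEF (1u << 19) /* sync error flag, write-1-to-clear */
#define DEMO_CSR_FEF (1u << 18) /* FIFO error flag, write-1-to-clear */
#define DEMO_CSR_FWF (1u << 17) /* FIFO warning flag, assumed self-clearing */
#define DEMO_CSR_FRF (1u << 16) /* FIFO request flag, assumed self-clearing */

#define DEMO_CSR_W1C_MASK  (DEMO_CSR_WSF | DEMO_CSR_SEF | DEMO_CSR_FEF)
#define DEMO_CSR_ALL_FLAGS (DEMO_CSR_W1C_MASK | DEMO_CSR_FWF | DEMO_CSR_FRF)

/* Simulated control/status register standing in for the hardware. */
static uint32_t demo_csr;

static uint32_t demo_read_csr(void)
{
	return demo_csr;
}

/*
 * Model the register write: non-flag bits are stored as written, a 1 in a
 * write-1-to-clear position clears that flag, and the self-clearing flags
 * are not affected by software writes.
 */
static void demo_write_csr(uint32_t val)
{
	uint32_t config = val & ~DEMO_CSR_ALL_FLAGS;
	uint32_t pending = demo_csr & DEMO_CSR_ALL_FLAGS;

	pending &= ~(val & DEMO_CSR_W1C_MASK);
	demo_csr = config | pending;
}

/* Returns true only if a flag this handler knows about was pending. */
static bool demo_isr(void)
{
	uint32_t csr = demo_read_csr();
	uint32_t flags = csr & DEMO_CSR_ALL_FLAGS;
	uint32_t w1c = flags & DEMO_CSR_W1C_MASK;

	if (!flags)
		return false;

	/* Acknowledge only the write-1-to-clear flags that were seen. */
	if (w1c)
		demo_write_csr((csr & ~DEMO_CSR_ALL_FLAGS) | w1c);

	return true;
}
```

The write is skipped when no write-1-to-clear flag is pending, and the
non-flag bits are written back unchanged so that the acknowledge does not
disturb the rest of the register.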



[RFC II] Splitting scheduler into two halves

2014-03-26 Thread Yuyang du
Hi all,

This continues the first RFC about splitting the scheduler. It is still a
work in progress, and feedback is welcome.

The question addressed here is how load balance should be changed. And I think
the question then goes to how to *reuse* common code as much as possible and
meanwhile be able to serve various objectives.

So these are the basic semantics needed in current load balance:

1. [ At balance point ] on this_cpu push task on that_cpu to [ third_cpu ]

Examples are fork/exec/wakeup. Task is determined by the balance point in
question. And that_cpu is determined by task.

2. [ At balance point ] on this_cpu pull [ task/tasks ] on [ that_cpu ] to
this_cpu

Examples are other idle/periodic/nohz balance, and active_load_balance in
ASYM_PACKING (pull first and then a push).

3. [ At balance point ] on this_cpu kick [ that_cpu/those_cpus ] to do [ what
] balance

Examples are nohz idle balance and active balance.

To make the above more general, we need to abstract more:

1. [ At balance point ] on this_cpu push task on that_cpu to [ third_cpu ] in
[ cpu_mask ]

2. [ At balance point ] on this_cpu [ do | skip ] pull [task/tasks ] on [
that_cpu ] in [ cpu_mask ] to this_cpu

3. [ At balance point ] on this_cpu kick [ that_cpu/those_cpus ] in [ cpu_mask
] to do nohz idle balance

So essentially, we give them choice or restrict the scope for them.

Then instead of an all-in-one load_balance class, we define pull or push
classes:

struct push_class:
int (*which_third_cpu);
struct cpumask * (*which_cpu_mask);

struct pull_class:
int (*skip);
int (*which_that_cpu);
struct task_struct * (*which_task);
struct cpumask* (*which_cpu_mask);

Last but not least, we currently configure domains through flags and
parameters. How about attaching push/pull classes directly to them as struct
members, so that each class is responsible specifically for the "well-being"
of the domain it is attached to?
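
A minimal userspace sketch of what such classes could look like once the
function pointers have signatures and a domain carries them as members.
Everything named demo_ here, including the argument lists and the
8-CPU mask, is an assumption made for illustration and not part of the
proposal.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 8

struct demo_task { int pid; };
struct demo_cpumask { unsigned int bits; }; /* bit n set: CPU n allowed */

/* Hypothetical signatures for the push hooks. */
struct demo_push_class {
	int (*which_third_cpu)(const struct demo_task *task, int that_cpu,
			       const struct demo_cpumask *mask);
	const struct demo_cpumask *(*which_cpu_mask)(int this_cpu);
};

/* Hypothetical signatures for the pull hooks (unused in this sketch). */
struct demo_pull_class {
	int (*skip)(int this_cpu);
	int (*which_that_cpu)(int this_cpu, const struct demo_cpumask *mask);
	struct demo_task *(*which_task)(int that_cpu);
	const struct demo_cpumask *(*which_cpu_mask)(int this_cpu);
};

/* A domain carries its own classes instead of flags. */
struct demo_domain {
	const struct demo_push_class *push;
	const struct demo_pull_class *pull;
};

/* Common push path: the class only decides scope and destination. */
static int demo_push(const struct demo_domain *sd,
		     const struct demo_task *task, int this_cpu, int that_cpu)
{
	const struct demo_cpumask *mask;

	if (!sd || !sd->push)
		return -1;

	mask = sd->push->which_cpu_mask(this_cpu);
	return sd->push->which_third_cpu(task, that_cpu, mask);
}

/* Example policy: lowest-numbered allowed CPU other than that_cpu. */
static const struct demo_cpumask demo_mask_low4 = { 0x0fu };

static const struct demo_cpumask *demo_mask_all(int this_cpu)
{
	(void)this_cpu;
	return &demo_mask_low4;
}

static int demo_first_other_cpu(const struct demo_task *task, int that_cpu,
				const struct demo_cpumask *mask)
{
	(void)task;
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (cpu != that_cpu && (mask->bits & (1u << cpu)))
			return cpu;
	return -1;
}

static const struct demo_push_class demo_push_first = {
	.which_third_cpu = demo_first_other_cpu,
	.which_cpu_mask = demo_mask_all,
};
```

The shared demo_push() path stays the same for every domain, and only the
class attached to the domain changes the policy.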

Thanks,
Yuyang


Re: [alsa-devel] [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread Nicolin Chen
On Thu, Mar 27, 2014 at 01:14:24AM +, Mark Brown wrote:
> On Wed, Mar 26, 2014 at 11:59:53AM +, David Laight wrote:
> > From: Nicolin Chen
> 
> > > + regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > > + regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
> 
> > Assuming these are 'write to clear' bits, you might want
> > to make the write (above) and all the traces (below)
> > conditional on the value being non-zero.
> 
> The trace is already conditional?  I'd also expect to see the driver
> only acknowledging sources it knows about and only reporting that the
> interrupt was handled if it saw one of them - right now all interrupts
> are unconditionally acknowledged.

Will revise it based on the comments from both of you.

Thank you.


> ___
> Alsa-devel mailing list
> alsa-de...@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel




Re: [PATCH v2 1/3] kmemleak: allow freeing internal objects after kmemleak was disabled

2014-03-26 Thread Li Zefan
(Just came back from travelling)

On 2014/3/22 7:37, Catalin Marinas wrote:
> Hi Li,
> 
> On 17 Mar 2014, at 04:07, Li Zefan  wrote:
>> Currently if kmemleak is disabled, the kmemleak objects can never be freed,
>> no matter if it's disabled by a user or due to fatal errors.
>>
>> Those objects can be a big waste of memory.
>>
>>  OBJS    ACTIVE   USE  OBJ SIZE  SLABS  OBJ/SLAB  CACHE SIZE  NAME
>> 1200264  1197433  99%  0.30K     46164  26        369312K     kmemleak_object
>>
>> With this patch, internal objects will be freed immediately if kmemleak is
>> disabled explicitly by a user. If it's disabled due to a kmemleak error,
>> The user will be informed, and then he/she can reclaim memory with:
>>
>>  # echo off > /sys/kernel/debug/kmemleak
>>
>> v2: use "off" handler instead of "clear" handler to do this, suggested
>>by Catalin.
> 
> I think there was a slight misunderstanding. My point was about "echo
> scan=off" before "echo off": they can just be squashed into the
> single "echo off" action.
> 

I'm not sure I understand correctly: you want the "off" handler to
stop the scan thread, but never to free kmemleak objects until the
user explicitly triggers the "clear" action, right?

> I would keep the “clear” part separately as per your first patch. I
> recall people asked in the past to still be able to analyse the reports
> even though kmemleak failed or was disabled.
> 



Re: [PATCH 10/11] Revert "serial: omap: unlock the port lock"

2014-03-26 Thread Peter Hurley

On 03/26/2014 10:10 PM, Felipe Balbi wrote:
> Hi,
>
> On Wed, Mar 26, 2014 at 08:39:11PM -0400, Peter Hurley wrote:
>> On 03/25/2014 02:28 PM, Tony Lindgren wrote:
>>> * Felipe Balbi  [140320 12:39]:
>>>> This reverts commit 0324a821029e1f54e7a7f8fed48693cfce42dc0e.
>>>>
>>>> That commit tried to fix a deadlock problem when using
>>>> hci_ldisc, but it turns out the bug was in hci_ldisc
>>>> all along where it was calling ->write() from within
>>>> ->write_wakeup() callback.
>>>>
>>>> The problem is that ->write_wakeup() was called with
>>>> port lock held and ->write() tried to grab the same
>>>> port lock.
>>>
>>> Should this and the next patch be earlier in the series
>>> as a fix for the v3.15-rc cycle? Should they be cc: stable
>>> as well?
>>
>> Well, right now the other fix has had _zero_ testing
>> so not really a -stable candidate just yet.
>
> how can you even say that ?

I misunderstood when you wrote:

On 03/20/2014 02:11 PM, Felipe Balbi wrote:
> here's a build-tested only patch which is waiting for testing from other
> colleagues who've got a platform to reproduce the problem:

and then the version I reviewed had no Tested-by: tags.

> Unless you work for some 3 letter acronym
> organizations, you have no clue about the fact that this was tested on a
> keystone 2 platform.

Ok.

> How else would we have found the issue to start with ?

Bug report?

Regards,
Peter Hurley



Re: [PATCH] random32: avoid attempt to late reseed if in the middle of seeding

2014-03-26 Thread Hannes Frederic Sowa
On Wed, Mar 26, 2014 at 07:35:01PM -0400, Sasha Levin wrote:
> On 03/26/2014 07:18 PM, Daniel Borkmann wrote:
> >On 03/26/2014 06:12 PM, Sasha Levin wrote:
> >>Commit 4af712e8df ("random32: add prandom_reseed_late() and call when
> >>nonblocking pool becomes initialized") has added a late reseed stage
> >>that happens as soon as the nonblocking pool is marked as initialized.
> >>
> >>This fails in the case that the nonblocking pool gets initialized
> >>during __prandom_reseed()'s call to get_random_bytes(). In that case
> >>we'd double back into __prandom_reseed() in an attempt to do a late
> >>reseed - deadlocking on 'lock' early on in the boot process.
> >>
> >>Instead, just avoid even waiting to do a reseed if a reseed is already
> >>occurring.
> >>
> >>Signed-off-by: Sasha Levin 
> >
> >Thanks for catching! (If you want Dave to pick it up, please also
> >Cc netdev.)
> >
> >Why not via spin_trylock_irqsave() ? Thus, if we already hold the
> >lock, we do not bother any longer with doing the same work twice
> >and just return.

I totally agree with Daniel: spin_trylock_irqsave() seems like the best
solution.
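
The trylock idea can be sketched in userspace C, with a C11 atomic_flag
standing in for the spinlock. This is only an analogy under stated
assumptions: it does not model the interrupt save/restore that
spin_trylock_irqsave() performs, and all demo_ names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static unsigned int demo_seed;
static unsigned int demo_reseed_count;

/* Returns true if the lock was free and is now held by the caller. */
static bool demo_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&demo_lock,
						  memory_order_acquire);
}

static void demo_unlock(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/*
 * Late reseed: if a reseed already holds the lock, do not wait for it and
 * do not repeat its work; simply return.
 */
static bool demo_reseed_late(unsigned int new_seed)
{
	if (!demo_trylock())
		return false;

	demo_seed = new_seed;
	demo_reseed_count++;
	demo_unlock();
	return true;
}
```

A caller that re-enters while the lock is held gets false back at once
instead of spinning on a lock that its own call chain owns.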

In case we really want to make sure that even early seeding doesn't
race with late seed and the pool is only filled by another CPU, we would
actually need per-cpu bools to get this case correct.

I really doubt this is worth the effort and wouldn't do that.

> Your code looks much better, I should really stop sending patches
> too early in the morning...
> 
> It's also worth adding lib/random32.c to the MAINTAINERS file, as my
> list of recipients is solely based on what get_maintainer.pl tells
> me to do (and I'm assuming that I'm not the last person who will be
> sending patches for this).

Would be a nice idea, especially because prandom_u32 changes are sensitive to
network security and should get reviewed there, too.

Greetings,

  Hannes



Re: [PATCH 09/11] bluetooth: hci_ldisc: fix deadlock condition

2014-03-26 Thread Peter Hurley

On 03/26/2014 10:09 PM, Felipe Balbi wrote:
>> I just noticed this patch wasn't addressed to Marcel;
>> seems like this should go through the bluetooth tree (but not
>> through bluetooth-next because it fixes an oops).
>
> read the archives:
>
> http://marc.info/?l=linux-bluetooth&m=139534449409583&w=2

Sorry. I did actually get Marcel's reply but Thunderbird
didn't parent the reply properly in my inbox and I forgot about it.

>> Marcel,
>>
>> You may want to build on top of this patch split handling;
>> I noticed some of the protocol drivers are calling
>> hci_uart_tx_wakeup() from work functions already (so don't
>> need to schedule another work...)
>
> I don't think that should be part of $subject, though.

I don't understand what you mean here.

Regards,
Peter Hurley



RE: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread li.xi...@freescale.com
> + regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> + regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);
> +
> + if (xcsr & FSL_SAI_CSR_WSF)
> + dev_dbg(dev, "isr: Start of Tx word detected\n");
> +
> + if (xcsr & FSL_SAI_CSR_SEF)
> + dev_dbg(dev, "isr: Tx Frame sync error detected\n");
> +
> + if (xcsr & FSL_SAI_CSR_FEF)
> + dev_dbg(dev, "isr: Transmit underrun detected\n");
> +

Actually, for the above three flags the ISR should write a logic 1 to
the field to clear the flag.


> + if (xcsr & FSL_SAI_CSR_FWF)
> + dev_dbg(dev, "isr: Enabled transmit FIFO is empty\n");
> +
> + if (xcsr & FSL_SAI_CSR_FRF)
> + dev_dbg(dev, "isr: Transmit FIFO watermark has been reached\n");
> +

Do these ones really need to be cleared manually?


Thanks,
--

Best Regards,
Xiubo



Re: [PATCH 10/11] Revert "serial: omap: unlock the port lock"

2014-03-26 Thread Felipe Balbi
Hi,

On Wed, Mar 26, 2014 at 08:39:11PM -0400, Peter Hurley wrote:
> On 03/25/2014 02:28 PM, Tony Lindgren wrote:
> >* Felipe Balbi  [140320 12:39]:
> >>This reverts commit 0324a821029e1f54e7a7f8fed48693cfce42dc0e.
> >>
> >>That commit tried to fix a deadlock problem when using
> >>hci_ldisc, but it turns out the bug was in hci_ldisc
> >>all along where it was calling ->write() from within
> >>->write_wakeup() callback.
> >>
> >>The problem is that ->write_wakeup() was called with
> >>port lock held and ->write() tried to grab the same
> >>port lock.
> >
> >Should this and the next patch be earlier in the series
> >as a fix for the v3.15-rc cycle? Should they be cc: stable
> >as well?
> 
> Well, right now the other fix has had _zero_ testing
> so not really a -stable candidate just yet.

how can you even say that ? Unless you work for some 3 letter acronym
organizations, you have no clue about the fact that this was tested on a
keystone 2 platform. How else would we have found the issue to start
with ?

-- 
balbi




Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 in tlb and alb mode.

2014-03-26 Thread zheng.li
Hi Jay,
What's your opinion about the new patch?

Thanks,
Zheng Li

On 2014-03-26 08:53, Ding Tianhong wrote:
> On 2014/3/25 16:36, zheng.li wrote:
>> On 2014-03-25 11:42, Ding Tianhong wrote:
>>> On 2014/3/25 11:00, Zheng Li wrote:
>>>> In bond mode tlb and alb, inactive slaves should keep inactive flag to
>>>> 1 to refuse to receive broadcast packets. Now, active slave send
>>>> broadcast packets (for example ARP requests) which will arrive inactive
>>>> slaves on same host from switch, but inactive slave's inactive flag is
>>>> zero that cause bridge receive the broadcast packets to produce a wrong
>>>> entry in forward table. Typical situation is domu send some ARP request
>>>> which go out from dom0 bond's active slave, then the ARP broadcast
>>>> request packets go back to inactive slave from switch, because the
>>>> inactive slave's inactive flag is zero, kernel will receive the packets
>>>> and pass them to bridge, that cause dom0's bridge map domu's MAC address
>>>> to port of bond, bridge should map domu's MAC to port of vif.
>>>>
>>>> Signed-off-by: Zheng Li
>>>> ---
>>>>  drivers/net/bonding/bond_main.c |2 +-
>>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>>> index e5628fc..8761df6 100644
>>>> --- a/drivers/net/bonding/bond_main.c
>>>> +++ b/drivers/net/bonding/bond_main.c
>>>> @@ -3062,7 +3062,7 @@ static int bond_open(struct net_device *bond_dev)
>>>>                 && (slave != bond->curr_active_slave)) {
>>>>                 bond_set_slave_inactive_flags(slave,
>>>>                                               BOND_SLAVE_NOTIFY_NOW);
>>>> -           } else {
>>>> +           } else if (!bond_is_lb(bond)) {
>>>>                 bond_set_slave_active_flags(slave,
>>>>                                             BOND_SLAVE_NOTIFY_NOW);
>>>>             }

>>> I think you did not fix the problem completely, the state monitor will 
>>> change the status for the slaves
>>> and the inactive slave still could receive the broadcast.
>>
>> Had tested, it can fix the issue, verified by our QA.
>> Default set slave of bond as inactive when add a slave to bond, when
>> link UP, just set one slave as current active slave and clear its
>> inactive flag, the inactive slave's inactive flag will keep the value of 1.
>>
>>
> Ok, I found that in the mii monitor, it will only change the backup state, no 
> problem,
> it looks good to me.
> 
> Ding
> 
>>>
>>> Regards
>>> Ding
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>>
>> .
>>
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 




Re: [PATCH 09/11] bluetooth: hci_ldisc: fix deadlock condition

2014-03-26 Thread Felipe Balbi
Hi,

On Wed, Mar 26, 2014 at 08:47:15PM -0400, Peter Hurley wrote:
> [ +to Marcel Holtmann ]
> 
> On 03/20/2014 03:30 PM, Felipe Balbi wrote:
> >LDISCs shouldn't call tty->ops->write() from within
> >->write_wakeup().
> >
> >->write_wakeup() is called with port lock taken and
> >IRQs disabled, tty->ops->write() will try to acquire
> >the same port lock and we will deadlock.
> >
> >Reviewed-by: Peter Hurley 
> >Reported-by: Huang Shijie 
> >Signed-off-by: Felipe Balbi 
> 
> I just noticed this patch wasn't addressed to Marcel;
> seems like this should go through the bluetooth tree (but not
> through bluetooth-next because it fixes an oops).

read the archives:

http://marc.info/?l=linux-bluetooth&m=139534449409583&w=2

> Marcel,
> 
> You may want to build on top of this patch split handling;
> I noticed some of the protocol drivers are calling
> hci_uart_tx_wakeup() from work functions already (so don't
> need to schedule another work...)

I don't think that should be part of $subject, though.

-- 
balbi




Massive read only kvm guests when backing file was missing

2014-03-26 Thread Alejandro Comisario
Hi List!
I hope someone can help me. We had a big issue in our cloud the other
day: a couple of our OpenStack regions (more than 2000 KVM guests on
qcow2) went to a read-only filesystem on the guest side because the
backing files directory (the OpenStack _base directory) was compromised
and the data was lost. Once we realized the data was lost, it took us 5
minutes to restore the backing files from backup, but by that time all
the KVM guests had received some kind of I/O error from the hypervisor
layer and had gone read-only on the root filesystem.

My question is: is there a way to hold the I/O operations against
the backing files (I thought those would be 99% READ operations) for
a little longer when the backing files are missing (no I/O possible)
but recoverable within minutes? I'm asking because I don't quite
understand what the process is and when it raises the error.

Any tip on how to achieve this, if possible, or information about how
backing files work on KVM, would be amazing.
Waiting for feedback!

kindest regards.
Alejandro Comisario


Re: [PATCH v2 02/03]: hwrng: create filler thread

2014-03-26 Thread H. Peter Anvin
There are a number of things wrong with this post, but I'll respond in detail 
when I get to a keyboard.

On March 26, 2014 6:11:53 PM PDT, Andy Lutomirski  wrote:
>[cc: Greg Price, might be working on this stuff]
>
>On Wed, Mar 26, 2014 at 6:03 PM, H. Peter Anvin  wrote:
>> I'm wondering more about the default.  We default to 50% for
>> arch_get_random_seed, and this is supposed to be the default for in
>> effect unverified hwrngs...
>
>TBH I'm highly skeptical of this kind of entropy estimation.
>/dev/random is IMO just silly, since you need to have very
>conservative entropy estimates for the concept to really work, and
>that ends up being hideously slow.  Also, in the /dev/random sense,
>most hardware RNGs have no entropy at all, since they're likely to be
>FIPS-approved DRBGs that don't have a real non-deterministic source.
>
>For the kernel's RNG to be secure, I think it should have the property
>that it still works if you rescale all the entropy estimates by any
>constant that's decently close to 1.
>
>If entropy estimates are systematically too low, then a naive
>implementation results in an excessively long window during early
>bootup in which /dev/urandom is completely insecure.
>
>If entropy estimates are systematically too high, then a naive
>implementation fails to do a catastrophic reseed, and the RNG can be
>brute-forced.
>
>So I think that the core code should do something along the lines of
>using progressively larger reseeds.  Since I think that /dev/random is
>silly, this means that we only really care about the extent to which
>"entropy" measures entropy conditioned on whatever an attacker can
>actually compute.  Since this could vary widely between devices (e.g.
>if your TPM is malicious), I think that the best we can do is to
>collect ~256 bits from everything available, shove it all in to the
>core together, and repeat.  For all I know, the core code already does
>this.
>
>The upshot is that the actual rescaling factor should barely matter.
>50% is probably fine.  So is 100% and 25%.  10% is probably asking for
>trouble during early boot if all you have is a TPM.
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


Re: [PATCH] staging: vme: fix memory leak in vme_user_probe()

2014-03-26 Thread DaeSeok Youn
2014-03-27 3:51 GMT+09:00 Aaron Sierra :
> - Original Message -
>> From: "Daeseok Youn" 
>> Sent: Tuesday, March 25, 2014 10:01:48 PM
>> Subject: [PATCH] staging: vme: fix memory leak in vme_user_probe()
>>
>>
>> If vme_master_request() returns NULL when it failed,
>> it need to free buffers for master.
>>
>> And also removes unreachable code in vme_user_probe().
>>
>> Signed-off-by: Daeseok Youn 
>> ---
>>  drivers/staging/vme/devices/vme_user.c |9 +++--
>>  1 files changed, 3 insertions(+), 6 deletions(-)
>
> Nice catches Daeseok. I don't maintain this driver, but I have some
> suggestions below.
>
>>
>> diff --git a/drivers/staging/vme/devices/vme_user.c
>> b/drivers/staging/vme/devices/vme_user.c
>> index 7927927..ffb4eee 100644
>> --- a/drivers/staging/vme/devices/vme_user.c
>> +++ b/drivers/staging/vme/devices/vme_user.c
>> @@ -776,7 +776,8 @@ static int vme_user_probe(struct vme_dev *vdev)
>>   image[i].kern_buf = kmalloc(image[i].size_buf, GFP_KERNEL);
>>   if (image[i].kern_buf == NULL) {
>>   err = -ENOMEM;
>> - goto err_master_buf;
>> + vme_master_free(image[i].resource);
>> + goto err_master;
>>   }
>>   }
>
> I think it would be nice to keep all of the cleanup under the err_master
> label.
Actually, I modeled this on what the "err_slave" path does. When buffer
allocation for a slave fails, it just calls
vme_slave_free(image[i].slave) and then cleans up under err_slave.

>
> That could be done by changing the kern_buf allocation in this part to
> a devm_kmalloc. Then devm handles the kern_buf freeing entirely.
I didn't know about devm_kmalloc(), I will check that function. Thanks!

>
>>
>> @@ -819,8 +820,6 @@ static int vme_user_probe(struct vme_dev *vdev)
>>
>>   return 0;
>>
>> - /* Ensure counter set correcty to destroy all sysfs devices */
>> - i = VME_DEVS;
>>  err_sysfs:
>>   while (i > 0) {
>>   i--;
>> @@ -830,12 +829,10 @@ err_sysfs:
>>
>>   /* Ensure counter set correcty to unalloc all master windows */
>>   i = MASTER_MAX + 1;
>> -err_master_buf:
>> - for (i = MASTER_MINOR; i < (MASTER_MAX + 1); i++)
>> - kfree(image[i].kern_buf);
>>  err_master:
>>   while (i > MASTER_MINOR) {
>>   i--;
>> + kfree(image[i].kern_buf);
>>   vme_master_free(image[i].resource);
>>   }
>
> Using devm_kmalloc as mentioned above, the while loop could be
> simplified to this:
>
> err_master:
> while (i >= MASTER_MINOR) {
> vme_master_free(image[i].resource);
> i--;
> }
That would be nice, but when vme_master_request() fails and we go to
err_master, image[i].resource is NULL, so vme_master_free() would
dereference a NULL pointer.

I think vme_master_free() and vme_slave_free() need to check for NULL;
with that, the code can be changed as you suggest.
Please check this for me. :-)

>
> If not moving to devm, this should be safe even though the first
> kern_buf may be NULL:
>
> err_master:
> while (i >= MASTER_MINOR) {
> kfree(image[i].kern_buf);
> vme_master_free(image[i].resource);
> i--;
> }
kfree() is OK, but vme_master_free() has the problem mentioned above.
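
The point about a NULL-tolerant free helper can be shown with a small
userspace model of the unwind loop. This is a sketch under stated
assumptions: malloc()/free() stand in for the VME resource calls, and the
demo_ names, the fail_* parameters and the index range are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MINOR 0
#define DEMO_MAX   3

struct demo_image {
	void *resource;
	void *kern_buf;
};

static int demo_resource_frees;

/* Like kfree(), this helper accepts NULL and does nothing in that case. */
static void demo_resource_free(void *res)
{
	if (!res)
		return;
	free(res);
	demo_resource_frees++;
}

/*
 * fail_res_at / fail_buf_at select the index at which the resource request
 * or the buffer allocation is made to fail (-1 for no injected failure).
 */
static int demo_probe(struct demo_image *image, int fail_res_at,
		      int fail_buf_at)
{
	int i;

	for (i = DEMO_MINOR; i <= DEMO_MAX; i++) {
		image[i].kern_buf = NULL;
		image[i].resource = (i == fail_res_at) ? NULL : malloc(1);
		if (!image[i].resource)
			goto err_master;

		image[i].kern_buf = (i == fail_buf_at) ? NULL : malloc(16);
		if (!image[i].kern_buf)
			goto err_master;
	}
	return 0;

err_master:
	/* Slot i may be half set up; both frees tolerate NULL. */
	while (i >= DEMO_MINOR) {
		free(image[i].kern_buf);
		demo_resource_free(image[i].resource);
		image[i].kern_buf = NULL;
		image[i].resource = NULL;
		i--;
	}
	return -1;
}
```

With the NULL check inside the helper, a single label covers both failure
points and the loop can start at the slot that failed.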

Thanks for the review.
Daeseok Youn.
>
> -Aaron
>
>>
>> --
>> 1.7.4.4
>>
>>
>>


Re: [PATCH v11 0/3] mmc: sdhci-msm: Add support for Qualcomm chipsets

2014-03-26 Thread Chris Ball
Hi,

On Wed, Mar 26 2014, Ulf Hansson wrote:
> On 26 March 2014 17:42, Georgi Djakov  wrote:
>> Hello Chris, Ulf,
>>
>> Do you have any comments on the patches?
>> The arch code that will use this driver is already in mainline. The
>> regulators support seem to be still on its way, but this driver also works
>> fine with dummy regulators.
>
> Looks good to me!
>
> For the complete patchset:
>
> Acked-by: Ulf Hansson 

Thanks Georgi and Ulf -- pushed all three patches to mmc-next for 3.15.

- Chris.
-- 
Chris Ball  


Re: [PATCH] ASoC: Add support for multi register mux

2014-03-26 Thread Mark Brown
On Tue, Mar 25, 2014 at 05:02:35PM -0700, Arun Shamanna Lakshmi wrote:

> + }
> + if (!match) {
> + dev_err(codec->dev, "ASoC: Failed to find matched enum value\n");
> + return -EINVAL;
> + } else
> + ucontrol->value.enumerated.item[0] = i;

Coding style nit: if one side of the if has braces both should.  Most of
this code could also use more blank lines.

> + for (reg_idx = 0; reg_idx < e->num_regs; reg_idx++) {
> + val = e->values[item * e->num_regs + reg_idx];
> + ret = snd_soc_update_bits_locked(codec, e->reg[reg_idx],
> + e->mask[reg_idx], val);
> + if (ret)
> + return ret;
> + }

So, this is a bit interesting.  It will update one register at a time
which means that we are likely to transiently set an invalid value
sometimes which might not make the hardware happy or may cause us to
write a valid value with undesirable consequences.  I'd expect to see
some handling of this, some combination of providing a safe value that
the hardware could be reset to prior to change and doing a bulk write to
all the registers simultaneously if we can (I know sometimes hardware
has special handling for atomic updates of multi-register values in a
single block transfer).




Re: [PATCH] ASoC: fsl_sai: Add isr to deal with error flag

2014-03-26 Thread Mark Brown
On Wed, Mar 26, 2014 at 11:59:53AM +, David Laight wrote:
> From: Nicolin Chen

> > +   regmap_read(sai->regmap, FSL_SAI_TCSR, &xcsr);
> > +   regmap_write(sai->regmap, FSL_SAI_TCSR, xcsr);

> Assuming these are 'write to clear' bits, you might want
> to make the write (above) and all the traces (below)
> conditional on the value being non-zero.

The trace is already conditional?  I'd also expect to see the driver
only acknowledging sources it knows about and only reporting that the
interrupt was handled if it saw one of them - right now all interrupts
are unconditionally acknowledged.




Re: [PATCH v2 02/03]: hwrng: create filler thread

2014-03-26 Thread Andy Lutomirski
[cc: Greg Price, might be working on this stuff]

On Wed, Mar 26, 2014 at 6:03 PM, H. Peter Anvin  wrote:
> I'm wondering more about the default.  We default to 50% for 
> arch_get_random_seed, and this is supposed to be the default for in effect 
> unverified hwrngs...

TBH I'm highly skeptical of this kind of entropy estimation.
/dev/random is IMO just silly, since you need to have very
conservative entropy estimates for the concept to really work, and
that ends up being hideously slow.  Also, in the /dev/random sense,
most hardware RNGs have no entropy at all, since they're likely to be
FIPS-approved DRBGs that don't have a real non-deterministic source.

For the kernel's RNG to be secure, I think it should have the property
that it still works if you rescale all the entropy estimates by any
constant that's decently close to 1.

If entropy estimates are systematically too low, then a naive
implementation results in an excessively long window during early
bootup in which /dev/urandom is completely insecure.

If entropy estimates are systematically too high, then a naive
implementation fails to do a catastrophic reseed, and the RNG can be
brute-forced.

So I think that the core code should do something along the lines of
using progressively larger reseeds.  Since I think that /dev/random is
silly, this means that we only really care about the extent to which
"entropy" measures entropy conditioned on whatever an attacker can
actually compute.  Since this could vary widely between devices (e.g.
if your TPM is malicious), I think that the best we can do is to
collect ~256 bits from everything available, shove it all in to the
core together, and repeat.  For all I know, the core code already does
this.

The upshot is that the actual rescaling factor should barely matter.
50% is probably fine.  So is 100% and 25%.  10% is probably asking for
trouble during early boot if all you have is a TPM.
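
One possible reading of "progressively larger reseeds" is that each
reseed waits for twice as many claimed bits as the previous one. The
doubling schedule, the percentage model of estimate error and every
demo_ name below are assumptions made for this sketch; none of it is
taken from the kernel's RNG code.

```c
#include <stdint.h>

#define DEMO_MAX_ROUNDS 32u

/*
 * Count how many reseeds happen before one of them carries at least
 * target_true_bits of real entropy. Thresholds start at
 * first_claimed_bits and double every round. true_percent says how much
 * real entropy each claimed bit carries, so 50 means the estimates are
 * twice too high. Returns 0 if the target is not reached within
 * DEMO_MAX_ROUNDS.
 */
static unsigned int demo_reseeds_until_secure(uint64_t first_claimed_bits,
					      unsigned int true_percent,
					      uint64_t target_true_bits)
{
	uint64_t claimed = first_claimed_bits;
	unsigned int round;

	for (round = 1; round <= DEMO_MAX_ROUNDS; round++) {
		if (claimed * true_percent / 100 >= target_true_bits)
			return round;
		claimed *= 2;
	}
	return 0;
}
```

Starting at 32 claimed bits with a 128-bit target, exact estimates reach
the target on the third reseed, estimates that are twice too high on the
fourth, and estimates that are four times too high on the fifth. Under
this model a constant-factor error costs only a few extra reseeds, which
is in line with the argument that the exact rescaling factor should
barely matter.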

--Andy


Re: [PATCH] input: misc: Add driver for Intel Bay Trail GPIO buttons

2014-03-26 Thread Zhu, Lejun
On 3/27/2014 4:20 AM, Dmitry Torokhov wrote:
> On Wed, Mar 26, 2014 at 05:04:04PM +, One Thousand Gnomes wrote:
>> On Wed, 26 Mar 2014 13:01:36 +0800
>> "Zhu, Lejun"  wrote:
>>
>>> This patch adds support for the GPIO buttons on some Intel Bay Trail
>>> tablets originally running Windows 8. The ACPI description of these
>>> buttons follows "Windows ACPI Design Guide for SoC Platforms".
>>
>> I'm not sure calling it "Baytrail" is right here - it's in theory a
>> generic interface so probably should be named accordingly
>>
>> Otherwise looks good to me.
> 
> It uses static devices in non-board code - if I unbind and rebind PNP
> device that produces gpio-keys platform devices driver core will not be
> happy.
> 
> Thanks.
> 

Alan, I will think of a better name for it. This is supposed to work for
other "win8 ready" tablets as well, just all I have today are Baytrail
tablets.

Dmitry, thank you for pointing out the bug. I'll fix it and submit again.

Thanks.
Lejun


Re: [alsa-devel] [PATCH] ASoC: Add support for multi register mux

2014-03-26 Thread Mark Brown
On Wed, Mar 26, 2014 at 08:38:47PM +0100, Lars-Peter Clausen wrote:
> On 03/26/2014 01:02 AM, Arun Shamanna Lakshmi wrote:

> The way you describe this it seems to me that a value array for this kind of
> mux would look like.

> 0x, 0x, 0x0001
> 0x, 0x, 0x0002
> 0x, 0x, 0x0003
> 0x, 0x, 0x0004
> 0x, 0x, 0x0008

> That seems to be extremely tedious. If the MUX uses a one hot encoding how
> about storing the index of the bit in the values array and use (1 << value)
> when writing the value to the register?

Or hide it behind utility macros at any rate; I've got this horrible
feeling that as soon as we have this people will notice that they have
more standard enums that are splatted over multiple registers (I think
from memory I've seen them but they got fudged).
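
For illustration only, the one-hot idea quoted above (store the bit index in the values array and write 1 << value to the register) could be sketched in plain C roughly as follows. The helper names onehot_encode() and onehot_decode() are hypothetical and are not part of the ASoC API; this is a userspace sketch, not the proposed kernel code.

```c
#include <stdio.h>

/* Hypothetical helper: turn a mux item index into its one-hot register value. */
static unsigned int onehot_encode(unsigned int index)
{
	return 1u << index;
}

/*
 * Hypothetical helper: recover the item index from a register value.
 * Returns -1 if the value is not one-hot (zero or more than one bit set).
 */
static int onehot_decode(unsigned int regval)
{
	int index = 0;

	if (regval == 0 || (regval & (regval - 1)))
		return -1;

	while (!(regval & 1u)) {
		regval >>= 1;
		index++;
	}
	return index;
}
```

With helpers like these the values array would only need to hold small indices, and a utility macro could hide the conversion as suggested above.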

> [...]
> >  /* enumerated kcontrol */
> >  struct soc_enum {

> There doesn't actually seem to be any code that is shared between normal enums and
> wide enums. This patch doubles the size of the soc_enum struct, how about
> having a separate struct for wide enums?

Or if they are going to share the same struct then they shouldn't be
duplicating the code and instead keying off num_regs (which was my first
thought earlier on when I saw the separate functions).  We definitely
shouldn't be sharing the data without also sharing the code I think.

> >-int reg;
> >+int reg[SOC_ENUM_MAX_REGS];
> > unsigned char shift_l;
> > unsigned char shift_r;
> > unsigned int items;
> >-unsigned int mask;
> >+unsigned int mask[SOC_ENUM_MAX_REGS];

> If you make mask and reg pointers instead of arrays this should be much more
> flexible and not be limited to 3 registers.

Right, which pushes towards not sharing.  Though with an arrayified mask
the specification by shift syntax would get to be slightly obscure (is
it relative to the enums or the registers?) so perhaps we don't want to
do that at all if we've got specification by shift.  If we do that then
we could get away with a variable length array at the end of the struct
though I think that may be painful for the static declarations.  Someone
would need to look to see what works.




Re: [PATCH] Route kbd LEDs through the generic LEDs layer

2014-03-26 Thread Pali Rohár
2014-03-16 11:19 GMT+01:00 Samuel Thibault :
> Pali Rohár, le Sun 16 Mar 2014 11:16:25 +0100, a écrit :
>> Hello, what happened with this patch? Is there any problem with accepting it?
>
> Dmitry finding time to review it, I guess.
>
> Samuel

Dmitry, can you look and review this patch?

-- 
Pali Rohár
pali.ro...@gmail.com


Re: [PATCH 0/2] pwm-backlight: switch to gpiod interface (part 1)

2014-03-26 Thread Jingoo Han
On Thursday, March 27, 2014 9:09 AM, Bryan Wu wrote:
> On Tue, Mar 25, 2014 at 10:01 PM, Alexandre Courbot  wrote:
> > Ping Thierry, can you have a look at this series? It is quite similar
> > to the same change you merged for panel-simple (although I cannot see
> > it in -next either).
> >
> 
> I think Jingoo can help review this as well. Jingoo, can you help
> review? Actually it looks good to me.

Sorry, I cannot review this pwm-backlight code, I don't have
knowledge about pwm-backlight and gpiod. I think that Thierry
will review this correctly.

Best regards,
Jingoo Han

> 
> -Bryan
> 
> > On Thu, Feb 27, 2014 at 2:53 PM, Alexandre Courbot wrote:
> >> These two patches initiate the switch of the pwm-backlight driver to
> >> the gpiod GPIO interface, as it considerably simplifies the code.
> >>
> >> For compatibility with current users of the driver, it is still possible
> >> to pass the enable GPIO number as platform data. Two platforms are still
> >> relying on this feature (pxa/palmtc and shmobile/armadillo800eva) which
> >> will be removed as soon as its last users are switched to GPIO mapping
> >> tables.
> >>
> >> Alexandre Courbot (2):
> >>   ARM: SAMSUNG: remove gpio flags in dev-backlight
> >>   pwm-backlight: switch to gpiod interface
> >>
> >>  arch/arm/plat-samsung/dev-backlight.c |  2 -
> >>  drivers/video/backlight/pwm_bl.c  | 72 +++
> >>  include/linux/pwm_backlight.h |  5 +--
> >>  3 files changed, 32 insertions(+), 47 deletions(-)
> >>
> >> --
> >> 1.9.0



Re: [PATCH v2 02/03]: hwrng: create filler thread

2014-03-26 Thread H. Peter Anvin
I'm wondering more about the default.  We default to 50% for 
arch_get_random_seed, and this is supposed to be the default for in effect 
unverified hwrngs...

On March 26, 2014 5:50:09 PM PDT, Andy Lutomirski  wrote:
>On 03/21/2014 07:33 AM, Torsten Duwe wrote:
>> This can be viewed as the in-kernel equivalent of hwrngd;
>> like FUSE it is a good thing to have a mechanism in user land,
>> but for some reasons (simplicity, secrecy, integrity, speed)
>> it may be better to have it in kernel space.
>
>Nice.
>
>
>[...]
>
>>  
>>  static struct hwrng *current_rng;
>> +static struct task_struct *hwrng_fill;
>>  static LIST_HEAD(rng_list);
>>  static DEFINE_MUTEX(rng_mutex);
>>  static int data_avail;
>> -static u8 *rng_buffer;
>> +static u8 *rng_buffer, *rng_fillbuf;
>> +static unsigned short derating_current = 700; /* an arbitrary 70% */
>> +
>> +module_param(derating_current, ushort, 0644);
>> +MODULE_PARM_DESC(derating_current,
>> + "current hwrng entropy estimation per mill");
>
>As an electrical engineer (sort of), I can't read this without thinking
>you're talking about the amount by which the current is derated.  For
>example, a 14-50 electrical outlet is rated to 50 Amps.  If you use it
>continuously for a long time, though, the current is derated to 40
>Amps.
>
>Shouldn't this be called credit_derating or, even better,
>credit_per_1000bits?
>
>Also, "per mill" is just obscure enough that someone might think it
>means "per million".
>
>
>> +
>> +static void start_khwrngd(void);
>>  
>>  static size_t rng_buffer_size(void)
>>  {
>> @@ -62,9 +71,18 @@ static size_t rng_buffer_size(void)
>>  
>>  static inline int hwrng_init(struct hwrng *rng)
>>  {
>> +int err;
>> +
>>  if (!rng->init)
>>  return 0;
>> -return rng->init(rng);
>> +err = rng->init(rng);
>> +if (err)
>> +return err;
>> +
>> +if (derating_current > 0 && !hwrng_fill)
>> +start_khwrngd();
>> +
>
>Why the check for derating > 0?  Paranoid users may want zero credit,
>but they probably still want the thing to run.
>
>> +return 0;
>>  }
>>  
>>  static inline void hwrng_cleanup(struct hwrng *rng)
>> @@ -300,6 +318,36 @@ err_misc_dereg:
>>  goto out;
>>  }
>>  
>> +static int hwrng_fillfn(void *unused)
>> +{
>> +long rc;
>> +
>> +while (!kthread_should_stop()) {
>> +if (!current_rng)
>> +break;
>> +rc = rng_get_data(current_rng, rng_fillbuf,
>> +  rng_buffer_size(), 1);
>> +if (rc <= 0) {
>> +pr_warn("hwrng: no data available\n");
>
>ratelimit (heavily), please.
>
>Also, would it make sense to round-robin all hwrngs?  Even better:
>collect entropy from each one and add them to the pool all at once.  If
>so, would it make sense for the derating to be a per-rng parameter.  For
>example, if there's a sysfs class, it could go in there.
>
>Finally, there may be hwrngs like TPMs that are amazingly slow.  What
>happens if the RNG is so slow that it becomes the bottleneck?  Should
>this thing back off?  Using the TPM at 100% utilization seems silly when
>there's a heavy entropy consumer, especially since reading 256 bits from
>the TPM once is probably just about as secure as reading from it
>continuously.
>
>
>Also, with my quantum hat on, thanks for doing this in a way that isn't
>gratuitously insecure against quantum attack.  128-bit reseeds are
>simply too small if your adversary has a large quantum computer :)
>
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


Re: [PATCH RESEND for ARM-SoC 0/2] ARM: STi: NOR Flash support

2014-03-26 Thread Arnd Bergmann
On Friday 21 March 2014, Lee Jones wrote:
> As requested by Arnd:
> 
> ARM-SoC Maintainers,
> 
> Please apply these two patches directly to ARM-SoC for inclusion
> into the v3.15 merge window. All maintainer Acks are applied.
> 
> Kind regards,

Applied to next/dt branch. I had a trivial conflict against the
addition of the network device nodes, which I resolved on this
branch.

Thanks,

Arnd


Re: Thoughts on credential switching

2014-03-26 Thread Andy Lutomirski
On Wed, Mar 26, 2014 at 5:42 PM, Serge Hallyn  wrote:
> Quoting Andy Lutomirski (l...@amacapital.net):
>> Hi various people who care about user-space NFS servers and/or
>> security-relevant APIs.
>>
>> I propose the following set of new syscalls:
>>
>> int credfd_create(unsigned int flags): returns a new credfd that
>> corresponds to current's creds.
>>
>> int credfd_activate(int fd, unsigned int flags): Change current's
>> creds to match the creds stored in fd.  To be clear, this changes both
>> the "subjective" and "objective" (aka real_cred and cred) because
>> there aren't any real semantics for what happens when userspace code
>> runs with real_cred != cred.
>
> Is there a URL where I can find the motivation, and why the existing
> features can't be used?

It was an LSF talk.  There's this, though:

http://thread.gmane.org/gmane.linux.file-systems/79234

Essentially, it's a performance problem.  knfsd has override_creds,
and it can cache struct cred.  But userspace doing the same thing
(i.e. impersonating a user) has to do setresuid, setresgid, and
setgroups, which kills performance, since it results in something like
five RCU callbacks per impersonation round-trip.
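
As a rough sketch of what that userspace sequence looks like today, assuming a glibc system: the helper name impersonate() and its argument layout are made up for illustration, and a real server such as Ganesha may well do this differently (for example per thread rather than per process). Only the effective IDs are changed here so that the caller can switch back afterwards.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <unistd.h>
#include <grp.h>

/*
 * Hypothetical helper: take on the effective identity (uid, gid and
 * optionally the supplementary groups) of the user being served.
 * Groups go first, then the gid, then the uid, because the process
 * may lose the privilege needed for the earlier steps once the uid
 * has changed.  Returns 0 on success, -1 on failure (errno is set).
 */
static int impersonate(uid_t uid, gid_t gid,
		       const gid_t *groups, size_t ngroups)
{
	/* Pass groups == NULL to leave the supplementary groups alone. */
	if (groups && setgroups(ngroups, groups) != 0)
		return -1;
	if (setresgid((gid_t)-1, gid, (gid_t)-1) != 0)
		return -1;
	if (setresuid((uid_t)-1, uid, (uid_t)-1) != 0)
		return -1;
	return 0;
}
```

Each request needs this sequence once to switch and once more to switch back, which is the round-trip cost described above.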

Windows has something called an "impersonation token" that more or
less solves this problem.

>
> My guess would be, uid 10 is root in a container, and you want
> him to be able to send a request to a root daemon on the host, on
> behalf of uid 15 in the container, over which 10 has
> privilege.  (Which is sort of what we need for the cgmanager proxy;
> there we do it by checking that 10 is mapped to 0 in
> the requestor's uid_map, and that 15 is mapped in that uid_map)
> The credfd would be useful there, especially combined with a
> credfd_access(credfd, fd, perms) call.

This requires uid 15 to send 10 a credfd, right?

In general, making this safe probably requires a way to make it safe
to call credfd_activate on an untrusted credfd.  You don't want to
expose yourself to ptrace or proc attacks from the credential
provider.  Nor do you want to suddenly get hit by rlimits, perhaps.
So maybe there really does need to be a separate subjective and
objective state.  Ugh.

Ganesha can avoid this because the caller of credfd_create is trusted.

>
> But I'd like to hear exactly how nfs and ganesha would use these.

knfsd will presumably not use it.  Ganesha will, and Jim can probably
comment further.  Samba might want to use it, too.

>
> What all would be associated with the credfd?  Everything that is
> in the kernel cred?

I assume so.

--Andy

>
>> Rules:
>>
>>  - credfd_activate fails (-EINVAL) if fd is not a credfd.
>>  - credfd_activate fails (-EPERM) if the fd's userns doesn't match
>> current's userns.  credfd_activate is not intended to be a substitute
>> for setns.
>>  - credfd_activate will fail (-EPERM) if LSM does not allow the
>> switch.  This probably needs to be a new selinux action --
>> dyntransition is too restrictive.
>>
>>
>> Optional:
>>  - credfd_create always sets cloexec, because the alternative is silly.
>>  - credfd_activate fails (-EINVAL) if dumpable.  This is because we
>> don't want a privileged daemon to be ptraced while impersonating
>> someone else.
>>  - optional: both credfd_create and credfd_activate fail if
>> !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
>>
>> The first question: does this solve Ganesha's problem?
>>
>> The second question: is this safe?  I can see two major concerns.  The
>> bigger concern is that having these syscalls available will allow
>> users to exploit things that were previously secure.  For example,
>> maybe some configuration assumes that a task running as uid==1 can't
>> switch to uid==2, even with uid 2's consent.  Similar issues happen
>> with capabilities.  If CAP_SYS_ADMIN is not required, then this is no
>> longer really true.
>>
>> Alternatively, something running as uid == 0 with heavy capability
>> restrictions in a mount namespace (but not a uid namespace) could pass
>> a credfd out of the namespace.  This could break things like Docker
>> pretty badly.  CAP_SYS_ADMIN guards against this to some extent.  But
>> I think that Docker is already totally screwed if a Docker root task
>> can receive an O_DIRECTORY or O_PATH fd out of the container, so it's
>> not entirely clear that the situation is any worse, even without
>> requiring CAP_SYS_ADMIN.
>>
>> The second concern is that it may be difficult to use this correctly.
>> There's a reason that real_cred and cred exist, but it's not really
>> well set up for being used.
>>
>> As a simple way to stay safe, Ganesha could only use credfds that have
>> real_uid == 0.
>>
>> --Andy



-- 
Andy Lutomirski
AMA Capital Management, LLC


[PATCH v5 1/5] Documentation: dt: Add Kona PWM binding

2014-03-26 Thread Tim Kryger
Add the binding description for the Kona PWM controller found on Broadcom's
mobile SoCs.

Signed-off-by: Tim Kryger 
Reviewed-by: Alex Elder 
Reviewed-by: Markus Mayer 
---
 .../devicetree/bindings/pwm/bcm-kona-pwm.txt   |   21 
 1 file changed, 21 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pwm/bcm-kona-pwm.txt

diff --git a/Documentation/devicetree/bindings/pwm/bcm-kona-pwm.txt b/Documentation/devicetree/bindings/pwm/bcm-kona-pwm.txt
new file mode 100644
index 000..8eae9fe
--- /dev/null
+++ b/Documentation/devicetree/bindings/pwm/bcm-kona-pwm.txt
@@ -0,0 +1,21 @@
+Broadcom Kona PWM controller device tree bindings
+
+This controller has 6 channels.
+
+Required Properties :
+- compatible: should contain "brcm,kona-pwm"
+- reg: physical base address and length of the controller's registers
+- clocks: phandle + clock specifier pair for the external clock
+- #pwm-cells: Should be 3. See pwm.txt in this directory for a
+  description of the cells format.
+
+Refer to clocks/clock-bindings.txt for generic clock consumer properties.
+
+Example:
+
+pwm: pwm@3e01a000 {
+   compatible = "brcm,bcm11351-pwm", "brcm,kona-pwm";
+   reg = <0x3e01a000 0xc4>;
+   clocks = <&pwm_clk>;
+   #pwm-cells = <3>;
+};
-- 
1.7.9.5



[PATCH v5 0/5] Add Broadcom Kona PWM Support

2014-03-26 Thread Tim Kryger
This series introduces the driver for the Kona PWM controller found in
Broadcom mobile SoCs like bcm281xx and updates the device tree and the
defconfig to enable use of this hardware on the bcm28155 AP board.

Changes since v4:
  - Added in real polarity support
  - Labeled trigger bits as such rather than use the name from hw docs
  - Listed unusual hardware characteristics at the top of the file
  - Removed default from Kconfig and update defconfig accordingly
  - Always use unsigned int for temporary register values

Changes since v3:
  - Removed polarity support for now
  - Cleaned up whitespace issues, shortened some variable names
  - Use container_of instead of dev_get_drvdata to get private data
  - Removed workaround for PWM framework bug
  - Reworded some binding documentation

Changes since v2:
  - SoC DTS file updated to use real clock's phandle + specifier
  - Toggle smooth mode off during apply so new settings take immediately

Changes since v1:
  - Fixed up macros to be clearer and more complete
  - Corrected spelling and punctuation mistakes
  - Added support for polarity
  - Made peripheral clock use more efficient
  - Made prescale and duty computation clearer
  - Moved Makefile addition to keep alphabetical
  - Split complex lines into multiple steps

Dependencies:
The "ARM: dts: Declare the PWM for bcm11351 (bcm281xx)" patch depends
upon "ARM: dts: bcm281xx: define real clocks" which is queued up in
for-next of arm-soc. See https://lkml.org/lkml/2014/2/14/451

Tim Kryger (5):
  Documentation: dt: Add Kona PWM binding
  pwm: kona: Introduce Kona PWM controller support
  ARM: dts: Declare the PWM for bcm11351 (bcm281xx)
  ARM: dts: Enable the PWM for bcm28155 AP board
  ARM: bcm_defconfig: Enable PWM and Backlight

 .../devicetree/bindings/pwm/bcm-kona-pwm.txt   |   21 ++
 arch/arm/boot/dts/bcm11351.dtsi|8 +
 arch/arm/boot/dts/bcm28155-ap.dts  |4 +
 arch/arm/configs/bcm_defconfig |3 +
 drivers/pwm/Kconfig|9 +
 drivers/pwm/Makefile   |1 +
 drivers/pwm/pwm-bcm-kona.c |  319 
 7 files changed, 365 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pwm/bcm-kona-pwm.txt
 create mode 100644 drivers/pwm/pwm-bcm-kona.c

-- 
1.7.9.5



Re: [PATCH v2 02/03]: hwrng: create filler thread

2014-03-26 Thread Andy Lutomirski
On 03/21/2014 07:33 AM, Torsten Duwe wrote:
> This can be viewed as the in-kernel equivalent of hwrngd;
> like FUSE it is a good thing to have a mechanism in user land,
> but for some reasons (simplicity, secrecy, integrity, speed)
> it may be better to have it in kernel space.

Nice.


[...]

>  
>  static struct hwrng *current_rng;
> +static struct task_struct *hwrng_fill;
>  static LIST_HEAD(rng_list);
>  static DEFINE_MUTEX(rng_mutex);
>  static int data_avail;
> -static u8 *rng_buffer;
> +static u8 *rng_buffer, *rng_fillbuf;
> +static unsigned short derating_current = 700; /* an arbitrary 70% */
> +
> +module_param(derating_current, ushort, 0644);
> +MODULE_PARM_DESC(derating_current,
> +  "current hwrng entropy estimation per mill");

As an electrical engineer (sort of), I can't read this without thinking
you're talking about the amount by which the current is derated.  For
example, a 14-50 electrical outlet is rated to 50 Amps.  If you use it
continuously for a long time, though, the current is derated to 40 Amps.

Shouldn't this be called credit_derating or, even better,
credit_per_1000bits?

Also, "per mill" is just obscure enough that someone might think it
means "per million".
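
To make the unit concrete, here is a minimal sketch of how a "credit per 1000 bits" parameter might translate into credited entropy. The function name entropy_credit_bits() is hypothetical, and the arithmetic is an assumption about what the parameter is meant to express, not a copy of what the patch does.

```c
#include <stddef.h>

/*
 * Hypothetical helper: how many bits of entropy to credit for nbytes
 * of hwrng output when credit_per_1000bits bits are credited for
 * every 1000 bits read.  Values above 1000 are clamped, since a
 * source cannot supply more entropy than it supplies data.
 */
static unsigned int entropy_credit_bits(size_t nbytes,
					unsigned int credit_per_1000bits)
{
	if (credit_per_1000bits > 1000)
		credit_per_1000bits = 1000;

	return (unsigned int)((nbytes * 8 * credit_per_1000bits) / 1000);
}
```

With the patch's default of 700, a 32-byte read would be credited with 179 bits under this reading.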


> +
> +static void start_khwrngd(void);
>  
>  static size_t rng_buffer_size(void)
>  {
> @@ -62,9 +71,18 @@ static size_t rng_buffer_size(void)
>  
>  static inline int hwrng_init(struct hwrng *rng)
>  {
> + int err;
> +
>   if (!rng->init)
>   return 0;
> - return rng->init(rng);
> + err = rng->init(rng);
> + if (err)
> + return err;
> +
> + if (derating_current > 0 && !hwrng_fill)
> + start_khwrngd();
> +

Why the check for derating > 0?  Paranoid users may want zero credit,
but they probably still want the thing to run.

> + return 0;
>  }
>  
>  static inline void hwrng_cleanup(struct hwrng *rng)
> @@ -300,6 +318,36 @@ err_misc_dereg:
>   goto out;
>  }
>  
> +static int hwrng_fillfn(void *unused)
> +{
> + long rc;
> +
> + while (!kthread_should_stop()) {
> + if (!current_rng)
> + break;
> + rc = rng_get_data(current_rng, rng_fillbuf,
> +   rng_buffer_size(), 1);
> + if (rc <= 0) {
> + pr_warn("hwrng: no data available\n");

ratelimit (heavily), please.
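
In the kernel this would normally be done with the existing pr_warn_ratelimited() helper. To show what kind of behaviour is being asked for, here is a small userspace model of an interval/burst limiter. The struct and function names are invented for this sketch and do not match the kernel's implementation.

```c
/*
 * Hypothetical model of a message rate limiter: at most `burst`
 * messages are let through per `interval` time units; the rest are
 * counted as suppressed.
 */
struct ratelimit {
	long interval;      /* length of one window, in caller-defined ticks */
	int burst;          /* messages allowed per window */
	long window_start;  /* tick at which the current window began */
	int emitted;        /* messages let through in this window */
	int suppressed;     /* messages dropped so far */
};

/* Returns 1 if a message may be printed at time `now`, 0 otherwise. */
static int ratelimit_ok(struct ratelimit *rl, long now)
{
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->emitted = 0;
	}
	if (rl->emitted < rl->burst) {
		rl->emitted++;
		return 1;
	}
	rl->suppressed++;
	return 0;
}
```

A heavily limited warning would use a long interval and a burst of one.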

Also, would it make sense to round-robin all hwrngs?  Even better:
collect entropy from each one and add them to the pool all at once.  If
so, would it make sense for the derating to be a per-rng parameter.  For
example, if there's a sysfs class, it could go in there.

Finally, there may be hwrngs like TPMs that are amazingly slow.  What
happens if the RNG is so slow that it becomes the bottleneck?  Should
this thing back off?  Using the TPM at 100% utilization seems silly when
there's a heavy entropy consumer, especially since reading 256 bits from
the TPM once is probably just about as secure as reading from it
continuously.


Also, with my quantum hat on, thanks for doing this in a way that isn't
gratuitously insecure against quantum attack.  128-bit reseeds are
simply too small if your adversary has a large quantum computer :)


--Andy


[PATCH v5 2/5] pwm: kona: Introduce Kona PWM controller support

2014-03-26 Thread Tim Kryger
Add support for the six-channel Kona PWM controller found on Broadcom
mobile SoCs like bcm281xx.

Signed-off-by: Tim Kryger 
Reviewed-by: Alex Elder 
Reviewed-by: Markus Mayer 
---
 drivers/pwm/Kconfig|9 ++
 drivers/pwm/Makefile   |1 +
 drivers/pwm/pwm-bcm-kona.c |  319 
 3 files changed, 329 insertions(+)
 create mode 100644 drivers/pwm/pwm-bcm-kona.c

diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
index 22f2f28..777d87dd 100644
--- a/drivers/pwm/Kconfig
+++ b/drivers/pwm/Kconfig
@@ -62,6 +62,15 @@ config PWM_ATMEL_TCB
  To compile this driver as a module, choose M here: the module
  will be called pwm-atmel-tcb.
 
+config PWM_BCM_KONA
+   tristate "Kona PWM support"
+   depends on ARCH_BCM_MOBILE
+   help
+ Generic PWM framework driver for Broadcom Kona PWM block.
+
+ To compile this driver as a module, choose M here: the module
+ will be called pwm-bcm-kona.
+
 config PWM_BFIN
tristate "Blackfin PWM support"
depends on BFIN_GPTIMERS
diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
index d8906ec..7413090 100644
--- a/drivers/pwm/Makefile
+++ b/drivers/pwm/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_PWM_SYSFS) += sysfs.o
 obj-$(CONFIG_PWM_AB8500)   += pwm-ab8500.o
 obj-$(CONFIG_PWM_ATMEL)+= pwm-atmel.o
 obj-$(CONFIG_PWM_ATMEL_TCB)+= pwm-atmel-tcb.o
+obj-$(CONFIG_PWM_BCM_KONA) += pwm-bcm-kona.o
 obj-$(CONFIG_PWM_BFIN) += pwm-bfin.o
 obj-$(CONFIG_PWM_EP93XX)   += pwm-ep93xx.o
 obj-$(CONFIG_PWM_IMX)  += pwm-imx.o
diff --git a/drivers/pwm/pwm-bcm-kona.c b/drivers/pwm/pwm-bcm-kona.c
new file mode 100644
index 000..ee8a59d
--- /dev/null
+++ b/drivers/pwm/pwm-bcm-kona.c
@@ -0,0 +1,319 @@
+/*
+ * Copyright (C) 2014 Broadcom Corporation
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * The Kona PWM has some unusual characteristics.  Here are the main points.
+ *
+ * 1) There is no disable bit and the hardware docs advise programming a zero
+ *duty to achieve output equivalent to that of a normal disable operation.
+ *
+ * 2) Changes to prescale, duty, period, and polarity do not take effect until
+ *a subsequent rising edge of the trigger bit.
+ *
+ * 3) If the smooth bit and trigger bit are both low, the output is a constant
+ *high signal.  Otherwise, the earlier waveform continues to be output.
+ *
+ * 4) If the smooth bit is set on the rising edge of the trigger bit, output
+ *will transition to the new settings on a period boundary (which could be
+ *seconds away).  If the smooth bit is clear, new settings will be applied
+ *as soon as possible (the hardware always has a 400ns delay).
+ *
+ * 5) When the external clock that feeds the PWM is disabled, output is pegged
+ *high or low depending on its state at that exact instant.
+ */
+
+#define PWM_CONTROL_OFFSET (0x)
+#define PWM_CONTROL_SMOOTH_SHIFT(chan) (24 + (chan))
+#define PWM_CONTROL_TYPE_SHIFT(chan)   (16 + (chan))
+#define PWM_CONTROL_POLARITY_SHIFT(chan)   (8 + (chan))
+#define PWM_CONTROL_TRIGGER_SHIFT(chan)(chan)
+
+#define PRESCALE_OFFSET(0x0004)
+#define PRESCALE_SHIFT(chan)   ((chan) << 2)
+#define PRESCALE_MASK(chan)(0x7 << PRESCALE_SHIFT(chan))
+#define PRESCALE_MIN   (0x)
+#define PRESCALE_MAX   (0x0007)
+
+#define PERIOD_COUNT_OFFSET(chan)  (0x0008 + ((chan) << 3))
+#define PERIOD_COUNT_MIN   (0x0002)
+#define PERIOD_COUNT_MAX   (0x00ff)
+
+#define DUTY_CYCLE_HIGH_OFFSET(chan)   (0x000c + ((chan) << 3))
+#define DUTY_CYCLE_HIGH_MIN(0x)
+#define DUTY_CYCLE_HIGH_MAX(0x00ff)
+
+struct kona_pwmc {
+   struct pwm_chip chip;
+   void __iomem *base;
+   struct clk *clk;
+};
+
+static inline struct kona_pwmc *to_kona_pwmc(struct pwm_chip *_chip)
+{
+   return container_of(_chip, struct kona_pwmc, chip);
+}
+
+static void kona_pwmc_apply_settings(struct kona_pwmc *kp, unsigned int chan)
+{
+   unsigned int value = readl(kp->base + PWM_CONTROL_OFFSET);
+
+   /* Clear trigger bit but set smooth bit to maintain old output */
+   value |= 
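
The patch text is cut off at this point in the archive. As a standalone illustration of the per-channel field layout that the PRESCALE_SHIFT() and PRESCALE_MASK() macros above describe (a 3-bit field every 4 bits), the following userspace sketch updates one channel's prescale field in a register value. The helper set_prescale() is hypothetical and is not taken from the driver.

```c
#include <stdint.h>

/* Same layout as in the patch: each channel owns a 3-bit field, 4 bits apart. */
#define PRESCALE_SHIFT(chan)	((chan) << 2)
#define PRESCALE_MASK(chan)	(0x7u << PRESCALE_SHIFT(chan))

/*
 * Hypothetical helper: return `reg` with the prescale field of
 * channel `chan` replaced by `prescale`, leaving the other channels'
 * fields untouched.  Bits of `prescale` above the field are dropped.
 */
static uint32_t set_prescale(uint32_t reg, unsigned int chan,
			     unsigned int prescale)
{
	reg &= ~PRESCALE_MASK(chan);
	reg |= (prescale << PRESCALE_SHIFT(chan)) & PRESCALE_MASK(chan);
	return reg;
}
```

In a driver the result would be written back with writel() after a readl() of the prescale register.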

[PATCH v5 4/5] ARM: dts: Enable the PWM for bcm28155 AP board

2014-03-26 Thread Tim Kryger
Mark the PWM as enabled on the bcm28155 AP board.

Signed-off-by: Tim Kryger 
Reviewed-by: Alex Elder 
Reviewed-by: Markus Mayer 
---
 arch/arm/boot/dts/bcm28155-ap.dts |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/bcm28155-ap.dts b/arch/arm/boot/dts/bcm28155-ap.dts
index 5ff2382..37c72eb 100644
--- a/arch/arm/boot/dts/bcm28155-ap.dts
+++ b/arch/arm/boot/dts/bcm28155-ap.dts
@@ -66,6 +66,10 @@
status = "okay";
};
 
+   pwm: pwm@3e01a000 {
+   status = "okay";
+   };
+
usbotg: usb@3f12 {
status = "okay";
};
-- 
1.7.9.5



[PATCH v5 5/5] ARM: bcm_defconfig: Enable PWM and Backlight

2014-03-26 Thread Tim Kryger
Enable PWM drivers and the PWM-based backlight driver.

Signed-off-by: Tim Kryger 
Reviewed-by: Alex Elder 
Reviewed-by: Markus Mayer 
---
 arch/arm/configs/bcm_defconfig |3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/configs/bcm_defconfig b/arch/arm/configs/bcm_defconfig
index 3b2b7bd..593df57 100644
--- a/arch/arm/configs/bcm_defconfig
+++ b/arch/arm/configs/bcm_defconfig
@@ -88,6 +88,7 @@ CONFIG_FB=y
 CONFIG_BACKLIGHT_LCD_SUPPORT=y
 CONFIG_LCD_CLASS_DEVICE=y
 CONFIG_BACKLIGHT_CLASS_DEVICE=y
+CONFIG_BACKLIGHT_PWM=y
 # CONFIG_USB_SUPPORT is not set
 CONFIG_MMC=y
 CONFIG_MMC_UNSAFE_RESUME=y
@@ -101,6 +102,8 @@ CONFIG_LEDS_TRIGGERS=y
 CONFIG_LEDS_TRIGGER_TIMER=y
 CONFIG_LEDS_TRIGGER_HEARTBEAT=y
 CONFIG_LEDS_TRIGGER_DEFAULT_ON=y
+CONFIG_PWM=y
+CONFIG_PWM_BCM_KONA=y
 CONFIG_EXT4_FS=y
 CONFIG_EXT4_FS_POSIX_ACL=y
 CONFIG_EXT4_FS_SECURITY=y
-- 
1.7.9.5



[PATCH v5 3/5] ARM: dts: Declare the PWM for bcm11351 (bcm281xx)

2014-03-26 Thread Tim Kryger
Add the device tree node for the PWM on bcm11351 SoCs.

Signed-off-by: Tim Kryger 
Reviewed-by: Alex Elder 
Reviewed-by: Markus Mayer 
---
 arch/arm/boot/dts/bcm11351.dtsi |8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/bcm11351.dtsi b/arch/arm/boot/dts/bcm11351.dtsi
index 64d069b..6b05ae6 100644
--- a/arch/arm/boot/dts/bcm11351.dtsi
+++ b/arch/arm/boot/dts/bcm11351.dtsi
@@ -193,6 +193,14 @@
status = "disabled";
};
 
+   pwm: pwm@3e01a000 {
+   compatible = "brcm,bcm11351-pwm", "brcm,kona-pwm";
+   reg = <0x3e01a000 0xcc>;
+   clocks = <&slave_ccu BCM281XX_SLAVE_CCU_PWM>;
+   #pwm-cells = <3>;
+   status = "disabled";
+   };
+
clocks {
#address-cells = <1>;
#size-cells = <1>;
-- 
1.7.9.5



Re: [PATCH 09/11] bluetooth: hci_ldisc: fix deadlock condition

2014-03-26 Thread Peter Hurley

[ +to Marcel Holtmann ]

On 03/20/2014 03:30 PM, Felipe Balbi wrote:

LDISCs shouldn't call tty->ops->write() from within
->write_wakeup().

->write_wakeup() is called with port lock taken and
IRQs disabled, tty->ops->write() will try to acquire
the same port lock and we will deadlock.

Reviewed-by: Peter Hurley 
Reported-by: Huang Shijie 
Signed-off-by: Felipe Balbi 


I just noticed this patch wasn't addressed to Marcel;
seems like this should go through the bluetooth tree (but not
through bluetooth-next because it fixes an oops).

Marcel,

You may want to build on top of this patch split handling;
I noticed some of the protocol drivers are calling
hci_uart_tx_wakeup() from work functions already (so don't
need to schedule another work...)

Regards,
Peter Hurley
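
The deferral described in the commit message can be modelled in a few lines of single-threaded C. The struct and function names below are invented for this sketch; the two flags stand in for HCI_UART_SENDING and HCI_UART_TX_WAKEUP, and work_queued stands in for schedule_work(). It only shows the control flow, not the locking or the real tty calls.

```c
#include <stdbool.h>

/* Hypothetical model of the transmit state kept per UART. */
struct uart_model {
	bool sending;      /* stands in for HCI_UART_SENDING */
	bool wakeup;       /* stands in for HCI_UART_TX_WAKEUP */
	bool work_queued;  /* stands in for a pending schedule_work() */
	int queued;        /* packets waiting to be written */
	int written;       /* packets handed to the tty */
};

/*
 * Safe to call from the write_wakeup path: it never writes, it only
 * records that a write is wanted and defers the actual work.
 */
static void tx_wakeup(struct uart_model *u)
{
	if (u->sending) {
		u->wakeup = true;
		return;
	}
	u->sending = true;
	u->work_queued = true;
}

/* Runs later in worker context, where taking the port lock is fine. */
static void write_work(struct uart_model *u)
{
	do {
		u->wakeup = false;
		while (u->queued > 0) {
			u->queued--;
			u->written++;
		}
	} while (u->wakeup);	/* another wakeup arrived meanwhile: go again */

	u->sending = false;
	u->work_queued = false;
}
```

The point of the split is that nothing is written from inside tx_wakeup(), so the caller can hold the port lock without deadlocking.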


---
  drivers/bluetooth/hci_ldisc.c | 24 +++-
  drivers/bluetooth/hci_uart.h  |  1 +
  2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
index 6e06f6f..77af52f 100644
--- a/drivers/bluetooth/hci_ldisc.c
+++ b/drivers/bluetooth/hci_ldisc.c
@@ -118,10 +118,6 @@ static inline struct sk_buff *hci_uart_dequeue(struct hci_uart *hu)

  int hci_uart_tx_wakeup(struct hci_uart *hu)
  {
-   struct tty_struct *tty = hu->tty;
-   struct hci_dev *hdev = hu->hdev;
-   struct sk_buff *skb;
-
if (test_and_set_bit(HCI_UART_SENDING, &hu->tx_state)) {
set_bit(HCI_UART_TX_WAKEUP, &hu->tx_state);
return 0;
@@ -129,6 +125,22 @@ int hci_uart_tx_wakeup(struct hci_uart *hu)

BT_DBG("");

+   schedule_work(&hu->write_work);
+
+   return 0;
+}
+
+static void hci_uart_write_work(struct work_struct *work)
+{
+   struct hci_uart *hu = container_of(work, struct hci_uart, write_work);
+   struct tty_struct *tty = hu->tty;
+   struct hci_dev *hdev = hu->hdev;
+   struct sk_buff *skb;
+
+   /* REVISIT: should we cope with bad skbs or ->write() returning
+* an error value ?
+*/
+
  restart:
clear_bit(HCI_UART_TX_WAKEUP, &hu->tx_state);

@@ -153,7 +165,6 @@ restart:
goto restart;

clear_bit(HCI_UART_SENDING, &hu->tx_state);
-   return 0;
  }

  static void hci_uart_init_work(struct work_struct *work)
@@ -281,6 +292,7 @@ static int hci_uart_tty_open(struct tty_struct *tty)
tty->receive_room = 65536;

INIT_WORK(&hu->init_ready, hci_uart_init_work);
+   INIT_WORK(&hu->write_work, hci_uart_write_work);

spin_lock_init(&hu->rx_lock);

@@ -318,6 +330,8 @@ static void hci_uart_tty_close(struct tty_struct *tty)
if (hdev)
hci_uart_close(hdev);

+   cancel_work_sync(&hu->write_work);
+
if (test_and_clear_bit(HCI_UART_PROTO_SET, &hu->flags)) {
if (hdev) {
if (test_bit(HCI_UART_REGISTERED, &hu->flags))
diff --git a/drivers/bluetooth/hci_uart.h b/drivers/bluetooth/hci_uart.h
index fffa61f..12df101 100644
--- a/drivers/bluetooth/hci_uart.h
+++ b/drivers/bluetooth/hci_uart.h
@@ -68,6 +68,7 @@ struct hci_uart {
unsigned long   hdev_flags;

struct work_struct  init_ready;
+   struct work_struct  write_work;

struct hci_uart_proto   *proto;
void*priv;
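
For readers following the locking argument, here is a minimal userspace sketch of the handshake the patch relies on: the wakeup path only sets bits and queues work, and the worker is the only place that writes. It is not the driver code. struct tx_model, tx_wakeup() and tx_write_work() are made-up names, C11 atomics stand in for the kernel bit operations, and a counter stands in for schedule_work().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins for the HCI_UART_SENDING / HCI_UART_TX_WAKEUP bits. */
enum { TX_SENDING = 1u << 0, TX_WAKEUP = 1u << 1 };

struct tx_model {
	atomic_uint tx_state;	/* models hu->tx_state */
	int pending;		/* frames waiting to be written */
	int written;		/* frames the worker has pushed out */
	int scheduled;		/* times the worker was queued */
};

/* Models hci_uart_tx_wakeup(): never writes, only requests a worker run. */
static bool tx_wakeup(struct tx_model *m)
{
	/* Like test_and_set_bit(SENDING): was a sender already active? */
	if (atomic_fetch_or(&m->tx_state, TX_SENDING) & TX_SENDING) {
		/* Yes: leave a note so the active sender loops once more. */
		atomic_fetch_or(&m->tx_state, TX_WAKEUP);
		return false;
	}
	m->scheduled++;		/* stands in for schedule_work() */
	return true;
}

/* Models hci_uart_write_work(): the only place that performs the write. */
static void tx_write_work(struct tx_model *m)
{
	do {
		atomic_fetch_and(&m->tx_state, ~(unsigned int)TX_WAKEUP);
		while (m->pending > 0) {
			m->pending--;
			m->written++;
		}
		/* Like test_bit(TX_WAKEUP): did a wakeup arrive meanwhile? */
	} while (atomic_load(&m->tx_state) & TX_WAKEUP);

	atomic_fetch_and(&m->tx_state, ~(unsigned int)TX_SENDING);
}
```

Because tx_wakeup() never calls the write path itself, a caller that holds the port lock (as ->write_wakeup() does) cannot end up re-taking that lock through ->write().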



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Thoughts on credential switching

2014-03-26 Thread Serge Hallyn
Quoting Andy Lutomirski (l...@amacapital.net):
> Hi various people who care about user-space NFS servers and/or
> security-relevant APIs.
> 
> I propose the following set of new syscalls:
> 
> int credfd_create(unsigned int flags): returns a new credfd that
> corresponds to current's creds.
> 
> int credfd_activate(int fd, unsigned int flags): Change current's
> creds to match the creds stored in fd.  To be clear, this changes both
> the "subjective" and "objective" (aka real_cred and cred) because
> there aren't any real semantics for what happens when userspace code
> runs with real_cred != cred.

Is there a URL where I can find the motivation, and why the existing
features can't be used?

My guess would be, uid 10 is root in a container, and you want
him to be able to send a request to a root daemon on the host, on
behalf of uid 15 in the container, over which 10 has
privilege.  (Which is sort of what we need for the cgmanager proxy;
there we do it by checking that 10 is mapped to 0 in
the requestor's uid_map, and that 15 is mapped in that uid_map)
The credfd would be useful there, especially combined with a
credfd_access(credfd, fd, perms) call.
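
The uid_map check mentioned above could look roughly like the sketch below. It is not the cgmanager code. uid_map_lookup() and may_act_for() are made-up names, the input is assumed to be well-formed text in the /proc/<pid>/uid_map format ("inside outside count" per line), and both uids are taken to be the values the host-side daemon sees, which is my reading of the description.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Translate a host-side uid into the namespace-side uid using text in the
 * format of /proc/<pid>/uid_map: one "inside outside count" triple per line.
 * Returns true and stores the result in *ns_uid when the uid is mapped.
 */
static bool uid_map_lookup(const char *map, unsigned long host_uid,
			   unsigned long *ns_uid)
{
	const char *line = map;

	while (line && *line) {
		unsigned long inside, outside, count;

		if (sscanf(line, "%lu %lu %lu", &inside, &outside, &count) == 3 &&
		    host_uid >= outside && host_uid - outside < count) {
			*ns_uid = inside + (host_uid - outside);
			return true;
		}
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return false;
}

/*
 * The requestor must be root inside its own namespace, and the uid it
 * wants to act for must be mapped in that namespace at all.
 */
static bool may_act_for(const char *requestor_uid_map,
			unsigned long requestor_uid, unsigned long target_uid)
{
	unsigned long ns_uid;

	if (!uid_map_lookup(requestor_uid_map, requestor_uid, &ns_uid) ||
	    ns_uid != 0)
		return false;
	return uid_map_lookup(requestor_uid_map, target_uid, &ns_uid);
}
```

With a map of "0 10 1" followed by "1 15 100", host uid 10 is root in the namespace and may act for host uid 15, but not the other way around.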

But I'd like to hear exactly how nfs and ganesha would use these.

What all would be associated with the credfd?  Everything that is
in the kernel cred?

> Rules:
> 
>  - credfd_activate fails (-EINVAL) if fd is not a credfd.
>  - credfd_activate fails (-EPERM) if the fd's userns doesn't match
> current's userns.  credfd_activate is not intended to be a substitute
> for setns.
>  - credfd_activate will fail (-EPERM) if LSM does not allow the
> switch.  This probably needs to be a new selinux action --
> dyntransition is too restrictive.
> 
> 
> Optional:
>  - credfd_create always sets cloexec, because the alternative is silly.
>  - credfd_activate fails (-EINVAL) if dumpable.  This is because we
> don't want a privileged daemon to be ptraced while impersonating
> someone else.
>  - optional: both credfd_create and credfd_activate fail if
> !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
> 
> The first question: does this solve Ganesha's problem?
> 
> The second question: is this safe?  I can see two major concerns.  The
> bigger concern is that having these syscalls available will allow
> users to exploit things that were previously secure.  For example,
> maybe some configuration assumes that a task running as uid==1 can't
> switch to uid==2, even with uid 2's consent.  Similar issues happen
> with capabilities.  If CAP_SYS_ADMIN is not required, then this is no
> longer really true.
> 
> Alternatively, something running as uid == 0 with heavy capability
> restrictions in a mount namespace (but not a uid namespace) could pass
> a credfd out of the namespace.  This could break things like Docker
> pretty badly.  CAP_SYS_ADMIN guards against this to some extent.  But
> I think that Docker is already totally screwed if a Docker root task
> can receive an O_DIRECTORY or O_PATH fd out of the container, so it's
> not entirely clear that the situation is any worse, even without
> requiring CAP_SYS_ADMIN.
> 
> The second concern is that it may be difficult to use this correctly.
> There's a reason that real_cred and cred exist, but it's not really
> well set up for being used.
> 
> As a simple way to stay safe, Ganesha could only use credfds that have
> real_uid == 0.
> 
> --Andy


Re: Bug 71331 - mlock yields processor to lower priority process

2014-03-26 Thread Andy Lutomirski
On 03/21/2014 07:50 AM, jimmie.da...@l-3com.com wrote:
> 
> 
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
> kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
> 
> On Fri, 2014-03-21 at 14:01 +, jimmie.da...@l-3com.com wrote:
> 
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
> 
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
> 
> -Mike
> 
> Two options.
> 
> #1. Return with a status value of EAGAIN.
> 
> or 
> 
> #2.  Don't return until you can do it.
> 
> If SCHED_FIFO is used, and mlock() is called, the intention of the user
> is very clear.  Run this task until it is completed or it blocks (and
> until a bit ago, mlock() did not block).
> 
> SCHED_FIFO users don't care about fairness.  They want the system to do
> what it is told.

I use mlock in real-time processes, but I do it in a separate thread.

Seriously, though, what do you expect the kernel to do?  When you call
mlock on a page that isn't present, the kernel will *read* that page.
mlock will, therefore, block until the IO finishes.

Some time around 3.9, the behavior changed a little bit: IIRC mlock used
to hold mmap_sem while sleeping.  Or maybe just mmap with MCL_FUTURE did
that.  In any case, the mlock code is less lock-happy than it was.  Is
it possible that you have two threads, and the non-mlock-calling thread
got blocked behind mlock, so it looked better?
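
One common way to keep that blocking out of the time-critical path is to lock the working set during setup, before the SCHED_FIFO loop starts. A minimal sketch, assuming a fixed-size static pool; rt_pool, rt_pool_prepare() and rt_pool_release() are made-up names and the pool size is arbitrary:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Working set for the time-critical path; the size is an arbitrary example. */
static unsigned char rt_pool[16 * 1024];

/*
 * Lock the pool during setup. mlock() may sleep here while pages are
 * brought in, which is acceptable before the real-time loop starts.
 * Returns 0 on success or a negative errno value (for example -ENOMEM
 * when RLIMIT_MEMLOCK is too small).
 */
static int rt_pool_prepare(void)
{
	/* Linux rounds the start address down to a page boundary itself. */
	if (mlock(rt_pool, sizeof(rt_pool)) != 0)
		return -errno;
	return 0;
}

/* Undo rt_pool_prepare() at shutdown. */
static int rt_pool_release(void)
{
	return munlock(rt_pool, sizeof(rt_pool)) != 0 ? -errno : 0;
}
```

This does not make mlock() itself non-blocking; it only moves the wait to a point where being descheduled is harmless.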

--Andy


Re: [PATCH 10/11] Revert "serial: omap: unlock the port lock"

2014-03-26 Thread Peter Hurley

On 03/25/2014 02:28 PM, Tony Lindgren wrote:
> * Felipe Balbi  [140320 12:39]:
>> This reverts commit 0324a821029e1f54e7a7f8fed48693cfce42dc0e.
>>
>> That commit tried to fix a deadlock problem when using
>> hci_ldisc, but it turns out the bug was in hci_ldisc
>> all along where it was calling ->write() from within
>> ->write_wakeup() callback.
>>
>> The problem is that ->write_wakeup() was called with
>> port lock held and ->write() tried to grab the same
>> port lock.
>
> Should this and the next patch be earlier in the series
> as a fix for the v3.15-rc cycle? Should they be cc: stable
> as well?


Well, right now the other fix has had _zero_ testing
so not really a -stable candidate just yet.



Re: [PATCH] mm: convert some level-less printks to pr_*

2014-03-26 Thread Christoph Lameter
On Wed, 26 Mar 2014, Mitchel Humpherys wrote:

> printk is meant to be used with an associated log level. There are some
> instances of printk scattered around the mm code where the log level is
> missing. Add a log level and adhere to suggestions by
> scripts/checkpatch.pl by moving to the pr_* macros.

Acked-by: Christoph Lameter 



Thoughts on credential switching

2014-03-26 Thread Andy Lutomirski
Hi various people who care about user-space NFS servers and/or
security-relevant APIs.

I propose the following set of new syscalls:

int credfd_create(unsigned int flags): returns a new credfd that
corresponds to current's creds.

int credfd_activate(int fd, unsigned int flags): Change current's
creds to match the creds stored in fd.  To be clear, this changes both
the "subjective" and "objective" (aka real_cred and cred) because
there aren't any real semantics for what happens when userspace code
runs with real_cred != cred.

Rules:

 - credfd_activate fails (-EINVAL) if fd is not a credfd.
 - credfd_activate fails (-EPERM) if the fd's userns doesn't match
current's userns.  credfd_activate is not intended to be a substitute
for setns.
 - credfd_activate will fail (-EPERM) if LSM does not allow the
switch.  This probably needs to be a new selinux action --
dyntransition is too restrictive.


Optional:
 - credfd_create always sets cloexec, because the alternative is silly.
 - credfd_activate fails (-EINVAL) if dumpable.  This is because we
don't want a privileged daemon to be ptraced while impersonating
someone else.
 - optional: both credfd_create and credfd_activate fail if
!ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
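
To make the rule ordering concrete, here is a small userspace model of the checks listed above. Nothing in it exists in the kernel: the structs, their fields and credfd_activate_check() are hypothetical, the LSM and capability decisions are reduced to booleans, and the errno for the capability rule is my guess since the proposal does not name one.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these types exist in the kernel. */
struct model_userns { int id; };

struct model_file {
	bool is_credfd;				/* made by credfd_create()? */
	const struct model_userns *userns;	/* userns of the stored creds */
};

struct model_task {
	const struct model_userns *userns;	/* current's userns */
	bool dumpable;				/* optional rule: refuse if set */
	bool has_cap_sys_admin;			/* optional rule: capability check */
	bool lsm_allows;			/* verdict of the LSM hook */
};

/*
 * Apply the proposed rules in the order they are listed. Returns 0 when
 * the switch would be allowed, otherwise a negative errno. The errno for
 * the capability rule is not specified in the proposal; -EPERM is a guess.
 */
static int credfd_activate_check(const struct model_task *cur,
				 const struct model_file *f)
{
	if (!f->is_credfd)
		return -EINVAL;
	if (f->userns != cur->userns)
		return -EPERM;
	if (!cur->lsm_allows)
		return -EPERM;
	if (cur->dumpable)
		return -EINVAL;
	if (!cur->has_cap_sys_admin)
		return -EPERM;
	return 0;
}
```

Writing it out this way shows that a non-credfd descriptor is rejected before any permission question is asked, so -EINVAL never leaks information about the caller's privileges.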

The first question: does this solve Ganesha's problem?

The second question: is this safe?  I can see two major concerns.  The
bigger concern is that having these syscalls available will allow
users to exploit things that were previously secure.  For example,
maybe some configuration assumes that a task running as uid==1 can't
switch to uid==2, even with uid 2's consent.  Similar issues happen
with capabilities.  If CAP_SYS_ADMIN is not required, then this is no
longer really true.

Alternatively, something running as uid == 0 with heavy capability
restrictions in a mount namespace (but not a uid namespace) could pass
a credfd out of the namespace.  This could break things like Docker
pretty badly.  CAP_SYS_ADMIN guards against this to some extent.  But
I think that Docker is already totally screwed if a Docker root task
can receive an O_DIRECTORY or O_PATH fd out of the container, so it's
not entirely clear that the situation is any worse, even without
requiring CAP_SYS_ADMIN.

The second concern is that it may be difficult to use this correctly.
There's a reason that real_cred and cred exist, but it's not really
well set up for being used.

As a simple way to stay safe, Ganesha could only use credfds that have
real_uid == 0.

--Andy


Re: [PATCH 5/5] mmc: host: omap_hsmmc: set max_blk_size correctly

2014-03-26 Thread Felipe Balbi
Hi,

On Wed, Mar 26, 2014 at 07:04:50PM -0500, Felipe Balbi wrote:
> @@ -1867,6 +1879,37 @@ static inline struct omap_mmc_platform_data
>  }
>  #endif
>  
> +static void omap_hsmmc_set_max_blk_size(struct omap_hsmmc_host *host)
> +{
> + struct mmc_host *mmc = host->mmc;
> +
> + if (of_device_is_compatible(host->dev->of_node, "ti,omap4-hsmmc")) {
> + u32 mem;
> + u32 reg;
> +
> + reg = omap_hsmmc_read_no_offset(host, OMAP_HSMMC_HL_HWINFO);
> + mem = OMAP_HSMMC_HL_HWINFO_MEM_SIZE(reg);
> +
> + switch (mem) {
> + case 1:
> + mmc->max_blk_size = 512;
> + break;
> + case 2:
> + mmc->max_blk_size = 1024;
> + break;
> + case 4:
> + /* FALLTHROUGH */
> + case 8:
> + /* FALLTHROUGH */
> + default:
> + mmc->max_blk_size = 2048;
> + break;
> + }
> + } else {
> + mmc->max_blk_size = 512;   /* Block Length at max can be 1024 */

looks like here, we could read CAPA register to figure out if older
devices support bigger block sizes. According to TRM, omap3 should
support 1024 just fine.
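
Roughly what I have in mind, as an untested sketch. It assumes CAPA carries a two-bit maximum block length field at bits 17:16 encoded as 0 = 512, 1 = 1024, 2 = 2048, the way the SD host controller capabilities register does; that layout needs checking against the TRM. The CAPA_MBL_* names and capa_to_max_blk_size() are made up.

```c
#include <stdint.h>

/* Assumed field position, mirroring the SD host controller layout. */
#define CAPA_MBL_SHIFT	16
#define CAPA_MBL_MASK	(0x3u << CAPA_MBL_SHIFT)

/*
 * Decode a raw CAPA value into a maximum block size in bytes.
 * Encodings 0, 1, 2 mean 512, 1024, 2048; the reserved value 3 falls
 * back to 512, the conservative size the driver already uses.
 */
static unsigned int capa_to_max_blk_size(uint32_t capa)
{
	unsigned int mbl = (capa & CAPA_MBL_MASK) >> CAPA_MBL_SHIFT;

	if (mbl > 2)
		return 512;
	return 512u << mbl;
}
```

The else branch could then pass OMAP_HSMMC_READ(host, CAPA) through this helper instead of hardcoding 512.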

-- 
balbi




Re: patch fix for intel sdd s3500 series on Sil3512 controller

2014-03-26 Thread David F.
Yes.  The drive was found fine on other controllers I tried it on.

On Mon, Mar 24, 2014 at 8:27 AM, Bartlomiej Zolnierkiewicz
 wrote:
>
> + linux-ide mailing list on Cc:
>
> On Monday, March 24, 2014 02:15:58 PM One Thousand Gnomes wrote:
>> On Sat, 22 Mar 2014 16:32:54 -0700
>> "David F."  wrote:
>>
>> > Hi,
>> >
>> > It appears if nIEN is set all polling-type I/O fails.  After an
>> > attempt, future non-polled communication also fails.  This patch
>> > allows it to work.  Not sure if any spin lock protection would be
>> > needed or the system already handles the device existence for the
>> > generated irq with polling enabled.  The handler already ignored the
>> > irq if the queued command was a polling type.
>>
>> Does it behave plugged into a different controller without the hacks ?
>>
>> Alan
>


Re: [PATCH 0/2] pwm-backlight: switch to gpiod interface (part 1)

2014-03-26 Thread Bryan Wu
On Tue, Mar 25, 2014 at 10:01 PM, Alexandre Courbot  wrote:
> Ping Thierry, can you have a look at this series? It is quite similar
> to the same change you merged for panel-simple (although I cannot see
> it in -next either).
>

I think Jingoo can help review this as well. Jingoo, can you help
review it? It actually looks good to me.

-Bryan

> On Thu, Feb 27, 2014 at 2:53 PM, Alexandre Courbot wrote:
>> These two patches initiate the switch of the pwm-backlight driver to
>> the gpiod GPIO interface, as it considerably simplifies the code.
>>
>> For compatibility with current users of the driver, it is still possible
>> to pass the enable GPIO number as platform data. Two platforms are still
>> relying on this feature (pxa/palmtc and shmobile/armadillo800eva) which
>> will be removed as soon as its last users are switched to GPIO mapping
>> tables.
>>
>> Alexandre Courbot (2):
>>   ARM: SAMSUNG: remove gpio flags in dev-backlight
>>   pwm-backlight: switch to gpiod interface
>>
>>  arch/arm/plat-samsung/dev-backlight.c |  2 -
>>  drivers/video/backlight/pwm_bl.c  | 72 
>> +++
>>  include/linux/pwm_backlight.h |  5 +--
>>  3 files changed, 32 insertions(+), 47 deletions(-)
>>
>> --
>> 1.9.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/5] mmc: host: omap_hsmmc: introduce new accessor functions

2014-03-26 Thread Felipe Balbi
we introduce new accessors which provide for register
access with and without offsets.

This is just to make sure newer versions of the IP
can access the new registers prepended at the beginning
of the address space.

Signed-off-by: Felipe Balbi 
---
 drivers/mmc/host/omap_hsmmc.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index d46f768..e596c6a 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -211,6 +211,42 @@ struct omap_hsmmc_host {
struct  omap_mmc_platform_data  *pdata;
 };
 
+static inline int _omap_hsmmc_read(struct omap_hsmmc_host *host,
+   u32 reg, bool offset)
+{
+   return readl(host->base + reg + (offset ? host->reg_offset : 0));
+}
+
+static inline void _omap_hsmmc_write(struct omap_hsmmc_host *host,
+   u32 reg, u32 val, bool offset)
+{
+   writel(val, host->base + reg + (offset ? host->reg_offset : 0));
+}
+
+static inline int omap_hsmmc_read_offset(struct omap_hsmmc_host *host,
+   u32 reg)
+{
+   return _omap_hsmmc_read(host, reg, true);
+}
+
+static inline void omap_hsmmc_write_offset(struct omap_hsmmc_host *host,
+   u32 reg, u32 val)
+{
+   _omap_hsmmc_write(host, reg, val, true);
+}
+
+static inline int omap_hsmmc_read_no_offset(struct omap_hsmmc_host *host,
+   u32 reg)
+{
+   return _omap_hsmmc_read(host, reg, false);
+}
+
+static inline void omap_hsmmc_write_no_offset(struct omap_hsmmc_host *host,
+   u32 reg, u32 val)
+{
+   _omap_hsmmc_write(host, reg, val, false);
+}
+
 struct omap_mmc_of_data {
u32 reg_offset;
u8 controller_flags;
-- 
1.9.1.286.g5172cb3



[PATCH 2/5] mmc: host: omap_hsmmc: add reg_offset field

2014-03-26 Thread Felipe Balbi
by saving reg_offset inside our host structure
we can ioremap the correct area, make use of
resource_size() and make sure newer versions
of the IP have access to the new set of registers
which were added back in OMAP4.

Signed-off-by: Felipe Balbi 
---
 drivers/mmc/host/omap_hsmmc.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index a8f1e08..d46f768 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -152,10 +152,10 @@
  * MMC Host controller read/write API's
  */
 #define OMAP_HSMMC_READ(host, reg) \
-   __raw_readl((host)->base + OMAP_HSMMC_##reg)
+   __raw_readl((host)->base + OMAP_HSMMC_##reg + host->reg_offset)
 
 #define OMAP_HSMMC_WRITE(host, reg, val) \
-   __raw_writel((val), (host)->base + OMAP_HSMMC_##reg)
+   __raw_writel((val), (host)->base + OMAP_HSMMC_##reg + host->reg_offset)
 
 struct omap_hsmmc_next {
unsigned intdma_len;
@@ -184,6 +184,7 @@ struct omap_hsmmc_host {
void__iomem *base;
resource_size_t mapbase;
spinlock_t  irq_lock; /* Prevent races with irq handler */
+   unsigned intreg_offset;
unsigned intdma_len;
unsigned intdma_sg_idx;
unsigned char   bus_mode;
@@ -1354,8 +1355,8 @@ static int omap_hsmmc_setup_dma_transfer(struct omap_hsmmc_host *host,
 
chan = omap_hsmmc_get_dma_chan(host, data);
 
-   cfg.src_addr = host->mapbase + OMAP_HSMMC_DATA;
-   cfg.dst_addr = host->mapbase + OMAP_HSMMC_DATA;
+   cfg.src_addr = host->mapbase + OMAP_HSMMC_DATA + host->reg_offset;
+   cfg.dst_addr = host->mapbase + OMAP_HSMMC_DATA + host->reg_offset;
cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
cfg.src_maxburst = data->blksz / 4;
@@ -1903,8 +1904,9 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
host->dma_ch= -1;
host->irq   = irq;
host->slot_id   = 0;
-   host->mapbase   = res->start + pdata->reg_offset;
-   host->base  = ioremap(host->mapbase, SZ_4K);
+   host->reg_offset = pdata->reg_offset;
+   host->mapbase   = res->start;
+   host->base  = ioremap(res->start, resource_size(res));
host->power_mode = MMC_POWER_OFF;
host->next_data.cookie = 1;
host->pbias_enabled = 0;
-- 
1.9.1.286.g5172cb3



[PATCH 1/5] mmc: host: omap_hsmmc: pass host as an argument

2014-03-26 Thread Felipe Balbi
This patch is in preparation for a larger series
of cleanups on the omap_hsmmc.c driver.

In newer instances of this IP, there's a lot of
configuration details which we can grab by reading
some new registers which were prepended to the
address space.

Signed-off-by: Felipe Balbi 
---
 drivers/mmc/host/omap_hsmmc.c | 198 +-
 1 file changed, 99 insertions(+), 99 deletions(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index e91ee21..a8f1e08 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -151,11 +151,11 @@
 /*
  * MMC Host controller read/write API's
  */
-#define OMAP_HSMMC_READ(base, reg) \
-   __raw_readl((base) + OMAP_HSMMC_##reg)
+#define OMAP_HSMMC_READ(host, reg) \
+   __raw_readl((host)->base + OMAP_HSMMC_##reg)
 
-#define OMAP_HSMMC_WRITE(base, reg, val) \
-   __raw_writel((val), (base) + OMAP_HSMMC_##reg)
+#define OMAP_HSMMC_WRITE(host, reg, val) \
+   __raw_writel((val), (host)->base + OMAP_HSMMC_##reg)
 
 struct omap_hsmmc_next {
unsigned intdma_len;
@@ -492,8 +492,8 @@ static void omap_hsmmc_gpio_free(struct omap_mmc_platform_data *pdata)
  */
 static void omap_hsmmc_start_clock(struct omap_hsmmc_host *host)
 {
-   OMAP_HSMMC_WRITE(host->base, SYSCTL,
-   OMAP_HSMMC_READ(host->base, SYSCTL) | CEN);
+   OMAP_HSMMC_WRITE(host, SYSCTL,
+   OMAP_HSMMC_READ(host, SYSCTL) | CEN);
 }
 
 /*
@@ -501,9 +501,9 @@ static void omap_hsmmc_start_clock(struct omap_hsmmc_host *host)
  */
 static void omap_hsmmc_stop_clock(struct omap_hsmmc_host *host)
 {
-   OMAP_HSMMC_WRITE(host->base, SYSCTL,
-   OMAP_HSMMC_READ(host->base, SYSCTL) & ~CEN);
-   if ((OMAP_HSMMC_READ(host->base, SYSCTL) & CEN) != 0x0)
+   OMAP_HSMMC_WRITE(host, SYSCTL,
+   OMAP_HSMMC_READ(host, SYSCTL) & ~CEN);
+   if ((OMAP_HSMMC_READ(host, SYSCTL) & CEN) != 0x0)
dev_dbg(mmc_dev(host->mmc), "MMC Clock is not stopped\n");
 }
 
@@ -521,16 +521,16 @@ static void omap_hsmmc_enable_irq(struct omap_hsmmc_host *host,
if (cmd->opcode == MMC_ERASE)
irq_mask &= ~DTO_EN;
 
-   OMAP_HSMMC_WRITE(host->base, STAT, STAT_CLEAR);
-   OMAP_HSMMC_WRITE(host->base, ISE, irq_mask);
-   OMAP_HSMMC_WRITE(host->base, IE, irq_mask);
+   OMAP_HSMMC_WRITE(host, STAT, STAT_CLEAR);
+   OMAP_HSMMC_WRITE(host, ISE, irq_mask);
+   OMAP_HSMMC_WRITE(host, IE, irq_mask);
 }
 
 static void omap_hsmmc_disable_irq(struct omap_hsmmc_host *host)
 {
-   OMAP_HSMMC_WRITE(host->base, ISE, 0);
-   OMAP_HSMMC_WRITE(host->base, IE, 0);
-   OMAP_HSMMC_WRITE(host->base, STAT, STAT_CLEAR);
+   OMAP_HSMMC_WRITE(host, ISE, 0);
+   OMAP_HSMMC_WRITE(host, IE, 0);
+   OMAP_HSMMC_WRITE(host, STAT, STAT_CLEAR);
 }
 
 /* Calculate divisor for the given clock frequency */
@@ -558,17 +558,17 @@ static void omap_hsmmc_set_clock(struct omap_hsmmc_host *host)
 
omap_hsmmc_stop_clock(host);
 
-   regval = OMAP_HSMMC_READ(host->base, SYSCTL);
+   regval = OMAP_HSMMC_READ(host, SYSCTL);
regval = regval & ~(CLKD_MASK | DTO_MASK);
clkdiv = calc_divisor(host, ios);
regval = regval | (clkdiv << 6) | (DTO << 16);
-   OMAP_HSMMC_WRITE(host->base, SYSCTL, regval);
-   OMAP_HSMMC_WRITE(host->base, SYSCTL,
-   OMAP_HSMMC_READ(host->base, SYSCTL) | ICE);
+   OMAP_HSMMC_WRITE(host, SYSCTL, regval);
+   OMAP_HSMMC_WRITE(host, SYSCTL,
+   OMAP_HSMMC_READ(host, SYSCTL) | ICE);
 
/* Wait till the ICS bit is set */
timeout = jiffies + msecs_to_jiffies(MMC_TIMEOUT_MS);
-   while ((OMAP_HSMMC_READ(host->base, SYSCTL) & ICS) != ICS
+   while ((OMAP_HSMMC_READ(host, SYSCTL) & ICS) != ICS
&& time_before(jiffies, timeout))
cpu_relax();
 
@@ -583,14 +583,14 @@ static void omap_hsmmc_set_clock(struct omap_hsmmc_host *host)
 */
if ((mmc_slot(host).features & HSMMC_HAS_HSPE_SUPPORT) &&
(ios->timing != MMC_TIMING_UHS_DDR50) &&
-   ((OMAP_HSMMC_READ(host->base, CAPA) & HSS) == HSS)) {
-   regval = OMAP_HSMMC_READ(host->base, HCTL);
+   ((OMAP_HSMMC_READ(host, CAPA) & HSS) == HSS)) {
+   regval = OMAP_HSMMC_READ(host, HCTL);
if (clkdiv && (clk_get_rate(host->fclk)/clkdiv) > 2500)
regval |= HSPE;
else
regval &= ~HSPE;
 
-   OMAP_HSMMC_WRITE(host->base, HCTL, regval);
+   OMAP_HSMMC_WRITE(host, HCTL, regval);
}
 
omap_hsmmc_start_clock(host);
@@ -601,24 +601,24 @@ static void omap_hsmmc_set_bus_width(struct omap_hsmmc_host *host)
struct mmc_ios *ios = &host->mmc->ios;
u32 con;
 
-   con = OMAP_HSMMC_READ(host->base, CON);
+   con = OMAP_HSMMC_READ(host, CON);
if (ios->timing == MMC_T
