Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 11:14 AM, Thomas Gleixner wrote:
> On Mon, 29 Feb 2016, Peter Hurley wrote:
>> On 02/29/2016 10:24 AM, Eric Dumazet wrote:
>>>> Just to be clear
>>>>
>>>>if (time_before(jiffies, end) && !need_resched() &&
>>>>--max_restart)
>>>>goto restart;
>>>>
>>>> aborts softirq *even if 0ns have elapsed*, if NET_RX has woken a process.
>>>
>>> Sure, now remove the 1st and 2nd condition.
>>
>> Well just removing the 2nd condition has everything working fine,
>> because that fixes the priority inversion.
> 
> No. It does not fix anything. It hides the shortcomings of the driver.
>  
>> However, when system resources are _not_ contended, it makes no
>> sense to be forced to revert to ksoftirqd resolution, which is strictly
>> intended as fallback.
> 
> No. You claim it is simply because your driver does not handle that situation
> properly.
>  
>> Or flipping your argument on its head, why not just _always_ execute
>> softirq in ksoftirqd?
> 
> Which is what that change effectivley does. And that makes a lot of sense,
> because you get the softirq load under scheduler control and do not let the
> softirq run as a context stealing entity which is completely uncontrollable by
> the scheduler.

Ok, fair enough.

However, charging [in the scheduler sense] very lightweight DMA completion for
one subsystem collectively with very heavyweight NET_RX (doing garbage 
collection
in softirq!) is hardly ideal.

The alternative being threaded interrupt handlers (which are essentially treated
as 0.00 scheduler cost).

I just want to make sure that's the conscious choice being made, when the
patches for converting from tasklet to threaded irq start hitting subsystem
maintainers.

Regards,
Peter Hurley




Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 07:27 AM, Eric Dumazet wrote:
> On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote:
> 
>> The reason why Eric's change is so effective for Eric's workload is
>> that it fixes the problem where NET_RX keeps getting new network packets
>> so it keeps looping, servicing more NET_RX softirq.
> 
> You have very little idea of what is happening in networking land.

While that is true, I can read a trace:

  ** already in NET_RX softirq **

  -0   0..s2   15us : kmem_cache_alloc: call_site=c08378e4 
ptr=de55d7c0 bytes_req=192 bytes_alloc=192 gfp_flags=GFP_ATOMIC
  -0   0..s2   23us : netif_receive_skb_entry: dev=eth0 napi_id=0x0 
queue_mapping=0 skbaddr=dca04400 vlan_tagged=0 vlan_proto=0x vlan_tci=0x000
0 protocol=0x0800 ip_summed=0 hash=0x l4_hash=0 len=88 data_len=0 
truesize=1984 mac_header_valid=1 mac_header=-14 nr_frags=0 gso_size=0 
gso_type=0x0
  -0   0..s2   30us+: netif_receive_skb: dev=eth0 skbaddr=dca04400 
len=88
  -0   0d.s5   98us : sched_waking: comm=sshd pid=750 prio=120 
target_cpu=000
  -0   0d.s6  105us : sched_stat_sleep: comm=sshd pid=750 
delay=3125230447 [ns]
  -0   0dns6  110us+: sched_wakeup: comm=sshd pid=750 prio=120 
target_cpu=000
  -0   0dns4  123us+: timer_start: timer=dc940e9c 
function=tcp_delack_timer expires=9746 [timeout=10] flags=0x
  -0   0dnH3  150us : irq_handler_entry: irq=176 
name=4a10.ethernet
  -0   0dnH3  153us : softirq_raise: vec=3 [action=NET_RX]
  -0   0dnH3  155us : irq_handler_exit: irq=176 ret=handled
  -0   0dnH3  160us : irq_handler_entry: irq=20 
name=4900.edma_ccint
  -0   0dnH3  163us : irq_handler_exit: irq=20 ret=handled
  -0   0.ns2  169us : napi_poll: napi poll on napi struct de465c30 
for device eth0
  -0   0.ns2  171us : softirq_exit: vec=3 [action=NET_RX]


As you can see, NET_RX softirq is re-raised while in NET_RX softirq,
as a result of receiving new packets. So NET_RX will keep looping,
which is what I wrote.


> Once hard irq for RX has triggered, we arm a NAPI (NET_RX softirq), and
> no more irq will come unless the napi handler ran. Then when NAPI is
> complete, we re-allow interrupt to be delivered when a new packet is
> coming.
> 
> Yes, ksoftirqd runs under load, and this is _wanted_.
> 
> Sure, it might add a latency if some high prio task is wanting the same
> cpu, but this is exactly the purpose of having multi tasking.
> 
> 



Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 10:24 AM, Eric Dumazet wrote:
> On lun., 2016-02-29 at 10:05 -0800, Peter Hurley wrote:
> 
>> While I appreciate the attempt, that's not the problem.
>>
>> Just to be clear
>>
>>  if (time_before(jiffies, end) && !need_resched() &&
>>  --max_restart)
>>  goto restart;
>>
>> aborts softirq *even if 0ns have elapsed*, if NET_RX has woken a process.
> 
> 
> Sure, now remove the 1st and 2nd condition.

Well just removing the 2nd condition has everything working fine,
because that fixes the priority inversion.


> You would still 'abort' (ie wakeup ksoftirqd really) when --max_restart
> becomes 0

Sure. Which would mean there's contended heavy i/o load so the driver
has to fallback to non-DMA. That's an acceptable outcome.


> So, instead of some subtle load dependent bug, you know have a reliable
> trigger.

There's no "subtle load dependent bug" here.

The driver has a fallback mode of operation that it relies on without
DMA. Of course, as I already wrote, this has consequences.

If system resources are _actually contended_, then naturally, fighting
for cpu and i/o time is fine, and I'm happy to do that in ksoftirqd.

However, when system resources are _not_ contended, it makes no
sense to be forced to revert to ksoftirqd resolution, which is strictly
intended as fallback.

Or flipping your argument on its head, why not just _always_ execute
softirq in ksoftirqd?



Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 08:21 AM, Eric Dumazet wrote:
> On lun., 2016-02-29 at 07:54 -0800, Peter Hurley wrote:
> 
>>  The current kernel is HZ=250 but this would occur on HZ=1000 as well.
> 
> Right. But the problem with HZ=100 and HZ=250 is that the detection can
> happens because jiffy granularity is too coarse, since 
> 
> msecs_to_jiffies(2) -> 1
> 
> Following patch might reduce the probability, but wont really fix your
> problem.
> 
> Fact that ksoftirqd prio is not what you want is completely orthogonal.
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 479e443..f7cc594 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -180,7 +180,7 @@ EXPORT_SYMBOL(__local_bh_enable_ip);
>  
>  /*
>   * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
> - * but break the loop if need_resched() is set or after 2 ms.
> + * but break the loop if need_resched() is set or after 2 ms/ticks.
>   * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
>   * certain cases, such as stop_machine(), jiffies may cease to
>   * increment and so we need the MAX_SOFTIRQ_RESTART limit as
> @@ -191,7 +191,7 @@ EXPORT_SYMBOL(__local_bh_enable_ip);
>   * we want to handle softirqs as soon as possible, but they
>   * should not be able to lock up the box.
>   */
> -#define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
> +#define MAX_SOFTIRQ_TIME  (1 + msecs_to_jiffies(2))
>  #define MAX_SOFTIRQ_RESTART 10
>  
>  #ifdef CONFIG_TRACE_IRQFLAGS

While I appreciate the attempt, that's not the problem.

Just to be clear

if (time_before(jiffies, end) && !need_resched() &&
--max_restart)
goto restart;

aborts softirq *even if 0ns have elapsed*, if NET_RX has woken a process.






Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 07:19 AM, Eric Dumazet wrote:
> On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote:
> 
>> Not the case. The softirq is raised from interrupt.
>>
>> Before Eric's change, when an interrupt raises a new softirq
>> while processing another softirq, the new softirq is immediately
>> processed *after the existing softirq completes*.
>>
>> After Eric's change, when an interrupt raises a new softirq
>> while processing another softirq and _that softirq wakes a process_,
>> the new softirq is *deferred to normal process priority*.
> 
> For the last time, this is not true.
> 
> My patch changed the probability for this to happen.

There is a huge difference between
1. heavy i/o load forcing ksoftirqd to battle out i/o with regular
   sched processes *as a fallback to avoid 100% softirq* and
2. always deferring new softirq just because a process was woken


> It will happen even if you revert it.

I think there is a happy medium where finer constraints on
softirq looping will get us both what we want.

For example, an accumulating mask of softirq already run would
keep one softirq level from looping over-and-over. Or a per-softirq
limiting counter. Or relying on the hard limit that was added later
of a fixed number of softirq loops. Or a combination of those.


> linux never claimed that softirq could steal all cpu time.

That's not the problem observed here.

In fact, what your patch triggers is exactly the opposite:
although cpu load is initially very light because DMA is used to perform
device i/o, once DMA is not being serviced in a timely manner, the
driver fallbacks to purely interrupt-driven i/o which dramatically
increases the real cpu load at those line rates.

> Are by any chance still running a HZ=100 kernel ?

The current kernel is HZ=250 but this would occur on HZ=1000 as well.

Regards,
Peter Hurley



Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 07:40 AM, Mike Galbraith wrote:
> On Mon, 2016-02-29 at 07:03 -0800, Peter Hurley wrote:
> 
>>> If I'm listening properly, the root cause is that there is a timing
>>> constraint involved, which is being exposed because one softirq raises
>>> another (ew).
>>
>> Not the case. The softirq is raised from interrupt.
> 
> Yeah, saw that on re-read.
> 
>> Before Eric's change, when an interrupt raises a new softirq
>> while processing another softirq, the new softirq is immediately
>> processed *after the existing softirq completes*.
> 
> Not necessarily, Eric only changed it from an arbitrary count to an
> arbitrary time, so your irq could just as well land when there's no
> count left and be up the same creek.

Your misreading the softirq abort logic:
neither 2ms nor a fixed number of loops has elapsed.

All that's happened is the first loop of NET_RX softirq has woken a
process; that is sufficient to abort softirq and defer it for ksoftirqd.

That's why I'm saying this is a priority inversion, and one that
will happen a lot.


> I was more infatuated by the constraint that's left dangling in the
> breeze any time processing is deferred to ksoftirqd.
> 
>   -Mike
> 



Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/28/2016 08:58 PM, Mike Galbraith wrote:
> On Sun, 2016-02-28 at 18:01 +0100, Francois Romieu wrote:
>> Mike Galbraith  :
>> [...]
>>> Hrm, relatively new + tasklet woes rings a bell.  Ah, that..
>>>
>>>
>>> What's worse is that at the point where this code was written it was
>>> already well known that tasklets are a steaming pile of crap and
>>> should die.
>>>
>>>
>>> Source thereof https://lwn.net/Articles/588457/

Thanks but not applicable. tglx's POV has everything to do with the
tasklet interface and not the general concept of bottom-half interrupt
processing in a timely manner. In any event, the problem created by
Eric's change is not restricted to tasklets, but rather applies to
all softirq.


>> tasklets are ingrained in the dmaengine API (see 
>> Documentation/dmaengine/client.txt
>> and drivers/dma/virt-dma.h::vchan_cookie_complete).
>>
>> Moving everything to irq context or handling his own sub-{jiffy/ms} timer
>> while losing async dma doesn't exactly smell like roses either. :o(
> 
> https://lwn.net/Articles/239633/
> 
> If I'm listening properly, the root cause is that there is a timing
> constraint involved, which is being exposed because one softirq raises
> another (ew).

Not the case. The softirq is raised from interrupt.

Before Eric's change, when an interrupt raises a new softirq
while processing another softirq, the new softirq is immediately
processed *after the existing softirq completes*.

After Eric's change, when an interrupt raises a new softirq
while processing another softirq and _that softirq wakes a process_,
the new softirq is *deferred to normal process priority*.
This happens even if the new softirq is higher priority than the
one currently running, which is flat-out wrong.

The reason this happens repeatedly and regularly is because
1. The time window while NET_RX softirq is running is big.
2. NET_RX softirq will almost always wake a process for a received packet.

The reason why Eric's change is so effective for Eric's workload is
that it fixes the problem where NET_RX keeps getting new network packets
so it keeps looping, servicing more NET_RX softirq.

However, I'm pointing out that Eric's sledgehammer approach to fixing
the NET_RX softirq bug is having significant side-effects in other
subsystems.


> Processing timeout happens, freshly raised tasklet
> wanders off to SCHED_NORMAL kthread context where its constraint dies.
> 
> Given the dma stuff apparently works fine in -rt (or did, see below),
> timing constraints can't be super tight, so perhaps we could grow
> realtime workqueue support for the truly deserving.  The tricky bit
> would be being keeping everybody and his brother from abusing it.
> 
> WRT -rt: if dma tasklets really do have hard (ish) constraints, -rt
> recently "broke" in the same way.. of all softirqs which are deferred
> to kthread context, due to a recent change, only timer/hrtimer are
> executed at realtime priority by default.
> 
>   -Mike
> 



Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
On 02/27/2016 05:59 PM, Eric Dumazet wrote:
> On sam., 2016-02-27 at 15:33 -0800, Peter Hurley wrote:
>> On 02/27/2016 03:04 PM, David Miller wrote:
>>> From: Peter Hurley <pe...@hurleysoftware.com>
>>> Date: Sat, 27 Feb 2016 12:29:39 -0800
>>>
>>>> Not really. softirq raised from interrupt context will always execute
>>>> on this cpu and not in ksoftirqd, unless load forces softirq loop abort.
>>>
>>> That guarantee never was specified.
>>
>> ??
>>
>> Neither is running network socket servers at normal priority as if they're
>> higher priority than softirq.
>>
>>
>>> Or are you saying that by design, on a system under load, your UART
>>> will not function properly?
>>>
>>> Surely you don't mean that.
>>
>> No, that's not what I mean.
>>
>> What I mean is that bypassing the entire SOFTIRQ priority so that
>> sshd can process one network packet makes a mockery of the point of softirq.
>>
>> This hack to workaround NET_RX looping over-and-over-and-over affects every
>> subsystem, not just one uart.
>>
>> HI, TIMER, BLOCK; all of these are skipped: that's straight-up, a bug.
> 
> No idea what you talk about.
> 
> All pending softirq interrupts are processed. _Nothing_ is skipped.

An interrupt that schedules HI softirq while in NET_RX softirq should
still run the HI softirq. But with your patch that won't happen.


> Really, your system stability seems to depend on a completely
> undocumented behavior of linux kernels before linux-3.8
> 
> If I understood, you expect that a tasklet activated from a softirq
> handler is run from the same  __do_softirq() loop. This never has been
> the case.

No.

The *interrupt handler* for DMA goes off while NET_RX softirq is running.
That's what schedules the *DMA tasklet*.

That tasklet should run before any process.

But it doesn't because your patch bails out early from softirq.


> My change simply triggers the bug in your driver earlier. As David
> pointed out, your bug should trigger the same on a loaded machine, even
> if you revert my patch.
> 
> I honestly do not know why you arm a tasklet from NET_RX, why don't you
> simply process this directly, so that you do not rely on some scheduler
> decision ?




Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
On 02/27/2016 03:04 PM, David Miller wrote:
> From: Peter Hurley <pe...@hurleysoftware.com>
> Date: Sat, 27 Feb 2016 12:29:39 -0800
> 
>> Not really. softirq raised from interrupt context will always execute
>> on this cpu and not in ksoftirqd, unless load forces softirq loop abort.
> 
> That guarantee never was specified.

??

Neither is running network socket servers at normal priority as if they're
higher priority than softirq.


> Or are you saying that by design, on a system under load, your UART
> will not function properly?
> 
> Surely you don't mean that.

No, that's not what I mean.

What I mean is that bypassing the entire SOFTIRQ priority so that
sshd can process one network packet makes a mockery of the point of softirq.

This hack to workaround NET_RX looping over-and-over-and-over affects every
subsystem, not just one uart.

HI, TIMER, BLOCK; all of these are skipped: that's straight-up, a bug.

Regards,
Peter Hurley


Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
On 02/27/2016 12:13 PM, Eric Dumazet wrote:
> On sam., 2016-02-27 at 10:19 -0800, Peter Hurley wrote:
>> Hi Eric,
>>
>> For a while now, we've been struggling to understand why we've been
>> observing missed uart rx DMA.
>>
>> Because both the uart driver (omap8250) and the dmaengine driver
>> (edma) were (relatively) new, we assumed there was some race between
>> starting a new rx DMA and processing the previous one.
>>
>> However, after instrumenting both the uart driver and the dmaengine
>> driver, what we've observed is huge anomalous latencies between receiving
>> the DMA interrupt and servicing the DMA tasklet.
>>
>> For example, at 3Mbaud we recorded the following distribution of
>> softirq[TASKLET] service latency for this specific DMA channel:
>>
>> root@black:/sys/kernel/debug/edma# cat 35
>> latency(us):   0+   20+   40+   60+   80+  100+  120+  140+  160+  180+  
>> 200+  220+  240+  260+  280+  300+  320+  340+  360+  380+
>>195681335315 7 4 3 1 0 0 
>> 0 1 4 6 1 0 0 0 0 0
>>
>> As you can see, the vast majority of tasklet service happens immediately,
>> tapering off to 140+us.
>>
>> However, note the island of distribution at 220~300 [latencies beyond 300+
>> are not recorded because the uart fifo has filled again by this point and
>> dma must be aborted].
>>
>> So I cribbed together a latency tracer to catch what was happening at
>> the extreme, and what it caught was a priority inversion stemming from
>> your commit:
>>
>>commit c10d73671ad30f54692f7f69f0e09e75d3a8926a
>>Author: Eric Dumazet <eduma...@google.com>
>>Date:   Thu Jan 10 15:26:34 2013 -0800
>>
>>softirq: reduce latencies
>> 
>>In various network workloads, __do_softirq() latencies can be up
>>to 20 ms if HZ=1000, and 200 ms if HZ=100.
>> 
>>This is because we iterate 10 times in the softirq dispatcher,
>>and some actions can consume a lot of cycles.
>>
>>
>> In the trace below [1], the trace begins in the edma completion interrupt
>> handler when the tasklet is scheduled; the edma interrupt has occurred during
>> NET_RX softirq context.
>>
>> However, instead of causing a restart of the softirq loop to process the
>> tasklet _which occurred before sshd was scheduled_, the softirq loop is
>> aborted and deferred for ksoftirqd. The tasklet is not serviced for 521us,
>> which is way too long, so DMA was aborted.
>>
>> Your patch has effectively inverted the priority of tasklets with normal
>> pri/nice processes that have merely received a network packet.
>>
>> ISTM, the problem you're trying to solve here was caused by NET_RX softirq
>> to begin with, and maybe that thing needs a diet.
>>
>> But rather than outright reverting your patch, what if more selective
>> conditions are used to abort the softirq restart? What would those conditions
>> be? In the netperf benchmark you referred to in that commit, is it just
>> NET_TX/NET_RX softirqs that are causing scheduling latencies?
>>
>> It just doesn't make sense to special case for a workload that isn't
>> even running.
>>
>>
>> Regards,
>> Peter Hurley
>>
>>
>> [1] softirq tasklet latency trace  (apologies that it's only events - full
>> function trace introduces too much overhead)
>>
>> # tracer: latency
>> #
>> # latency latency trace v1.1.5 on 4.5.0-rc2+
>> # 
>> # latency: 476 us, #59/59, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0)
>> #-
>> #| task: sshd-750 (uid:1000 nice:0 policy:0 rt_prio:0)
>> #-
>> #  => started at: __tasklet_schedule  
>> #  => ended at:   tasklet_action
>> #
>> #
>> #  _--=> CPU#
>> # / _-=> irqs-off 
>> #| / _=> need-resched
>> #|| / _---=> hardirq/softirq
>> #||| / _--=> preempt-depth
>> # / delay
>> #  cmd pid   | time  |   caller
>> # \   /  |  \|   /
>>   -0   0d.H31us : __tasklet_schedule
>>   -0   0d.H43us : softirq_raise: vec=6 [action=TASKLET]
>>   -0   0d.H36us : irq_handler_exit: irq=20 ret=handled
>>   -0   0..s2   15us : kmem_cache_alloc: call_site=c08378e4 
>> ptr=de55d7c0 bytes_req=19

Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
Hi Eric,

For a while now, we've been struggling to understand why we've been
observing missed uart rx DMA.

Because both the uart driver (omap8250) and the dmaengine driver
(edma) were (relatively) new, we assumed there was some race between
starting a new rx DMA and processing the previous one.

However, after instrumenting both the uart driver and the dmaengine
driver, what we've observed is huge anomalous latencies between receiving
the DMA interrupt and servicing the DMA tasklet.

For example, at 3Mbaud we recorded the following distribution of
softirq[TASKLET] service latency for this specific DMA channel:

root@black:/sys/kernel/debug/edma# cat 35
latency(us):   0+   20+   40+   60+   80+  100+  120+  140+  160+  180+  200+  
220+  240+  260+  280+  300+  320+  340+  360+  380+
   195681335315 7 4 3 1 0 0 0   
  1 4 6 1 0 0 0 0 0

As you can see, the vast majority of tasklet service happens immediately,
tapering off to 140+us.

However, note the island of distribution at 220~300 [latencies beyond 300+
are not recorded because the uart fifo has filled again by this point and
dma must be aborted].

So I cribbed together a latency tracer to catch what was happening at
the extreme, and what it caught was a priority inversion stemming from
your commit:

   commit c10d73671ad30f54692f7f69f0e09e75d3a8926a
   Author: Eric Dumazet <eduma...@google.com>
   Date:   Thu Jan 10 15:26:34 2013 -0800

   softirq: reduce latencies

   In various network workloads, __do_softirq() latencies can be up
   to 20 ms if HZ=1000, and 200 ms if HZ=100.

   This is because we iterate 10 times in the softirq dispatcher,
   and some actions can consume a lot of cycles.


In the trace below [1], the trace begins in the edma completion interrupt
handler when the tasklet is scheduled; the edma interrupt has occurred during
NET_RX softirq context.

However, instead of causing a restart of the softirq loop to process the
tasklet _which occurred before sshd was scheduled_, the softirq loop is
aborted and deferred for ksoftirqd. The tasklet is not serviced for 521us,
which is way too long, so DMA was aborted.

Your patch has effectively inverted the priority of tasklets with normal
pri/nice processes that have merely received a network packet.

ISTM, the problem you're trying to solve here was caused by NET_RX softirq
to begin with, and maybe that thing needs a diet.

But rather than outright reverting your patch, what if more selective
conditions are used to abort the softirq restart? What would those conditions
be? In the netperf benchmark you referred to in that commit, is it just
NET_TX/NET_RX softirqs that are causing scheduling latencies?

It just doesn't make sense to special case for a workload that isn't
even running.


Regards,
Peter Hurley


[1] softirq tasklet latency trace  (apologies that it's only events - full
function trace introduces too much overhead)

# tracer: latency
#
# latency latency trace v1.1.5 on 4.5.0-rc2+
# 
# latency: 476 us, #59/59, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0)
#-
#| task: sshd-750 (uid:1000 nice:0 policy:0 rt_prio:0)
#-
#  => started at: __tasklet_schedule  
#  => ended at:   tasklet_action
#
#
#  _--=> CPU#
# / _-=> irqs-off 
#| / _=> need-resched
#|| / _---=> hardirq/softirq
#||| / _--=> preempt-depth
# / delay
#  cmd pid   | time  |   caller
# \   /  |  \|   /
  -0   0d.H31us : __tasklet_schedule
  -0   0d.H43us : softirq_raise: vec=6 [action=TASKLET]
  -0   0d.H36us : irq_handler_exit: irq=20 ret=handled
  -0   0..s2   15us : kmem_cache_alloc: call_site=c08378e4 
ptr=de55d7c0 bytes_req=192 bytes_alloc=192 gfp_flags=GFP_ATOMIC
  -0   0..s2   23us : netif_receive_skb_entry: dev=eth0 napi_id=0x0 
queue_mapping=0 skbaddr=dca04400 vlan_tagged=0 vlan_proto=0x vlan_tci=0x000
0 protocol=0x0800 ip_summed=0 hash=0x l4_hash=0 len=88 data_len=0 
truesize=1984 mac_header_valid=1 mac_header=-14 nr_frags=0 gso_size=0 
gso_type=0x0
  -0   0..s2   30us+: netif_receive_skb: dev=eth0 skbaddr=dca04400 
len=88
  -0   0d.s5   98us : sched_waking: comm=sshd pid=750 prio=120 
target_cpu=000
  -0   0d.s6  105us : sched_stat_sleep: comm=sshd pid=750 
delay=3125230447 [ns]
  -0   0dns6  110us+: sched_wakeup: comm=sshd pid=750 prio=120 
target_cpu=000
  -0   0dns4  123us+: timer_start: timer=dc940e9c 
function=tcp_delack_timer expires=9746 [timeout=10] flags=0x
  -0   0dnH3  150us : irq_handler_entry: irq=176 
name=4a10.ethernet
  -0   0dnH3  153us : softirq_raise: vec=3 [action=NET_RX]
  -0   0dnH3  155us : irq_handler_exit: irq=176 ret=hand

Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-20 Thread Peter Hurley
Hi Jacob,

On 01/05/2016 06:34 AM, Jacob Siverskog wrote:
> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
>> On Tue, 2016-01-05 at 12:07 +0100, Jacob Siverskog wrote:
>>> On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
>>>> On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
>>>>> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang <xiyou.wangc...@gmail.com> 
>>>>> wrote:
>>>>>> On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
>>>>>> <jacob@teenage.engineering> wrote:
>>>>>>> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet <eduma...@google.com> 
>>>>>>> wrote:
>>>>>>>> How often can you trigger this bug ?
>>>>>>>
>>>>>>> Ok. I don't have a good repro to trigger it unfortunately, I've seen it 
>>>>>>> just a
>>>>>>> few times when bringing up/down network interfaces. Does the trace
>>>>>>> give any clue?
>>>>>>>
>>>>>>
>>>>>> A little bit. You need to help people to narrow down the problem
>>>>>> because there are too many places using skb->next and skb->prev.
>>>>>>
>>>>>> Since you mentioned it seems related to network interface flip,
>>>>>> what network interfaces are you using? What's is your TC setup?
>>>>>>
>>>>>> Thanks.
>>>>>
>>>>> The system contains only one physical network interface (TI WL1837,
>>>>> wl18xx module).
>>>>> The state prior to the crash was as follows:
>>>>> - One virtual network interface active (as STA, associated with access 
>>>>> point)
>>>>> - Bluetooth (BLE only) active (same physical chip, co-existence,
>>>>> btwilink/st_drv modules)
>>>>>
>>>>> Actions made around the time of the crash:
>>>>> - Bluetooth disabled
>>>>> - One additional virtual network interface brought up (also as STA)
>>>>>
>>>>> I believe the crash occurred between these two actions. I just saw
>>>>> that there are some interesting events in the log prior to the crash:
>>>>> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>>>>> kernel: (stc):  proto stack 4's ->recv failed
>>>>> kernel: (stc): remove_channel_from_table: id 3
>>>>> kernel: (stc): remove_channel_from_table: id 2
>>>>> kernel: (stc): remove_channel_from_table: id 4
>>>>> kernel: (stc):  all chnl_ids unregistered
>>>>> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>>>>>
>>>>> The first print is from btwilink.c. However, I can't see the
>>>>> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>>>>> 6LoWPAN or anything similar).
>>>>>
>>>>> Thanks, Jacob
>>>>
>>>> Definitely these details are useful ;)
>>>>
>>>> Could you try :
>>>>
>>>> diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
>>>> index 6e3af8b42cdd..0c99a74fb895 100644
>>>> --- a/drivers/misc/ti-st/st_core.c
>>>> +++ b/drivers/misc/ti-st/st_core.c
>>>> @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
>>>> skb_queue_purge(_gdata->txq);
>>>> skb_queue_purge(_gdata->tx_waitq);
>>>> kfree_skb(st_gdata->rx_skb);
>>>> +   st_gdata->rx_skb = NULL;
>>>> kfree_skb(st_gdata->tx_skb);
>>>> +   st_gdata->tx_skb = NULL;
>>>> /* TTY ldisc cleanup */
>>>> err = tty_unregister_ldisc(N_TI_WL);
>>>> if (err)

FWIW,

You don't need that ti-st junk to get the WL1837 working; the WL1837 only
has BT channels. Unfortunately, that's really all I can say about it; sorry.

Regards,
Peter Hurley




Re: [PATCH v2 0/4] Replace tty->closing

2015-12-13 Thread Peter Hurley
Greg,

Would you drop these 4 patches from tty-testing please?


On 11/09/2015 04:15 AM, Peter Hurley wrote:
> Hi Greg,
> 
> This series cleans up a messy and poorly documented mechanism required
> at tty final close to prevent drivers from crashing after h/w shutdown.
> 
> Without special handling, N_TTY echoing will cause driver i/o requests
> _after_ h/w shutdown, which typically crashes the driver. Currently,
> the tty_struct::closing flag triggers this special handling. However,
> this mechanism is error-prone and subject to driver misuse.
> 
> This series replaces tty->closing with a ldisc-specific interface,
> tty_ldisc_closing(), and implements this interface for N_TTY.
> For tty drivers which use tty_port_close_start(), this change eliminates
> the last vestige of direct driver<->ldisc interaction. The few tty drivers
> which open-code the close() method [1] still use tty_ldisc_closing() directly.
> 
> The tty driver is aware final close for the tty has commenced because the
> tty->count == 1 in the close() method. On final close, the following is
> also true:
> 1. port->count == 1. tty drivers which ref count the port, use the
>--port->count == 0 as a substitute condition for final close.

I overlooked the implications of a blocked open here.

Since a blocked open drops the port count while blocking, a port count of 1
is not equivalent to a tty count of 1 at tty_release() time. So even though
the port is shutting down, the extra tty count held by the blocking open
will keep the ldisc instance from being destructed; iow, a port count of 1
does _not_ imply final tty close.

For the blocked open itself, this would not be a problem because the open
will error out anyway, drop the tty count and cause final close. (And, oddly,
trigger a second port shutdown which I need to consider the implications of.)

However, there is a very small race window where a third process might be
able to successfully reopen the tty, but now would have a dead-stick
ldisc instance (because this series does not reset the ldisc instance back
to the not closing state).

Regards,
Peter Hurley


> 2. final close is occurring as a result of the last in-use file descriptor
>release. Consequently, there will be no read/poll/ioctl in-progress.
> 3. the line discipline instance will be stopped and destroyed immediately
>after the tty driver completes the close() method
> 4. the tty itself will be unrefed immediately after the line discipline
>instance is destroyed.
> 
> Thus, the ldisc and tty buffers need only be flushed once, as any data
> received by the tty driver after the flush but before h/w shutdown will
> be deleted when the line discipline instance is destroyed.
> No new echoes will occur after the ldisc flush because the echo buffer
> is also flushed and new input (which otherwise might generate echoes) is
> ignored while closing. This series removes the extra tty_ldisc_flush()
> being performed by most drivers after h/w shutdown.
> 
> Additionally, the ldisc closing state need not be reset since the line
> discipline instance is being destroyed, so no interface is provided to reset
> closing.
> 
> 
> [1] tty drivers which open-code the close() method
> drivers/staging/dgnc/dgnc_tty.c
> drivers/staging/dgap/dgap.c
> drivers/tty/hvc/hvsi.c
> drivers/tty/serial/68328serial.c
> drivers/tty/serial/crisv10.c
> drivers/isdn/i4l/isdn_tty.c
> drivers/s390/char/con3215.c
> 
> Changes to v2:
> * Fixed tty_ldisc_closing() ld use found by Johannes Stezenbach 
> <j...@sig21.net>
> 
> 
> Regards,
> 
> Peter Hurley (4):
>   tty: rocket: Remove private close_wait
>   n_tty: Ignore all read data when closing
>   tty: Abstract and encapsulate tty->closing behavior
>   tty: Remove drivers' extra tty_ldisc_flush()
> 
>  drivers/char/pcmcia/synclink_cs.c |  3 ---
>  drivers/isdn/i4l/isdn_tty.c   |  2 +-
>  drivers/s390/char/con3215.c   |  3 +--
>  drivers/staging/dgap/dgap.c   |  4 +---
>  drivers/staging/dgnc/dgnc_tty.c   |  4 +---
>  drivers/tty/amiserial.c   |  2 --
>  drivers/tty/hvc/hvsi.c|  2 +-
>  drivers/tty/ipwireless/tty.c  |  1 -
>  drivers/tty/n_tty.c   | 15 +++
>  drivers/tty/rocket.c  |  5 -
>  drivers/tty/rocket_int.h  |  1 -
>  drivers/tty/serial/68328serial.c  |  3 +--
>  drivers/tty/serial/crisv10.c  |  3 +--
>  drivers/tty/serial/serial_core.c  |  3 ---
>  drivers/tty/synclink.c|  1 -
>  drivers/tty/synclink_gt.c |  1 -
>  drivers/tty/synclinkmp.c  |  1 -
>  drivers/tty/tty_ldisc.c   | 20 
>  drivers/tty/tty_port.c|  5 +
>  include/linux/tty.h   |  2 +-
>  include/

Re: [PATCH 2/3] ser_gigaset: fix deallocation of platform device structure

2015-12-10 Thread Peter Hurley
Hi Tilman,

On 12/09/2015 03:10 AM, Tilman Schmidt wrote:
> Am 09.12.2015 um 00:12 schrieb Paul Bolle:
> 
>>> --- a/drivers/isdn/gigaset/ser-gigaset.c
>>> +++ b/drivers/isdn/gigaset/ser-gigaset.c
>>> @@ -370,19 +370,23 @@ static void gigaset_freecshw(struct cardstate
>>> *cs)
>>> tasklet_kill(>write_tasklet);
>>> if (!cs->hw.ser)
>>> return;
>>> -   dev_set_drvdata(>hw.ser->dev.dev, NULL);
>>> platform_device_unregister(>hw.ser->dev);
>>> -   kfree(cs->hw.ser);
>>> -   cs->hw.ser = NULL;
>>>  }
>>>  
>>>  static void gigaset_device_release(struct device *dev)
>>>  {
>>> struct platform_device *pdev = to_platform_device(dev);
>>> +   struct cardstate *cs = dev_get_drvdata(dev);
>>>  
>>> /* adapted from platform_device_release() in
>>> drivers/base/platform.c */
>>> kfree(dev->platform_data);
>>> kfree(pdev->resource);
>>> +
>>> +   if (!cs)
>>> +   return;
>>> +   dev_set_drvdata(dev, NULL);

This is of marginal value and (I think) unnecessary; it implies
the core will use the device after release, which would trigger
many problems if true.


>> dev equals cs->hw.ser->dev.dev, doesn't it?
> 
> Correct.
> 
>> So what does setting
>> cs->hw.ser->dev.dev.driver_data to NULL just before freeing it buy us?
> 
> We're freeing cs->hw.ser, not cs->hw.ser->dev.
> Clearing the reference to cs from the device structure before freeing cs
> guards against possible use-after-free.
> 
>>> +   kfree(cs->hw.ser);
>>> +   cs->hw.ser = NULL;

This pattern is common, and defends against much more common
driver bugs.

Unfortunately, much of the good this pattern is intended to do in finding
use-after-free bugs is undone by explicit tests for null everywhere else.
Not saying that's the case here; rather, generally speaking.

Like the
if (!tty && !tty->ops && )

code.

Better just to let it crash.

Regards,
Peter Hurley


>> I might be missing something, but what does setting this to NULL buy us
>> here?
> 
> Just defensive programming. Guarding against possible use-after-free or
> double-free.
> 
>>
>> (I realize that I'm asking questions to code that isn't actually new but
>> only moved around, but I think that's still an opportunity to have
>> another look at that code.)
> 
> I'm a big fan of one change per patch. If we also want to modify the
> moved code then that should be done in a separate patch. It makes
> bisecting so much easier. Same reason why I separated out patch 3/3. And
> btw same reason why I think patch 1/3 should go in as-is, as an obvious
> fix to commit f34d7a5b, and any concerns about whether those tests are
> useful should be addressed by a separate patch.
> 
> Regards,
> Tilman
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: gigaset: freeing an active object

2015-12-02 Thread Peter Hurley
On 11/30/2015 01:01 PM, Paul Bolle wrote:
> On ma, 2015-11-30 at 00:23 +0100, Paul Bolle wrote:
>> Relevant part of dmesg attached at the end of this message. This
>> should give me (and Tilman too?) an entry to get to bottom of this. 
>> Since this is relevant for anyone with just the ser-gigaset module 
>> installed, I hope to do that soon.
> 
> I'm planning to send something similar to the attached draft to netdev
> in a few days. It fixes the issue on my machine. Sascha, does it fix
> this issue for syzkaller too? 
> 
> Should (something like) this go into stable too?

Definitely for stable since it has a userspace triggerable component.

> Any further comments on that draft are appreciated too, of course.
> 
> 
> Paul Bolle
> --
> [DRAFT] gigaset: don't free() a struct platform_device
> 
> One is not supposed to free() a struct platform_device. Instead one
> should, in the common case, only call platform_device_unregister(). That
> will drop the platform device's reference count. (Actually it's the
> reference count of the embedded kobject that is important here. But for
> users of platform devices that's basically irrelevant.)
> 
> So move struct platform_device dev out of struct ser_cardstate, because
> ser_cardstate is (malloc'ed and) free'd.
> 
> Reported-by: Sasha Levin <sasha.le...@oracle.com>
> Not-yet-signed-off-by: Paul Bolle <pebo...@tiscali.nl>
> ---
>  drivers/isdn/gigaset/ser-gigaset.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/isdn/gigaset/ser-gigaset.c 
> b/drivers/isdn/gigaset/ser-gigaset.c
> index 375be509e95f..f8ffa253496e 100644
> --- a/drivers/isdn/gigaset/ser-gigaset.c
> +++ b/drivers/isdn/gigaset/ser-gigaset.c
> @@ -42,8 +42,9 @@ MODULE_PARM_DESC(cidmode, "stay in CID mode when idle");
>  
>  static struct gigaset_driver *driver;
>  
> +static struct platform_device pdev;
> +
>  struct ser_cardstate {
> - struct platform_device  dev;
>   struct tty_struct   *tty;
>   atomic_trefcnt;
>   struct completion   dead_cmp;
> @@ -370,8 +371,8 @@ static void gigaset_freecshw(struct cardstate *cs)
>   tasklet_kill(>write_tasklet);
>   if (!cs->hw.ser)
>   return;
> - dev_set_drvdata(>hw.ser->dev.dev, NULL);
> - platform_device_unregister(>hw.ser->dev);
> + dev_set_drvdata(, NULL);
> + platform_device_unregister();
>   kfree(cs->hw.ser);

Tilman,

Is there a 1:1 correspondence and lifetime for the embedded platform
device and it's containing memory?

I ask because the typical approach for device teardown is to put the
kfree() in the release method; naturally, that won't work if there
is some other lifetime issue.

Regards,
Peter Hurley


>   cs->hw.ser = NULL;
>  }
> @@ -401,17 +402,17 @@ static int gigaset_initcshw(struct cardstate *cs)
>   }
>   cs->hw.ser = scs;
>  
> - cs->hw.ser->dev.name = GIGASET_MODULENAME;
> - cs->hw.ser->dev.id = cs->minor_index;
> - cs->hw.ser->dev.dev.release = gigaset_device_release;
> - rc = platform_device_register(>hw.ser->dev);
> + pdev.name = GIGASET_MODULENAME;
> + pdev.id = cs->minor_index;
> + pdev.dev.release = gigaset_device_release;
> + rc = platform_device_register();
>   if (rc != 0) {
>   pr_err("error %d registering platform device\n", rc);
>   kfree(cs->hw.ser);
>   cs->hw.ser = NULL;
>   return rc;
>   }
> - dev_set_drvdata(>hw.ser->dev.dev, cs);
> + dev_set_drvdata(, cs);
>  
>   tasklet_init(>write_tasklet,
>gigaset_modem_fill, (unsigned long) cs);
> @@ -520,7 +521,7 @@ gigaset_tty_open(struct tty_struct *tty)
>   goto error;
>   }
>  
> - cs->dev = >hw.ser->dev.dev;
> + cs->dev = 
>   cs->hw.ser->tty = tty;
>   atomic_set(>hw.ser->refcnt, 1);
>   init_completion(>hw.ser->dead_cmp);
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: gigaset: freeing an active object

2015-11-29 Thread Peter Hurley
Hi Tilman,

On 11/29/2015 10:30 AM, Tilman Schmidt wrote:
> Hi Sasha,
> 
> thanks for the report. As the original author of the code in question, I
> am somewhat at a loss what to make of it.
> 
> Am 27.11.2015 um 16:19 schrieb Sasha Levin:
>> Fuzzing with syzkaller on the latest -next kernel produced this error:
> 
> Is there a way to know the actual sequence of events that triggered this
> warning?
> 
>> [  413.536749] WARNING: CPU: 6 PID: 25400 at lib/debugobjects.c:263 
>> debug_print_object+0x1c4/0x1e0()
>> [  413.538111] ODEBUG: free active (active state 0) object type: timer_list 
>> hint: delayed_work_timer_fn+0x0/0x90
> 
> This message seems to indicate that an object of type timer_list was
> freed which was still active. However the driver in question
> (ser_gigaset) does not use any timers.
> 
> What are the exact conditions for producing this message? IOW how does
> the ODEBUG code determine that an object of type timer_list is being
> freed, and that it is still in use?
> 
> Are there any messages from ser_gigaset or another one of the gigaset
> drivers before that warning?
> 
>> [  413.539598] Modules linked 
>> in:3470693efef57268844f02f5de3ab392d8cf5e209671ddd87163cb964c510659
> 
> This message does not tell me anything. What does that hex string after
> the colon mean?
> 
>> [  413.540448] CPU: 6 PID: 25400 Comm: syzkaller_execu Not tainted 
>> 4.4.0-rc2-next-20151126-sasha-5-g00d303e-dirty #2653
>> [  413.547614] Call Trace:
>> [  413.548077]  [] dump_stack+0x72/0xb7
>> [  413.548765]  [] warn_slowpath_common+0x113/0x140
>> [  413.551151]  [] warn_slowpath_fmt+0xcb/0x100
>> [  413.554295]  [] debug_print_object+0x1c4/0x1e0
>> [  413.556592]  [] __debug_check_no_obj_freed+0x215/0x7a0
>> [  413.560526]  [] debug_check_no_obj_freed+0x2c/0x40
>> [  413.561328]  [] kfree+0x1fc/0x2f0
> 
> Judging from the backtrace below this must be the call
> 
> kfree(cs->hw.ser);
> 
> in drivers/isdn/gigaset/ser-gigaset.c line 375.
> cs->hw.ser is of type struct ser_cardstate *.
> struct ser_cardstate consists of a struct platform_device, a struct
> completion, an atomic_t and a pointer. No timer_list.
> 
>> [  413.561970]  [] gigaset_freecshw+0xe1/0x120
> 
> There are functions by this name in all three Gigaset hardware dependent
> modules (bas_gigaset, ser_gigaset and usb_gigaset), but ...
> 
>> [  413.562723]  [] gigaset_freecs+0x2ad/0x600
>> [  413.564240]  [] gigaset_tty_close+0x210/0x280
> 
> this function only exists in ser_gigaset.

The platform_device embedded in struct ser_cardstate hasn't been released when
you kfree() the memory it's in.

Regards,
Peter Hurley

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] wan/x25: Fix use-after-free in x25_asy_open_tty()

2015-11-27 Thread Peter Hurley
The N_X25 line discipline may access the previous line discipline's closed
and already-freed private data on open [1].

The tty->disc_data field _never_ refers to valid data on entry to the
line discipline's open() method. Rather, the ldisc is expected to
initialize that field for its own use for the lifetime of the instance
(ie. from open() to close() only).

[1]
[  634.336761] 
==
[  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 
at addr 8800a743efd0
[  634.339558] Read of size 4 by task syzkaller_execu/8981
[  634.340359] 
=
[  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected
...
[  634.405018] Call Trace:
[  634.405277] dump_stack (lib/dump_stack.c:52)
[  634.405775] print_trailer (mm/slub.c:655)
[  634.406361] object_err (mm/slub.c:662)
[  634.406824] kasan_report_error (mm/kasan/report.c:138 
mm/kasan/report.c:236)
[  634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279)
[  634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 
(discriminator 1))
[  634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447)
[  634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567)
[  634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 
drivers/tty/tty_io.c:2879)
[  634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
[  634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
[  634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188)

Reported-and-tested-by: Sasha Levin <sasha.le...@oracle.com>
Cc: <sta...@vger.kernel.org>
Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/net/wan/x25_asy.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/wan/x25_asy.c b/drivers/net/wan/x25_asy.c
index 5c47b01..cd39025 100644
--- a/drivers/net/wan/x25_asy.c
+++ b/drivers/net/wan/x25_asy.c
@@ -549,16 +549,12 @@ static void x25_asy_receive_buf(struct tty_struct *tty,
 
 static int x25_asy_open_tty(struct tty_struct *tty)
 {
-   struct x25_asy *sl = tty->disc_data;
+   struct x25_asy *sl;
int err;
 
if (tty->ops->write == NULL)
return -EOPNOTSUPP;
 
-   /* First make sure we're not already connected. */
-   if (sl && sl->magic == X25_ASY_MAGIC)
-   return -EEXIST;
-
/* OK.  Find a free X.25 channel to use. */
sl = x25_asy_alloc();
if (sl == NULL)
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: gigaset: freeing an active object

2015-11-27 Thread Peter Hurley
On 11/27/2015 07:28 PM, Paul Bolle wrote:
> (A few quick notes follow. The hope here is basically that my display of
> ignorance might trigger others to speak up while I'm still pondering on
> this bug.)
> 
> On vr, 2015-11-27 at 13:15 -0500, Sasha Levin wrote:
>> On 11/27/2015 12:57 PM, Paul Bolle wrote:
>>> On vr, 2015-11-27 at 10:19 -0500, Sasha Levin wrote:
>>>> Fuzzing with syzkaller on the latest -next kernel produced this
>>>> error:
>>>
>>> (syzkaller is new to me. I'll have to do some web searches.)
>>
>> It's a new fancy syscall/ioctl fuzzer, 
>> https://github.com/google/syzkaller/blob/master/README.md
> 
> Thanks.
> 
> That fuzzer apparently requires either CONFIG_KASAN, CONFIG_KTSAN, or
> CONFIG_UBSAN, none of which I'm familiar with.
> 
>>>> [  413.536749] WARNING: CPU: 6 PID: 25400 at
>>>> lib/debugobjects.c:263
>>>> debug_print_object+0x1c4/0x1e0()
>>>> [  413.538111] ODEBUG: free active (active state 0) object type:
>>>> timer_list hint: delayed_work_timer_fn+0x0/0x90
> 
> There are two places that add "free" here, but there's no obvious way to
> distinguish between them. Annoying.
> 
> Anyhow, this is interesting. ser-gigaset doesn't use timer_list while
> bas-gigaset does.
> 
>>>> [  413.539598] Modules linked
>>>> in:3470693efef57268844f02f5de3ab392d8cf5e209671ddd87163cb964c51065
>>>> 9
> 
> Not sure what this means. The bug concerns ser-gigaset so it's safe to
> assume the fuzzer at one point called
> ioctl(fd, TIOCSETD, N_GIGASET_M101)

Sasha,

It would really help if you included the syzkaller-generated applet with
the bug reports; state previously established by the applet can be
crucial in understanding why the call stack looks the way it does.

Also, every generated applet that triggers a report should become
a future regression test; I'm collecting the ones pertinent to tty/serial/
ldisc (so that includes this one; if you could send me the x25 one too
would be great).

Regards,
Peter Hurley


> which would trigger the use of ser-gigaset.
> 
>>>> [  413.540448] CPU: 6 PID: 25400 Comm: syzkaller_execu Not tainted
>>>> 4.4.0-rc2-next-20151126-sasha-5-g00d303e-dirty #2653
>>>> [  413.547614] Call Trace:
>>>> [  413.548077]  [] dump_stack+0x72/0xb7
>>>> [  413.548765]  []
>>>> warn_slowpath_common+0x113/0x140
>>>> [  413.551151]  [] warn_slowpath_fmt+0xcb/0x100
>>>> [  413.554295]  []
>>>> debug_print_object+0x1c4/0x1e0
>>>> [  413.556592]  []
>>>> __debug_check_no_obj_freed+0x215/0x7a0
>>>> [  413.560526]  []
>>>> debug_check_no_obj_freed+0x2c/0x40
>>>> [  413.561328]  [] kfree+0x1fc/0x2f0
> 
> This should be
> kfree(cs->hw.ser)
> 
> Note that cs->hw is a union of struct base_cardstate, struct
> ser_cardstate, and struct usb_cardstate. And it's obvious that struct
> ser_cardstate is much smaller that struct bas_cardstate. So we're
> probably free-ing some memory that, so to speak, includes
> bas_cardstate.timer_int_in, a struct timer_list. That's likely way
> beyond the end of struct ser_cardstate and so it should still contain
> garbage.
> 
>>>> [  413.561970]  [] gigaset_freecshw+0xe1/0x120
>>>> [  413.562723]  [] gigaset_freecs+0x2ad/0x600
>>>> [  413.564240]  [] gigaset_tty_close+0x210/0x280
>>>> [  413.565774]  []
>>>> tty_ldisc_close.isra.1+0xc2/0xd0
>>>> [  413.566550]  [] tty_ldisc_kill+0x4b/0x170
>>>> [  413.567253]  [] tty_ldisc_release+0x183/0x240
>>>> [  413.568000]  [] tty_release+0xd1c/0xe80
>>>> [  413.570176]  [] __fput+0x32a/0x680
>>>> [  413.570888]  [] fput+0x1a/0x20
>>>> [  413.571565]  [] task_work_run+0x19c/0x1e0
>>>> [  413.572290]  [] do_exit+0xdf7/0x28f0
>>>> [  413.576188]  [] do_group_exit+0x1b5/0x300
>>>> [  413.576905]  [] get_signal+0x1182/0x1360
>>>> [  413.577627]  [] do_signal+0x93/0x1690
>>>> [  413.584630]  []
>>>> exit_to_usermode_loop+0xc0/0x1e0
>>>> [  413.585412]  []
>>>> prepare_exit_to_usermode+0x10b/0x140
>>>> [  413.586187]  [] retint_user+0x8/0x23
>>>
> 
> I have no idea (yet) what triggers retint_user.
> 
> Anyhow, my first hunch is to do a s/kmalloc/kzalloc/ on
> drv->cs = kmalloc(minors * sizeof *drv->cs, GFP_KERNEL)
> 
> in drivers/isdn/gigaset/common.c and see if this still triggers. But to
> test that I need to know how to reproduce this.
> 
> To be continued...


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: tty,net: use-after-free in x25_asy_open_tty

2015-11-20 Thread Peter Hurley
[ + David Miller ]

On 11/20/2015 08:56 AM, Sasha Levin wrote:
> Hi all,
> 
> While fuzzing with syzkaller inside a kvmtools guest running latest -next 
> kernel, I've hit:
> 
> [  634.336761] 
> ==
> [  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at 
> addr 8800a743efd0
> [  634.339558] Read of size 4 by task syzkaller_execu/8981
> [  634.340359] 
> =
> [  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected

Thanks for the report, Sasha.
Would you please test the patch below?

The ldisc api should really prevent these kinds of errors. I'll prepare
a patch to the tty core which should address the api weakness.

Regards,
Peter Hurley

--->% ---
Subject: [PATCH] wan/x25: Fix use-after-free in x25_asy_open_tty()

The N_X25 line discipline may access the previous line discipline's closed
and already-freed private data on open [1].

The tty->disc_data field _never_ refers to valid data on entry to the
line discipline's open() method. Rather, the ldisc is expected to
initialize that field for its own use for the lifetime of the instance
(ie. from open() to close() only).

[1] Report by Sasha Levin <sasha.le...@oracle.com>
[  634.336761] 
==
[  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 
at addr 8800a743efd0
[  634.339558] Read of size 4 by task syzkaller_execu/8981
[  634.340359] 
=
[  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected
...
[  634.405018] Call Trace:
[  634.405277] dump_stack (lib/dump_stack.c:52)
[  634.405775] print_trailer (mm/slub.c:655)
[  634.406361] object_err (mm/slub.c:662)
[  634.406824] kasan_report_error (mm/kasan/report.c:138 
mm/kasan/report.c:236)
[  634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279)
[  634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 
(discriminator 1))
[  634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447)
[  634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567)
[  634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 
drivers/tty/tty_io.c:2879)
[  634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
[  634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
[  634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188)

Reported-by: Sasha Levin <sasha.le...@oracle.com>
Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/net/wan/x25_asy.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/wan/x25_asy.c b/drivers/net/wan/x25_asy.c
index 5c47b01..cd39025 100644
--- a/drivers/net/wan/x25_asy.c
+++ b/drivers/net/wan/x25_asy.c
@@ -549,16 +549,12 @@ static void x25_asy_receive_buf(struct tty_struct *tty,
 
 static int x25_asy_open_tty(struct tty_struct *tty)
 {
-   struct x25_asy *sl = tty->disc_data;
+   struct x25_asy *sl;
int err;
 
if (tty->ops->write == NULL)
return -EOPNOTSUPP;
 
-   /* First make sure we're not already connected. */
-   if (sl && sl->magic == X25_ASY_MAGIC)
-   return -EEXIST;
-
/* OK.  Find a free X.25 channel to use. */
sl = x25_asy_alloc();
if (sl == NULL)
-- 
2.6.3


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/4] Replace tty->closing

2015-11-09 Thread Peter Hurley
Hi Greg,

This series cleans up a messy and poorly documented mechanism required
at tty final close to prevent drivers from crashing after h/w shutdown.

Without special handling, N_TTY echoing will cause driver i/o requests
_after_ h/w shutdown, which typically crashes the driver. Currently,
the tty_struct::closing flag triggers this special handling. However,
this mechanism is error-prone and subject to driver misuse.

This series replaces tty->closing with a ldisc-specific interface,
tty_ldisc_closing(), and implements this interface for N_TTY.
For tty drivers which use tty_port_close_start(), this change eliminates
the last vestige of direct driver<->ldisc interaction. The few tty drivers
which open-code the close() method [1] still use tty_ldisc_closing() directly.

The tty driver is aware final close for the tty has commenced because the
tty->count == 1 in the close() method. On final close, the following is
also true:
1. port->count == 1. tty drivers which ref count the port, use the
   --port->count == 0 as a substitute condition for final close.
2. final close is occurring as a result of the last in-use file descriptor
   release. Consequently, there will be no read/poll/ioctl in-progress.
3. the line discipline instance will be stopped and destroyed immediately
   after the tty driver completes the close() method
4. the tty itself will be unrefed immediately after the line discipline
   instance is destroyed.

Thus, the ldisc and tty buffers need only be flushed once, as any data
received by the tty driver after the flush but before h/w shutdown will
be deleted when the line discipline instance is destroyed.
No new echoes will occur after the ldisc flush because the echo buffer
is also flushed and new input (which otherwise might generate echoes) is
ignored while closing. This series removes the extra tty_ldisc_flush()
being performed by most drivers after h/w shutdown.

Additionally, the ldisc closing state need not be reset since the line
discipline instance is being destroyed, so no interface is provided to reset
closing.


[1] tty drivers which open-code the close() method
drivers/staging/dgnc/dgnc_tty.c
drivers/staging/dgap/dgap.c
drivers/tty/hvc/hvsi.c
drivers/tty/serial/68328serial.c
drivers/tty/serial/crisv10.c
drivers/isdn/i4l/isdn_tty.c
drivers/s390/char/con3215.c

Changes to v2:
* Fixed tty_ldisc_closing() ld use found by Johannes Stezenbach <j...@sig21.net>


Regards,

Peter Hurley (4):
  tty: rocket: Remove private close_wait
  n_tty: Ignore all read data when closing
  tty: Abstract and encapsulate tty->closing behavior
  tty: Remove drivers' extra tty_ldisc_flush()

 drivers/char/pcmcia/synclink_cs.c |  3 ---
 drivers/isdn/i4l/isdn_tty.c   |  2 +-
 drivers/s390/char/con3215.c   |  3 +--
 drivers/staging/dgap/dgap.c   |  4 +---
 drivers/staging/dgnc/dgnc_tty.c   |  4 +---
 drivers/tty/amiserial.c   |  2 --
 drivers/tty/hvc/hvsi.c|  2 +-
 drivers/tty/ipwireless/tty.c  |  1 -
 drivers/tty/n_tty.c   | 15 +++
 drivers/tty/rocket.c  |  5 -
 drivers/tty/rocket_int.h  |  1 -
 drivers/tty/serial/68328serial.c  |  3 +--
 drivers/tty/serial/crisv10.c  |  3 +--
 drivers/tty/serial/serial_core.c  |  3 ---
 drivers/tty/synclink.c|  1 -
 drivers/tty/synclink_gt.c |  1 -
 drivers/tty/synclinkmp.c  |  1 -
 drivers/tty/tty_ldisc.c   | 20 
 drivers/tty/tty_port.c|  5 +
 include/linux/tty.h   |  2 +-
 include/linux/tty_ldisc.h |  9 +
 21 files changed, 49 insertions(+), 41 deletions(-)

-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/4] tty: Remove drivers' extra tty_ldisc_flush()

2015-11-09 Thread Peter Hurley
The tty_port_close_start() helper already flushes the tty and ldisc
buffers on final close; tty drivers which use this helper need not
repeat tty_ldisc_flush().

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/char/pcmcia/synclink_cs.c | 3 ---
 drivers/tty/amiserial.c   | 2 --
 drivers/tty/rocket.c  | 2 --
 drivers/tty/serial/serial_core.c  | 2 --
 drivers/tty/synclink.c| 1 -
 drivers/tty/synclink_gt.c | 1 -
 drivers/tty/synclinkmp.c  | 1 -
 drivers/tty/tty_port.c| 2 --
 8 files changed, 14 deletions(-)

diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index 45df4bf..3f74677 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -2354,10 +2354,7 @@ static void mgslpc_close(struct tty_struct *tty, struct 
file * filp)
mgslpc_wait_until_sent(tty, info->timeout);
 
mgslpc_flush_buffer(tty);
-
-   tty_ldisc_flush(tty);
shutdown(info, tty);
-   
tty_port_close_end(port, tty);
tty_port_tty_set(port, NULL);
 cleanup:
diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
index 2caaf5a..bffc0a4 100644
--- a/drivers/tty/amiserial.c
+++ b/drivers/tty/amiserial.c
@@ -1420,8 +1420,6 @@ static void rs_close(struct tty_struct *tty, struct file 
* filp)
}
shutdown(tty, state);
rs_flush_buffer(tty);
-   
-   tty_ldisc_flush(tty);
port->tty = NULL;
 
tty_port_close_end(port, tty);
diff --git a/drivers/tty/rocket.c b/drivers/tty/rocket.c
index a6b5ce0..5905200 100644
--- a/drivers/tty/rocket.c
+++ b/drivers/tty/rocket.c
@@ -1022,8 +1022,6 @@ static void rp_close(struct tty_struct *tty, struct file 
*filp)
sClrDTR(cp);
 
rp_flush_buffer(tty);
-   
-   tty_ldisc_flush(tty);
 
clear_bit((info->aiop * 8) + info->chan, (void *) 
_flags[info->board]);
 
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index db27a40..418587f 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -1439,8 +1439,6 @@ static void uart_close(struct tty_struct *tty, struct 
file *filp)
wake_up_interruptible(>open_wait);
 
mutex_unlock(>mutex);
-
-   tty_ldisc_flush(tty);
 }
 
 static void uart_wait_until_sent(struct tty_struct *tty, int timeout)
diff --git a/drivers/tty/synclink.c b/drivers/tty/synclink.c
index 6188059..1334498 100644
--- a/drivers/tty/synclink.c
+++ b/drivers/tty/synclink.c
@@ -3099,7 +3099,6 @@ static void mgsl_close(struct tty_struct *tty, struct 
file * filp)
if (info->port.flags & ASYNC_INITIALIZED)
mgsl_wait_until_sent(tty, info->timeout);
mgsl_flush_buffer(tty);
-   tty_ldisc_flush(tty);
shutdown(info);
mutex_unlock(>port.mutex);
 
diff --git a/drivers/tty/synclink_gt.c b/drivers/tty/synclink_gt.c
index 5505ea8..1987fb4 100644
--- a/drivers/tty/synclink_gt.c
+++ b/drivers/tty/synclink_gt.c
@@ -729,7 +729,6 @@ static void close(struct tty_struct *tty, struct file *filp)
if (info->port.flags & ASYNC_INITIALIZED)
wait_until_sent(tty, info->timeout);
flush_buffer(tty);
-   tty_ldisc_flush(tty);
 
shutdown(info);
mutex_unlock(>port.mutex);
diff --git a/drivers/tty/synclinkmp.c b/drivers/tty/synclinkmp.c
index fb00a06..fb17dac 100644
--- a/drivers/tty/synclinkmp.c
+++ b/drivers/tty/synclinkmp.c
@@ -816,7 +816,6 @@ static void close(struct tty_struct *tty, struct file *filp)
wait_until_sent(tty, info->timeout);
 
flush_buffer(tty);
-   tty_ldisc_flush(tty);
shutdown(info);
mutex_unlock(>port.mutex);
 
diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index ecb6435..c30525a 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -503,8 +503,6 @@ void tty_port_close_end(struct tty_port *port, struct 
tty_struct *tty)
 {
unsigned long flags;
 
-   tty_ldisc_flush(tty);
-
spin_lock_irqsave(>lock, flags);
 
if (port->blocked_open) {
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/4] n_tty: Ignore all read data when closing

2015-11-09 Thread Peter Hurley
On final port close (and thus final tty close), only output flow
control requests in the input data should be processed. Ignore all
other input data, including parity errors, overruns and breaks.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/tty/n_tty.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index bc613b8..2de0283 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1564,8 +1564,6 @@ n_tty_receive_buf_closing(struct tty_struct *tty, const 
unsigned char *cp,
flag = *fp++;
if (likely(flag == TTY_NORMAL))
n_tty_receive_char_closing(tty, *cp++);
-   else
-   n_tty_receive_char_flagged(tty, *cp++, flag);
}
 }
 
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/4] tty: Abstract and encapsulate tty->closing behavior

2015-11-09 Thread Peter Hurley
tty->closing is exclusively used to cause the N_TTY line discipline
to drop further input on final tty close (except XON/XOFF output flow
control changes). In turn, this prevents the line discipline from
generating new tty driver i/o requests (eg., when echoing) after the tty
driver has performed h/w shutdown.

Abstract this notification with new ldisc api function,
tty_ldisc_closing(), which invokes the new line discipline method,
ops->closing(). Define this method for N_TTY line discipline and
localize closing state to n_tty private data. Remove tty->closing.

Note that resetting tty->closing to 0 (and thus allowing the
line discipline to resume normal input processing) is unnecessary
and undesirable: since the tty is in final close, both the line
discipline instance and the tty are being destroyed, so resuming normal
input processing after h/w shutdown is counter-productive.

NB: ipwireless_tty_free() is completely bogus; freeing the tty (?!) with
open, in-use file descriptors is laughable.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---

v2: Fixed tty_ldisc_closing() ld use found by Johannes Stezenbach 
<j...@sig21.net>

 drivers/isdn/i4l/isdn_tty.c  |  2 +-
 drivers/s390/char/con3215.c  |  3 +--
 drivers/staging/dgap/dgap.c  |  4 +---
 drivers/staging/dgnc/dgnc_tty.c  |  4 +---
 drivers/tty/hvc/hvsi.c   |  2 +-
 drivers/tty/ipwireless/tty.c |  1 -
 drivers/tty/n_tty.c  | 13 +++--
 drivers/tty/rocket.c |  1 -
 drivers/tty/serial/68328serial.c |  3 +--
 drivers/tty/serial/crisv10.c |  3 +--
 drivers/tty/serial/serial_core.c |  1 -
 drivers/tty/tty_ldisc.c  | 20 
 drivers/tty/tty_port.c   |  3 +--
 include/linux/tty.h  |  2 +-
 include/linux/tty_ldisc.h|  9 +
 15 files changed, 49 insertions(+), 22 deletions(-)

diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index 2175225..cddba25 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -1574,7 +1574,7 @@ isdn_tty_close(struct tty_struct *tty, struct file *filp)
}
port->flags |= ASYNC_CLOSING;
 
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
/*
 * At this point we stop accepting input.  To do this, we
 * disable the receive line status interrupts, and tell the
diff --git a/drivers/s390/char/con3215.c b/drivers/s390/char/con3215.c
index 0fc3fe5..715251d 100644
--- a/drivers/s390/char/con3215.c
+++ b/drivers/s390/char/con3215.c
@@ -1006,11 +1006,10 @@ static void tty3215_close(struct tty_struct *tty, 
struct file * filp)
raw = (struct raw3215_info *) tty->driver_data;
if (raw == NULL || tty->count > 1)
return;
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
/* Shutdown the terminal */
raw3215_shutdown(raw);
tasklet_kill(>tlet);
-   tty->closing = 0;
tty_port_tty_set(>port, NULL);
 }
 
diff --git a/drivers/staging/dgap/dgap.c b/drivers/staging/dgap/dgap.c
index 9112dd2..0456e28 100644
--- a/drivers/staging/dgap/dgap.c
+++ b/drivers/staging/dgap/dgap.c
@@ -4622,7 +4622,7 @@ static void dgap_tty_close(struct tty_struct *tty, struct 
file *file)
 
un->un_flags |= UN_CLOSING;
 
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
 
/*
 * Only officially close channel if count is 0 and
@@ -4645,8 +4645,6 @@ static void dgap_tty_close(struct tty_struct *tty, struct 
file *file)
 
spin_lock_irqsave(>ch_lock, lock_flags);
 
-   tty->closing = 0;
-
/*
 * If we have HUPCL set, lower DTR and RTS
 */
diff --git a/drivers/staging/dgnc/dgnc_tty.c b/drivers/staging/dgnc/dgnc_tty.c
index fbfe79a..96960d8 100644
--- a/drivers/staging/dgnc/dgnc_tty.c
+++ b/drivers/staging/dgnc/dgnc_tty.c
@@ -1410,7 +1410,7 @@ static void dgnc_tty_close(struct tty_struct *tty, struct 
file *file)
/* OK, its the last close on the unit */
un->un_flags |= UN_CLOSING;
 
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
 
 
/*
@@ -1441,8 +1441,6 @@ static void dgnc_tty_close(struct tty_struct *tty, struct 
file *file)
 
spin_lock_irqsave(>ch_lock, flags);
 
-   tty->closing = 0;
-
/*
 * If we have HUPCL set, lower DTR and RTS
 */
diff --git a/drivers/tty/hvc/hvsi.c b/drivers/tty/hvc/hvsi.c
index a75146f..fbaa6ab 100644
--- a/drivers/tty/hvc/hvsi.c
+++ b/drivers/tty/hvc/hvsi.c
@@ -796,7 +796,7 @@ static void hvsi_close(struct tty_struct *tty, struct file 
*filp)
 * any data delivered to the tty layer after this will 
be
 * discarded (except for XON/XOFF)
 */
-   tty->closing = 1;
+   tty_ldi

[PATCH v2 1/4] tty: rocket: Remove private close_wait

2015-11-09 Thread Peter Hurley
This driver's private completion variable, close_wait, is no longer
used for wait since "tty: Remove ASYNC_CLOSING checks in open()/hangup";
remove.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/tty/rocket.c | 2 --
 drivers/tty/rocket_int.h | 1 -
 2 files changed, 3 deletions(-)

diff --git a/drivers/tty/rocket.c b/drivers/tty/rocket.c
index 802eac7..c5179f8 100644
--- a/drivers/tty/rocket.c
+++ b/drivers/tty/rocket.c
@@ -643,7 +643,6 @@ static void init_r_port(int board, int aiop, int chan, 
struct pci_dev *pci_dev)
info->chan = chan;
tty_port_init(>port);
info->port.ops = _port_ops;
-   init_completion(>close_wait);
info->flags &= ~ROCKET_MODE_MASK;
switch (pc104[board][line]) {
case 422:
@@ -1049,7 +1048,6 @@ static void rp_close(struct tty_struct *tty, struct file 
*filp)
mutex_unlock(>mutex);
tty_port_tty_set(port, NULL);
 
-   complete_all(>close_wait);
atomic_dec(_num_ports_open);
 
 #ifdef ROCKET_DEBUG_OPEN
diff --git a/drivers/tty/rocket_int.h b/drivers/tty/rocket_int.h
index 67e0f1e..ef1e1be 100644
--- a/drivers/tty/rocket_int.h
+++ b/drivers/tty/rocket_int.h
@@ -1144,7 +1144,6 @@ struct r_port {
int read_status_mask;
int cps;
 
-   struct completion close_wait;   /* Not yet matching the core */
spinlock_t slock;
struct mutex write_mtx;
 };
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] tty: Abstract and encapsulate tty->closing behavior

2015-11-09 Thread Peter Hurley
On 11/09/2015 04:12 AM, Johannes Stezenbach wrote:
> On Sun, Nov 08, 2015 at 05:02:52PM -0500, Peter Hurley wrote:
>> +void tty_ldisc_closing(struct tty_struct *tty)
>> +{
>> +struct tty_ldisc *ld = tty_ldisc_ref(tty);
>> +
>> +if (ld->ops->closing)
>> +ld->ops->closing(tty);
>> +if (ld)
>> +tty_ldisc_deref(ld);
>> +}
> 
> That looks strange.  Do you need to check ld _before_ dereferencing?

Yes. Thanks for noticing.

Regards,
Peter Hurley
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] tty: rocket: Remove private close_wait

2015-11-08 Thread Peter Hurley
This driver's private completion variable, close_wait, is no longer
used for wait since "tty: Remove ASYNC_CLOSING checks in open()/hangup";
remove.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/tty/rocket.c | 2 --
 drivers/tty/rocket_int.h | 1 -
 2 files changed, 3 deletions(-)

diff --git a/drivers/tty/rocket.c b/drivers/tty/rocket.c
index 802eac7..c5179f8 100644
--- a/drivers/tty/rocket.c
+++ b/drivers/tty/rocket.c
@@ -643,7 +643,6 @@ static void init_r_port(int board, int aiop, int chan, 
struct pci_dev *pci_dev)
info->chan = chan;
tty_port_init(>port);
info->port.ops = _port_ops;
-   init_completion(>close_wait);
info->flags &= ~ROCKET_MODE_MASK;
switch (pc104[board][line]) {
case 422:
@@ -1049,7 +1048,6 @@ static void rp_close(struct tty_struct *tty, struct file 
*filp)
mutex_unlock(>mutex);
tty_port_tty_set(port, NULL);
 
-   complete_all(>close_wait);
atomic_dec(_num_ports_open);
 
 #ifdef ROCKET_DEBUG_OPEN
diff --git a/drivers/tty/rocket_int.h b/drivers/tty/rocket_int.h
index 67e0f1e..ef1e1be 100644
--- a/drivers/tty/rocket_int.h
+++ b/drivers/tty/rocket_int.h
@@ -1144,7 +1144,6 @@ struct r_port {
int read_status_mask;
int cps;
 
-   struct completion close_wait;   /* Not yet matching the core */
spinlock_t slock;
struct mutex write_mtx;
 };
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] tty: Remove drivers' extra tty_ldisc_flush()

2015-11-08 Thread Peter Hurley
The tty_port_close_start() helper already flushes the tty and ldisc
buffers on final close; tty drivers which use this helper need not
repeat tty_ldisc_flush().

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/char/pcmcia/synclink_cs.c | 3 ---
 drivers/tty/amiserial.c   | 2 --
 drivers/tty/rocket.c  | 2 --
 drivers/tty/serial/serial_core.c  | 2 --
 drivers/tty/synclink.c| 1 -
 drivers/tty/synclink_gt.c | 1 -
 drivers/tty/synclinkmp.c  | 1 -
 drivers/tty/tty_port.c| 2 --
 8 files changed, 14 deletions(-)

diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index 45df4bf..3f74677 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -2354,10 +2354,7 @@ static void mgslpc_close(struct tty_struct *tty, struct 
file * filp)
mgslpc_wait_until_sent(tty, info->timeout);
 
mgslpc_flush_buffer(tty);
-
-   tty_ldisc_flush(tty);
shutdown(info, tty);
-   
tty_port_close_end(port, tty);
tty_port_tty_set(port, NULL);
 cleanup:
diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
index 2caaf5a..bffc0a4 100644
--- a/drivers/tty/amiserial.c
+++ b/drivers/tty/amiserial.c
@@ -1420,8 +1420,6 @@ static void rs_close(struct tty_struct *tty, struct file 
* filp)
}
shutdown(tty, state);
rs_flush_buffer(tty);
-   
-   tty_ldisc_flush(tty);
port->tty = NULL;
 
tty_port_close_end(port, tty);
diff --git a/drivers/tty/rocket.c b/drivers/tty/rocket.c
index a6b5ce0..5905200 100644
--- a/drivers/tty/rocket.c
+++ b/drivers/tty/rocket.c
@@ -1022,8 +1022,6 @@ static void rp_close(struct tty_struct *tty, struct file 
*filp)
sClrDTR(cp);
 
rp_flush_buffer(tty);
-   
-   tty_ldisc_flush(tty);
 
clear_bit((info->aiop * 8) + info->chan, (void *) 
_flags[info->board]);
 
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index db27a40..418587f 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -1439,8 +1439,6 @@ static void uart_close(struct tty_struct *tty, struct 
file *filp)
wake_up_interruptible(>open_wait);
 
mutex_unlock(>mutex);
-
-   tty_ldisc_flush(tty);
 }
 
 static void uart_wait_until_sent(struct tty_struct *tty, int timeout)
diff --git a/drivers/tty/synclink.c b/drivers/tty/synclink.c
index 6188059..1334498 100644
--- a/drivers/tty/synclink.c
+++ b/drivers/tty/synclink.c
@@ -3099,7 +3099,6 @@ static void mgsl_close(struct tty_struct *tty, struct 
file * filp)
if (info->port.flags & ASYNC_INITIALIZED)
mgsl_wait_until_sent(tty, info->timeout);
mgsl_flush_buffer(tty);
-   tty_ldisc_flush(tty);
shutdown(info);
mutex_unlock(>port.mutex);
 
diff --git a/drivers/tty/synclink_gt.c b/drivers/tty/synclink_gt.c
index 5505ea8..1987fb4 100644
--- a/drivers/tty/synclink_gt.c
+++ b/drivers/tty/synclink_gt.c
@@ -729,7 +729,6 @@ static void close(struct tty_struct *tty, struct file *filp)
if (info->port.flags & ASYNC_INITIALIZED)
wait_until_sent(tty, info->timeout);
flush_buffer(tty);
-   tty_ldisc_flush(tty);
 
shutdown(info);
mutex_unlock(>port.mutex);
diff --git a/drivers/tty/synclinkmp.c b/drivers/tty/synclinkmp.c
index fb00a06..fb17dac 100644
--- a/drivers/tty/synclinkmp.c
+++ b/drivers/tty/synclinkmp.c
@@ -816,7 +816,6 @@ static void close(struct tty_struct *tty, struct file *filp)
wait_until_sent(tty, info->timeout);
 
flush_buffer(tty);
-   tty_ldisc_flush(tty);
shutdown(info);
mutex_unlock(>port.mutex);
 
diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index ecb6435..c30525a 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -503,8 +503,6 @@ void tty_port_close_end(struct tty_port *port, struct 
tty_struct *tty)
 {
unsigned long flags;
 
-   tty_ldisc_flush(tty);
-
spin_lock_irqsave(>lock, flags);
 
if (port->blocked_open) {
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4] Replace tty->closing

2015-11-08 Thread Peter Hurley
Hi Greg,

This series cleans up a messy and poorly documented mechanism required
at tty final close to prevent drivers from crashing after h/w shutdown.

Without special handling, N_TTY echoing will cause driver i/o requests
_after_ h/w shutdown, which typically crashes the driver. Currently,
the tty_struct::closing flag triggers this special handling. However,
this mechanism is error-prone and subject to driver misuse.

This series replaces tty->closing with a ldisc-specific interface,
tty_ldisc_closing(), and implements this interface for N_TTY.
For tty drivers which use tty_port_close_start(), this change eliminates
the last vestige of direct driver<->ldisc interaction. The few tty drivers
which open-code the close() method still use [1] tty_ldisc_closing() directly.

The tty driver is aware final close for the tty is commencing because the
tty->count == 1 in the close() method. On final close, the following is
also true:
1. port->count == 1. tty drivers which ref count the port, use the
   --port->count == 0 as a substitute condition for final close.
2. final close is occurring as a result of the last in-use file descriptor
   release. Consequently, there will be no read/poll/ioctl in-progress.
3. the line discipline instance will be stopped and destroyed immediately
   after the tty driver completes the close() method
4. the tty itself will be unrefed immediately after the line discipline
   instance is destroyed.

Thus, the ldisc and tty buffers need only be flushed once, as any data
received by the tty driver after the flush but before h/w shutdown will
be deleted when the line discipline instance is destroyed.
No new echoes will occur after the ldisc flush because the echo buffer
is also flushed and new input (which otherwise might generate echoes) is
ignored while closing. This series removes the extra tty_ldisc_flush()
being performed by most drivers after h/w shutdown.

Additionally, the ldisc closing state need not be reset since the line
discipline instance is being destroyed, so no interface is provided to reset
closing.


[1] tty drivers which open-code the close() method
drivers/staging/dgnc/dgnc_tty.c
drivers/staging/dgap/dgap.c
drivers/tty/hvc/hvsi.c
drivers/tty/serial/68328serial.c
drivers/tty/serial/crisv10.c
drivers/isdn/i4l/isdn_tty.c
drivers/s390/char/con3215.c


Regards,

Peter Hurley (4):
  tty: rocket: Remove private close_wait
  n_tty: Ignore all read data when closing
  tty: Abstract and encapsulate tty->closing behavior
  tty: Remove drivers' extra tty_ldisc_flush()

 drivers/char/pcmcia/synclink_cs.c |  3 ---
 drivers/isdn/i4l/isdn_tty.c   |  2 +-
 drivers/s390/char/con3215.c   |  3 +--
 drivers/staging/dgap/dgap.c   |  4 +---
 drivers/staging/dgnc/dgnc_tty.c   |  4 +---
 drivers/tty/amiserial.c   |  2 --
 drivers/tty/hvc/hvsi.c|  2 +-
 drivers/tty/ipwireless/tty.c  |  1 -
 drivers/tty/n_tty.c   | 15 +++
 drivers/tty/rocket.c  |  5 -
 drivers/tty/rocket_int.h  |  1 -
 drivers/tty/serial/68328serial.c  |  3 +--
 drivers/tty/serial/crisv10.c  |  3 +--
 drivers/tty/serial/serial_core.c  |  3 ---
 drivers/tty/synclink.c|  1 -
 drivers/tty/synclink_gt.c |  1 -
 drivers/tty/synclinkmp.c  |  1 -
 drivers/tty/tty_ldisc.c   | 19 +++
 drivers/tty/tty_port.c|  5 +
 include/linux/tty.h   |  2 +-
 include/linux/tty_ldisc.h |  9 +
 21 files changed, 48 insertions(+), 41 deletions(-)

-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] n_tty: Ignore all read data when closing

2015-11-08 Thread Peter Hurley
On final port close (and thus final tty close), only output flow
control requests in the input data should be processed. Ignore all
other input data, including parity errors, overruns and breaks.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/tty/n_tty.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index bc613b8..2de0283 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1564,8 +1564,6 @@ n_tty_receive_buf_closing(struct tty_struct *tty, const 
unsigned char *cp,
flag = *fp++;
if (likely(flag == TTY_NORMAL))
n_tty_receive_char_closing(tty, *cp++);
-   else
-   n_tty_receive_char_flagged(tty, *cp++, flag);
}
 }
 
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] tty: Abstract and encapsulate tty->closing behavior

2015-11-08 Thread Peter Hurley
tty->closing is exclusively used to cause the N_TTY line discipline
to drop further input on final tty close (except XON/XOFF output flow
control changes). In turn, this prevents the line discipline from
generating new tty driver i/o requests (eg., when echoing) after the tty
driver has performed h/w shutdown.

Abstract this notification with new ldisc api function,
tty_ldisc_closing(), which invokes the new line discipline method,
ops->closing(). Define this method for N_TTY line discipline and
localize closing state to n_tty private data. Remove tty->closing.

Note that resetting tty->closing to 0 (and thus allowing the
line discipline to resume normal input processing) is unnecessary
and undesirable: since the tty is in final close, both the line
discipline instance and the tty are being destroyed, so resuming normal
input processing after h/w shutdown is counter-productive.

NB: ipwireless_tty_free() is completely bogus; freeing the tty (?!) with
open, in-use file descriptors is laughable.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/isdn/i4l/isdn_tty.c  |  2 +-
 drivers/s390/char/con3215.c  |  3 +--
 drivers/staging/dgap/dgap.c  |  4 +---
 drivers/staging/dgnc/dgnc_tty.c  |  4 +---
 drivers/tty/hvc/hvsi.c   |  2 +-
 drivers/tty/ipwireless/tty.c |  1 -
 drivers/tty/n_tty.c  | 13 +++--
 drivers/tty/rocket.c |  1 -
 drivers/tty/serial/68328serial.c |  3 +--
 drivers/tty/serial/crisv10.c |  3 +--
 drivers/tty/serial/serial_core.c |  1 -
 drivers/tty/tty_ldisc.c  | 19 +++
 drivers/tty/tty_port.c   |  3 +--
 include/linux/tty.h  |  2 +-
 include/linux/tty_ldisc.h|  9 +
 15 files changed, 48 insertions(+), 22 deletions(-)

diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index 2175225..cddba25 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -1574,7 +1574,7 @@ isdn_tty_close(struct tty_struct *tty, struct file *filp)
}
port->flags |= ASYNC_CLOSING;
 
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
/*
 * At this point we stop accepting input.  To do this, we
 * disable the receive line status interrupts, and tell the
diff --git a/drivers/s390/char/con3215.c b/drivers/s390/char/con3215.c
index 0fc3fe5..715251d 100644
--- a/drivers/s390/char/con3215.c
+++ b/drivers/s390/char/con3215.c
@@ -1006,11 +1006,10 @@ static void tty3215_close(struct tty_struct *tty, 
struct file * filp)
raw = (struct raw3215_info *) tty->driver_data;
if (raw == NULL || tty->count > 1)
return;
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
/* Shutdown the terminal */
raw3215_shutdown(raw);
tasklet_kill(>tlet);
-   tty->closing = 0;
tty_port_tty_set(>port, NULL);
 }
 
diff --git a/drivers/staging/dgap/dgap.c b/drivers/staging/dgap/dgap.c
index 9112dd2..0456e28 100644
--- a/drivers/staging/dgap/dgap.c
+++ b/drivers/staging/dgap/dgap.c
@@ -4622,7 +4622,7 @@ static void dgap_tty_close(struct tty_struct *tty, struct 
file *file)
 
un->un_flags |= UN_CLOSING;
 
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
 
/*
 * Only officially close channel if count is 0 and
@@ -4645,8 +4645,6 @@ static void dgap_tty_close(struct tty_struct *tty, struct 
file *file)
 
spin_lock_irqsave(>ch_lock, lock_flags);
 
-   tty->closing = 0;
-
/*
 * If we have HUPCL set, lower DTR and RTS
 */
diff --git a/drivers/staging/dgnc/dgnc_tty.c b/drivers/staging/dgnc/dgnc_tty.c
index fbfe79a..96960d8 100644
--- a/drivers/staging/dgnc/dgnc_tty.c
+++ b/drivers/staging/dgnc/dgnc_tty.c
@@ -1410,7 +1410,7 @@ static void dgnc_tty_close(struct tty_struct *tty, struct 
file *file)
/* OK, its the last close on the unit */
un->un_flags |= UN_CLOSING;
 
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
 
 
/*
@@ -1441,8 +1441,6 @@ static void dgnc_tty_close(struct tty_struct *tty, struct 
file *file)
 
spin_lock_irqsave(>ch_lock, flags);
 
-   tty->closing = 0;
-
/*
 * If we have HUPCL set, lower DTR and RTS
 */
diff --git a/drivers/tty/hvc/hvsi.c b/drivers/tty/hvc/hvsi.c
index a75146f..fbaa6ab 100644
--- a/drivers/tty/hvc/hvsi.c
+++ b/drivers/tty/hvc/hvsi.c
@@ -796,7 +796,7 @@ static void hvsi_close(struct tty_struct *tty, struct file 
*filp)
 * any data delivered to the tty layer after this will 
be
 * discarded (except for XON/XOFF)
 */
-   tty->closing = 1;
+   tty_ldisc_closing(tty);
 
spin_unlock_irqrestore(>lock, flags);
 
diff --git 

Re: [PATCH] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-27 Thread Peter Hurley
Hi Dan,

On 10/26/2015 08:16 PM, dan.street...@canonical.com wrote:
> From: Dan Streetman <dan.street...@canonical.com>
> 
> The driver currently waits 1us after issuing a RST, but the spec
> requires it to wait 1ms.
> 
> Signed-off-by: Dan Streetman <dan.street...@canonical.com>
> Signed-off-by: Dan Streetman <ddstr...@ieee.org>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c 
> b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> index 4e75843..147bc65 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> @@ -113,7 +113,12 @@ mac_reset_top:
>  
>   /* Poll for reset bit to self-clear indicating reset is complete */
>   for (i = 0; i < 10; i++) {
> - udelay(1);
> + /* sec 8.2.4.1.1 :
> +  * programmers must wait approximately 1 ms after setting before
> +  * attempting to check if the bit has cleared or to access (read
> +  * or write) any other device register.
> +  */
> + mdelay(1);

Since ixgbe_reset_hw_x540() goes on to msleep(100) immediately after this
busy-wait loop, this should instead be:

msleep(1);

Regards,
Peter Hurley


>   ctrl = IXGBE_READ_REG(hw, IXGBE_CTRL);
>   if (!(ctrl & IXGBE_CTRL_RST_MASK))
>   break;
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Intel-wired-lan] [PATCHv2] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-27 Thread Peter Hurley
On 10/27/2015 02:35 PM, ND Linux CI Server wrote:
> Greetings,
> 
> This email is automatically generated by ND's Linux Patch Testing framework
> based on aiaiai. I have performed some automatic testing of a patch (series)
> you submitted to intel-wired-...@lists.osuosl.org
> 
> The following contains output of any tests which failed to pass, and might be
> the result of developer error. The tests performed include but may not be
> limited to checkpatch.pl, bisection testing, compilation on a default kernel
> config, coccinelle scripts, cppcheck, and smatch.
> 
> If you have received this email in error, or believe that aiaiai has detected 
> a
> false positive, please email Jacob Keller <jacob.e.kel...@intel.com>.

False positive.

As long as the delay is at least 1ms (which is guaranteed), slightly longer
delays (relative to the existing reset delay of 100ms) are not harmful.

Use of usleep_range() would be unnecessary overkill for the purpose.

Regards,
Peter Hurley


> ---
> 
> I have tested your changes
> 
> [Intel-wired-lan] [PATCHv2] ixgbe: Wait for 1ms, not 1us, after RST
> 
> Project: net (net-current development queue)
> 
> Configurations: intel_defconfig,x86
> 
> Tested the patch(es) on top of the following commits:
> 505b857 ixgbe: Reset interface after enabling SR-IOV
> ce9d9b8 net: sysctl: fix a kmemleak warning
> 1acea4f ppp: fix pppoe_dev deletion condition in pppoe_release()
> f6b8dec9 af_key: fix two typos
> 
> 
> 
> Successfully built configuration "intel_defconfig,x86", no issues.
> 
> 
> 
> checkpatch.pl has some complaints:
> 
> 
> 
> checkpatch.pl results for patch "[PATCH] ixgbe: Wait for 1ms, not 1us, after 
> RST"
> 
> WARNING:MSLEEP: msleep < 20ms can sleep for up to 20ms; see 
> Documentation/timers/timers-howto.txt
> #29: FILE: drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c:119:
> + msleep(1);
> 
> total: 0 errors, 1 warnings, 0 checks, 13 lines checked
> 
> 


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] tty: Remove wait_event_interruptible_tty()

2015-10-10 Thread Peter Hurley
In-tree users of wait_event_interruptible_tty() have been removed;
remove.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 include/linux/tty.h | 26 --
 1 file changed, 26 deletions(-)

diff --git a/include/linux/tty.h b/include/linux/tty.h
index 090ce2a..c2889f4 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -655,32 +655,6 @@ extern void __lockfunc tty_unlock(struct tty_struct *tty);
 extern void __lockfunc tty_lock_slave(struct tty_struct *tty);
 extern void __lockfunc tty_unlock_slave(struct tty_struct *tty);
 extern void tty_set_lock_subclass(struct tty_struct *tty);
-/*
- * wait_event_interruptible_tty -- wait for a condition with the tty lock held
- *
- * The condition we are waiting for might take a long time to
- * become true, or might depend on another thread taking the
- * BTM. In either case, we need to drop the BTM to guarantee
- * forward progress. This is a leftover from the conversion
- * from the BKL and should eventually get removed as the BTM
- * falls out of use.
- *
- * Do not use in new code.
- */
-#define wait_event_interruptible_tty(tty, wq, condition)   \
-({ \
-   int __ret = 0;  \
-   if (!(condition))   \
-   __ret = __wait_event_interruptible_tty(tty, wq, \
-  condition);  \
-   __ret;  \
-})
-
-#define __wait_event_interruptible_tty(tty, wq, condition) \
-   ___wait_event(wq, condition, TASK_INTERRUPTIBLE, 0, 0,  \
-   tty_unlock(tty);\
-   schedule(); \
-   tty_lock(tty))
 
 #ifdef CONFIG_PROC_FS
 extern void proc_tty_register_driver(struct tty_driver *);
-- 
2.6.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] tty: Remove tty_port::close_wait

2015-10-10 Thread Peter Hurley
With the removal of tty_wait_until_sent_from_close(), tty drivers
no longer wait during open for parallel closes to complete (instead,
the tty core waits before calling the driver open() method). Thus,
the close_wait waitqueue is no longer used for waiting.

Remove struct tty_port::close_wait.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/tty/rocket.c | 1 -
 drivers/tty/serial/68328serial.c | 1 -
 drivers/tty/serial/crisv10.c | 1 -
 drivers/tty/serial/serial_core.c | 1 -
 drivers/tty/tty_port.c   | 2 --
 include/linux/tty.h  | 1 -
 6 files changed, 7 deletions(-)

diff --git a/drivers/tty/rocket.c b/drivers/tty/rocket.c
index 69c72d1..802eac7 100644
--- a/drivers/tty/rocket.c
+++ b/drivers/tty/rocket.c
@@ -1049,7 +1049,6 @@ static void rp_close(struct tty_struct *tty, struct file 
*filp)
mutex_unlock(>mutex);
tty_port_tty_set(port, NULL);
 
-   wake_up_interruptible(>close_wait);
complete_all(>close_wait);
atomic_dec(_num_ports_open);
 
diff --git a/drivers/tty/serial/68328serial.c b/drivers/tty/serial/68328serial.c
index 9ba0c93..0140ba4 100644
--- a/drivers/tty/serial/68328serial.c
+++ b/drivers/tty/serial/68328serial.c
@@ -1071,7 +1071,6 @@ static void rs_close(struct tty_struct *tty, struct file 
* filp)
wake_up_interruptible(>open_wait);
}
port->flags &= ~(ASYNC_NORMAL_ACTIVE|ASYNC_CLOSING);
-   wake_up_interruptible(>close_wait);
local_irq_restore(flags);
 }
 
diff --git a/drivers/tty/serial/crisv10.c b/drivers/tty/serial/crisv10.c
index 06f4fe9..f13f2eb 100644
--- a/drivers/tty/serial/crisv10.c
+++ b/drivers/tty/serial/crisv10.c
@@ -3655,7 +3655,6 @@ rs_close(struct tty_struct *tty, struct file * filp)
wake_up_interruptible(>port.open_wait);
}
info->port.flags &= ~(ASYNC_NORMAL_ACTIVE|ASYNC_CLOSING);
-   wake_up_interruptible(>port.close_wait);
local_irq_restore(flags);
 
/* port closed */
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index df4271a..def5199 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -1437,7 +1437,6 @@ static void uart_close(struct tty_struct *tty, struct 
file *filp)
clear_bit(ASYNCB_CLOSING, >flags);
spin_unlock_irq(>lock);
wake_up_interruptible(>open_wait);
-   wake_up_interruptible(>close_wait);
 
mutex_unlock(>mutex);
 
diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index 0e1cf04..e04a8cf 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -22,7 +22,6 @@ void tty_port_init(struct tty_port *port)
memset(port, 0, sizeof(*port));
tty_buffer_init(port);
init_waitqueue_head(>open_wait);
-   init_waitqueue_head(>close_wait);
init_waitqueue_head(>delta_msr_wait);
mutex_init(>mutex);
mutex_init(>buf_mutex);
@@ -520,7 +519,6 @@ void tty_port_close_end(struct tty_port *port, struct 
tty_struct *tty)
wake_up_interruptible(>open_wait);
}
port->flags &= ~(ASYNC_NORMAL_ACTIVE | ASYNC_CLOSING);
-   wake_up_interruptible(>close_wait);
spin_unlock_irqrestore(>lock, flags);
 }
 EXPORT_SYMBOL(tty_port_close_end);
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 614c822..090ce2a 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -227,7 +227,6 @@ struct tty_port {
int blocked_open;   /* Waiting to open */
int count;  /* Usage count */
wait_queue_head_t   open_wait;  /* Open waiters */
-   wait_queue_head_t   close_wait; /* Close waiters */
wait_queue_head_t   delta_msr_wait; /* Modem status change */
unsigned long   flags;  /* TTY flags ASY_*/
unsigned char   console:1,  /* port is a console */
-- 
2.6.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] tty: Remove tty_wait_until_sent_from_close()

2015-10-10 Thread Peter Hurley
tty_wait_until_sent_from_close() drops the tty lock while waiting
for the tty driver to finish sending previously accepted data (ie.,
data remaining in its write buffer and transmit fifo).

tty_wait_until_sent_from_close() was added by commit a57a7bf3fc7e
("TTY: define tty_wait_until_sent_from_close") to prevent the entire
tty subsystem from being unable to open new ttys while waiting for
one tty to close while output drained.

However, since commit 0911261d4cb6 ("tty: Don't take tty_mutex for tty
count changes"), holding a tty lock while closing does not prevent other
ttys from being opened/closed/hung up, but only prevents lifetime event
changes for the tty under lock.

Holding the tty lock while waiting for output to drain does prevent
parallel non-blocking opens (O_NONBLOCK) from advancing or returning
while the tty lock is held. However, all parallel opens _already_
block even if the tty lock is dropped while closing and the parallel
open advances. Blocking in open has been in mainline since at least 2.6.29
(see tty_port_block_til_ready(); note the test for O_NONBLOCK is _after_
the wait while ASYNC_CLOSING).

IOW, before this patch a non-blocking open will sleep anyway for the
_entire_ duration of a parallel hardware shutdown, and when it wakes, the
error return will cause a release of its tty, and it will restart with
a fresh attempt to open. Similarly with a blocking open that is already
waiting; when it's woken, the hardware shutdown has already completed
to ASYNC_INITIALIZED is not set, which forces a release and restart as
well.

So, holding the tty lock across the _entire_ close (which is what this
patch does), even while waiting for output to drain, is equivalent to
the current outcome wrt parallel opens.

Cc: Alan Cox <a...@linux.intel.com>
Cc: David Laight <david.lai...@aculab.com>
CC: Arnd Bergmann <a...@arndb.de>
CC: Karsten Keil <i...@linux-pingi.de>
CC: linuxppc-...@lists.ozlabs.org
Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/isdn/i4l/isdn_tty.c   |  2 +-
 drivers/tty/hvc/hvc_console.c |  2 +-
 drivers/tty/hvc/hvcs.c|  2 +-
 drivers/tty/tty_port.c| 11 ++-
 include/linux/tty.h   | 18 --
 5 files changed, 5 insertions(+), 30 deletions(-)

diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index bc91261..2175225 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -1582,7 +1582,7 @@ isdn_tty_close(struct tty_struct *tty, struct file *filp)
 * line status register.
 */
if (port->flags & ASYNC_INITIALIZED) {
-   tty_wait_until_sent_from_close(tty, 3000);  /* 30 seconds 
timeout */
+   tty_wait_until_sent(tty, 3000); /* 30 seconds timeout */
/*
 * Before we drop DTR, make sure the UART transmitter
 * has completely drained; this is especially
diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 9c30f67..e46d628 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -418,7 +418,7 @@ static void hvc_close(struct tty_struct *tty, struct file * 
filp)
 * there is no buffered data otherwise sleeps on a wait queue
 * waking periodically to check chars_in_buffer().
 */
-   tty_wait_until_sent_from_close(tty, HVC_CLOSE_WAIT);
+   tty_wait_until_sent(tty, HVC_CLOSE_WAIT);
} else {
if (hp->port.count < 0)
printk(KERN_ERR "hvc_close %X: oops, count is %d\n",
diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c
index f7ff97c..5997b17 100644
--- a/drivers/tty/hvc/hvcs.c
+++ b/drivers/tty/hvc/hvcs.c
@@ -1230,7 +1230,7 @@ static void hvcs_close(struct tty_struct *tty, struct 
file *filp)
irq = hvcsd->vdev->irq;
spin_unlock_irqrestore(>lock, flags);
 
-   tty_wait_until_sent_from_close(tty, HVCS_CLOSE_WAIT);
+   tty_wait_until_sent(tty, HVCS_CLOSE_WAIT);
 
/*
 * This line is important because it tells hvcs_open that this
diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index 40b3183..d7d9f9c 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -463,10 +463,7 @@ static void tty_port_drain_delay(struct tty_port *port, 
struct tty_struct *tty)
schedule_timeout_interruptible(timeout);
 }
 
-/* Caller holds tty lock.
- * NB: may drop and reacquire tty lock (in tty_wait_until_sent_from_close())
- * so tty and tty port may have changed state (but not hung up or reopened).
- */
+/* Caller holds tty lock. */
 int tty_port_close_start(struct tty_port *port,
struct tty_struct *tty, struct file *filp)
 {
@@ -502,7 +499,7 @@ int tty_port_close_start(struct tty_port *port,

[PATCH 5/7] tty: r3964: Use tty->read_wait waitqueue

2015-10-10 Thread Peter Hurley
The tty core provides read_wait waitqueue specifically for line
disciplines to wait readers; otherwise, the line discipline may
miss wakeups generated by the tty core.

NB: The tty core already provides serialization for the line discipline's
close() method, and guarantees no readers or writers will be using the
closing instance of the line discipline. Completely remove that wakeup.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/tty/n_r3964.c   | 10 --
 include/linux/n_r3964.h |  3 ---
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/tty/n_r3964.c b/drivers/tty/n_r3964.c
index 8b157d6..6fdef92 100644
--- a/drivers/tty/n_r3964.c
+++ b/drivers/tty/n_r3964.c
@@ -276,7 +276,7 @@ static void remove_from_tx_queue(struct r3964_info *pInfo, 
int error_code)
add_msg(pHeader->owner, R3964_MSG_ACK, pHeader->length,
error_code, NULL);
}
-   wake_up_interruptible(>read_wait);
+   wake_up_interruptible(>tty->read_wait);
}
 
spin_lock_irqsave(>lock, flags);
@@ -542,7 +542,7 @@ static void on_receive_block(struct r3964_info *pInfo)
pBlock);
}
}
-   wake_up_interruptible(>read_wait);
+   wake_up_interruptible(>tty->read_wait);
 
pInfo->state = R3964_IDLE;
 
@@ -979,7 +979,6 @@ static int r3964_open(struct tty_struct *tty)
 
spin_lock_init(>lock);
pInfo->tty = tty;
-   init_waitqueue_head(>read_wait);
pInfo->priority = R3964_MASTER;
pInfo->rx_first = pInfo->rx_last = NULL;
pInfo->tx_first = pInfo->tx_last = NULL;
@@ -1045,7 +1044,6 @@ static void r3964_close(struct tty_struct *tty)
}
 
/* Free buffers: */
-   wake_up_interruptible(>read_wait);
kfree(pInfo->rx_buf);
TRACE_M("r3964_close - rx_buf kfree %p", pInfo->rx_buf);
kfree(pInfo->tx_buf);
@@ -1077,7 +1075,7 @@ static ssize_t r3964_read(struct tty_struct *tty, struct 
file *file,
goto unlock;
}
/* block until there is a message: */
-   wait_event_interruptible_tty(tty, pInfo->read_wait,
+   wait_event_interruptible_tty(tty, tty->read_wait,
(pMsg = remove_msg(pInfo, pClient)));
}
 
@@ -1227,7 +1225,7 @@ static unsigned int r3964_poll(struct tty_struct *tty, 
struct file *file,
 
pClient = findClient(pInfo, task_pid(current));
if (pClient) {
-   poll_wait(file, >read_wait, wait);
+   poll_wait(file, >read_wait, wait);
spin_lock_irqsave(>lock, flags);
pMsg = pClient->first_msg;
spin_unlock_irqrestore(>lock, flags);
diff --git a/include/linux/n_r3964.h b/include/linux/n_r3964.h
index 5d0b2a1..e9adb42 100644
--- a/include/linux/n_r3964.h
+++ b/include/linux/n_r3964.h
@@ -152,9 +152,6 @@ struct r3964_info {
unsigned char *rx_buf;/* ring buffer */
unsigned char *tx_buf;
 
-   wait_queue_head_t read_wait;
-   //struct wait_queue *read_wait;
-
struct r3964_block_header *rx_first;
struct r3964_block_header *rx_last;
struct r3964_block_header *tx_first;
-- 
2.6.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] tty: Remove ASYNC_CLOSING checks in open()/hangup() methods

2015-10-10 Thread Peter Hurley
Since at least before 2.6.30, tty drivers that do not drop the tty lock
while closing cannot observe ASYNC_CLOSING set while holding the
tty lock; this includes the tty driver's open() and hangup() methods,
since the tty core calls these methods holding the tty lock.

For these drivers, waiting for ASYNC_CLOSING to clear while opening
is not required, since this condition cannot occur. Similarly, even
when the open() method drops and reacquires the tty lock after
blocking, ASYNC_CLOSING cannot be set (again, for drivers that
do not drop the tty lock while closing).

Now that tty port drivers no longer drop the tty lock while closing
(since 'tty: Remove tty_wait_until_sent_from_close()'), the same
conditions apply: waiting for ASYNC_CLOSING to clear while opening
is not required, nor is re-checking ASYNC_CLOSING after dropping and
reacquiring the tty lock while blocking (eg., in *_block_til_ready()).

Note: The ASYNC_CLOSING flag state is still maintained since several
bitrotting drivers use it for (dubious) other purposes.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/char/pcmcia/synclink_cs.c |  9 -
 drivers/tty/cyclades.c|  9 -
 drivers/tty/rocket.c  | 12 
 drivers/tty/serial/crisv10.c  | 33 +
 drivers/tty/synclink.c| 18 --
 drivers/tty/synclink_gt.c | 14 ++
 drivers/tty/synclinkmp.c  | 14 ++
 drivers/tty/tty_port.c| 13 +
 net/irda/ircomm/ircomm_tty.c  | 31 +--
 9 files changed, 11 insertions(+), 142 deletions(-)

diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index 7680d52..45df4bf 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -2507,15 +2507,6 @@ static int mgslpc_open(struct tty_struct *tty, struct 
file * filp)
printk("%s(%d):mgslpc_open(%s), old ref count = %d\n",
 __FILE__, __LINE__, tty->driver->name, port->count);
 
-   /* If port is closing, signal caller to try again */
-   if (port->flags & ASYNC_CLOSING){
-   wait_event_interruptible_tty(tty, port->close_wait,
-!(port->flags & ASYNC_CLOSING));
-   retval = ((port->flags & ASYNC_HUP_NOTIFY) ?
-   -EAGAIN : -ERESTARTSYS);
-   goto cleanup;
-   }
-
port->low_latency = (port->flags & ASYNC_LOW_LATENCY) ? 1 : 0;
 
spin_lock_irqsave(>netlock, flags);
diff --git a/drivers/tty/cyclades.c b/drivers/tty/cyclades.c
index 87f6578..d4a1331 100644
--- a/drivers/tty/cyclades.c
+++ b/drivers/tty/cyclades.c
@@ -1577,15 +1577,6 @@ static int cy_open(struct tty_struct *tty, struct file 
*filp)
 #endif
 
/*
-* If the port is the middle of closing, bail out now
-*/
-   if (info->port.flags & ASYNC_CLOSING) {
-   wait_event_interruptible_tty(tty, info->port.close_wait,
-   !(info->port.flags & ASYNC_CLOSING));
-   return (info->port.flags & ASYNC_HUP_NOTIFY) ? -EAGAIN: 
-ERESTARTSYS;
-   }
-
-   /*
 * Start up serial port
 */
retval = cy_startup(info, tty);
diff --git a/drivers/tty/rocket.c b/drivers/tty/rocket.c
index c8dd8dc..69c72d1 100644
--- a/drivers/tty/rocket.c
+++ b/drivers/tty/rocket.c
@@ -895,14 +895,6 @@ static int rp_open(struct tty_struct *tty, struct file 
*filp)
if (!page)
return -ENOMEM;
 
-   if (port->flags & ASYNC_CLOSING) {
-   retval = wait_for_completion_interruptible(>close_wait);
-   free_page(page);
-   if (retval)
-   return retval;
-   return ((port->flags & ASYNC_HUP_NOTIFY) ? -EAGAIN : 
-ERESTARTSYS);
-   }
-
/*
 * We must not sleep from here until the port is marked fully in use.
 */
@@ -1511,10 +1503,6 @@ static void rp_hangup(struct tty_struct *tty)
 #endif
rp_flush_buffer(tty);
spin_lock_irqsave(>port.lock, flags);
-   if (info->port.flags & ASYNC_CLOSING) {
-   spin_unlock_irqrestore(>port.lock, flags);
-   return;
-   }
if (info->port.count)
atomic_dec(_num_ports_open);
clear_bit((info->aiop * 8) + info->chan, (void *) 
_flags[info->board]);
diff --git a/drivers/tty/serial/crisv10.c b/drivers/tty/serial/crisv10.c
index 3e4470a..06f4fe9 100644
--- a/drivers/tty/serial/crisv10.c
+++ b/drivers/tty/serial/crisv10.c
@@ -3759,23 +3759,6 @@ block_til_ready(struct tty_struct *tty, struct file * 
filp,
int do_clocal = 0;
 
/*
-* If the device is in the middle of b

[PATCH 6/7] tty: r3964: Replace/remove bogus tty lock use

2015-10-10 Thread Peter Hurley
The tty lock is strictly for serializing tty lifetime events
(open/close/hangup), and not for line discipline serialization.

The tty core already provides serialization of concurrent writes
to the same tty, and line discipline lifetime management (by ldisc
references), so pinning the tty via tty_lock() is unnecessary and
counter-productive; remove tty lock use.

However, the line discipline is responsible for serializing reads
(if required by the line discipline); add read_lock mutex to
serialize calls of r3964_read().

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/tty/n_r3964.c   | 20 +---
 include/linux/n_r3964.h |  5 +++--
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/tty/n_r3964.c b/drivers/tty/n_r3964.c
index 6fdef92..3451114 100644
--- a/drivers/tty/n_r3964.c
+++ b/drivers/tty/n_r3964.c
@@ -978,6 +978,7 @@ static int r3964_open(struct tty_struct *tty)
}
 
spin_lock_init(>lock);
+   mutex_init(>read_lock);
pInfo->tty = tty;
pInfo->priority = R3964_MASTER;
pInfo->rx_first = pInfo->rx_last = NULL;
@@ -1063,7 +1064,16 @@ static ssize_t r3964_read(struct tty_struct *tty, struct 
file *file,
 
TRACE_L("read()");
 
-   tty_lock(tty);
+   /*
+*  Internal serialization of reads.
+*/
+   if (file->f_flags & O_NONBLOCK) {
+   if (!mutex_trylock(>read_lock))
+   return -EAGAIN;
+   } else {
+   if (mutex_lock_interruptible(>read_lock))
+   return -ERESTARTSYS;
+   }
 
pClient = findClient(pInfo, task_pid(current));
if (pClient) {
@@ -1075,7 +1085,7 @@ static ssize_t r3964_read(struct tty_struct *tty, struct 
file *file,
goto unlock;
}
/* block until there is a message: */
-   wait_event_interruptible_tty(tty, tty->read_wait,
+   wait_event_interruptible(tty->read_wait,
(pMsg = remove_msg(pInfo, pClient)));
}
 
@@ -1105,7 +1115,7 @@ static ssize_t r3964_read(struct tty_struct *tty, struct 
file *file,
}
ret = -EPERM;
 unlock:
-   tty_unlock(tty);
+   mutex_unlock(>read_lock);
return ret;
 }
 
@@ -1154,8 +1164,6 @@ static ssize_t r3964_write(struct tty_struct *tty, struct 
file *file,
pHeader->locks = 0;
pHeader->owner = NULL;
 
-   tty_lock(tty);
-
pClient = findClient(pInfo, task_pid(current));
if (pClient) {
pHeader->owner = pClient;
@@ -1173,8 +1181,6 @@ static ssize_t r3964_write(struct tty_struct *tty, struct 
file *file,
add_tx_queue(pInfo, pHeader);
trigger_transmit(pInfo);
 
-   tty_unlock(tty);
-
return 0;
 }
 
diff --git a/include/linux/n_r3964.h b/include/linux/n_r3964.h
index e9adb42..90a803a 100644
--- a/include/linux/n_r3964.h
+++ b/include/linux/n_r3964.h
@@ -161,8 +161,9 @@ struct r3964_info {
unsigned char last_rx;
unsigned char bcc;
 unsigned int  blocks_in_rx_queue;
- 
-   
+
+   struct mutex read_lock; /* serialize r3964_read */
+
struct r3964_client_info *firstClient;
unsigned int state;
unsigned int flags;
-- 
2.6.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/7] tty close cleanup

2015-10-10 Thread Peter Hurley
Hi Greg,

This series is a redux of cleanup I originally submitted back in 2014;
the point is to remove the cruft which stems from dropping the tty lock
while waiting for output to drain before closing the tty.

Dropping the tty lock during close was added by "TTY: define
tty_wait_until_sent_from_close" to fix stalls on other parallel tty
operations while a tty was draining output while closing.

Since commit 89c8d91e31f2 ("tty: localise the lock") and commit
aa3cb814a8ef ("tty: Drop tty_mutex before tty reopen"), parallel tty
open/close/hangup on _other_ ttys no longer stall waiting for a tty
close to complete.

Continuing to hold the tty lock for the tty which is closing significantly
simplifies the state handling when opening a tty, since the tty cannot
have been closed concurrently. [Ideally, I would have liked to entirely
remove the TTY_CLOSING state flag, but unfortunately some older bit-rotting
drivers have co-opted it for dubious purposes].

In the previous series, David Laight had raised concerns about non-blocking
opens on the _same_ tty which is closing. However, as I pointed out in this
reply http://www.spinics.net/lists/linux-serial/msg14216.html, the outcome of
a parallel open while closing the same tty has not changed with this series,
since the existing code has blocked while ASYNC_CLOSING (and has done since at
least before 2.6.29).

I cc'd the commenters from the original series, Felipe Balbi for the gserial
usb gadget changes, and David Miller for the ISDN/IRDA changes.

Regards,

Peter Hurley (7):
  tty: Remove tty_wait_until_sent_from_close()
  tty: Remove ASYNC_CLOSING checks in open()/hangup() methods
  usb: gadget: gserial: Privatize close_wait
  tty: Remove tty_port::close_wait
  tty: r3964: Use tty->read_wait waitqueue
  tty: r3964: Replace/remove bogus tty lock use
  tty: Remove wait_event_interruptible_tty()

 drivers/char/pcmcia/synclink_cs.c  |  9 ---
 drivers/isdn/i4l/isdn_tty.c|  2 +-
 drivers/tty/cyclades.c |  9 ---
 drivers/tty/hvc/hvc_console.c  |  2 +-
 drivers/tty/hvc/hvcs.c |  2 +-
 drivers/tty/n_r3964.c  | 28 -
 drivers/tty/rocket.c   | 13 --
 drivers/tty/serial/68328serial.c   |  1 -
 drivers/tty/serial/crisv10.c   | 34 +
 drivers/tty/serial/serial_core.c   |  1 -
 drivers/tty/synclink.c | 18 +++---
 drivers/tty/synclink_gt.c  | 14 ++-
 drivers/tty/synclinkmp.c   | 14 ++-
 drivers/tty/tty_port.c | 26 +++-
 drivers/usb/gadget/function/u_serial.c |  6 +++--
 include/linux/n_r3964.h|  8 +++---
 include/linux/tty.h| 45 --
 net/irda/ircomm/ircomm_tty.c   | 31 +--
 18 files changed, 39 insertions(+), 224 deletions(-)

-- 
2.6.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/7] usb: gadget: gserial: Privatize close_wait

2015-10-10 Thread Peter Hurley
close_wait is no longer needed or provided by the tty core.
Move close_wait to struct gs_port.

Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
---
 drivers/usb/gadget/function/u_serial.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/gadget/function/u_serial.c 
b/drivers/usb/gadget/function/u_serial.c
index 7ee0579..42894f5 100644
--- a/drivers/usb/gadget/function/u_serial.c
+++ b/drivers/usb/gadget/function/u_serial.c
@@ -114,6 +114,7 @@ struct gs_port {
struct gs_buf   port_write_buf;
wait_queue_head_t   drain_wait; /* wait while writes drain */
boolwrite_busy;
+   wait_queue_head_t   close_wait;
 
/* REVISIT this state ... */
struct usb_cdc_line_coding port_line_coding;/* 8-N-1 etc */
@@ -884,7 +885,7 @@ static void gs_close(struct tty_struct *tty, struct file 
*file)
pr_debug("gs_close: ttyGS%d (%p,%p) done!\n",
port->port_num, tty, file);
 
-   wake_up(>port.close_wait);
+   wake_up(>close_wait);
 exit:
spin_unlock_irq(>port_lock);
 }
@@ -1044,6 +1045,7 @@ gs_port_alloc(unsigned port_num, struct 
usb_cdc_line_coding *coding)
tty_port_init(>port);
spin_lock_init(>port_lock);
init_waitqueue_head(>drain_wait);
+   init_waitqueue_head(>close_wait);
 
tasklet_init(>push, gs_rx_push, (unsigned long) port);
 
@@ -1074,7 +1076,7 @@ static void gserial_free_port(struct gs_port *port)
 {
tasklet_kill(>push);
/* wait for old opens to finish */
-   wait_event(port->port.close_wait, gs_closed(port));
+   wait_event(port->close_wait, gs_closed(port));
WARN_ON(port->port_usb != NULL);
tty_port_destroy(>port);
kfree(port);
-- 
2.6.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] usb: gadget: gserial: Privatize close_wait

2015-10-10 Thread Peter Hurley
[ forgot to addr Felipe here, sorry ]

On 10/10/2015 04:00 PM, Peter Hurley wrote:
> close_wait is no longer needed or provided by the tty core.
> Move close_wait to struct gs_port.
> 
> Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
> ---
>  drivers/usb/gadget/function/u_serial.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/u_serial.c 
> b/drivers/usb/gadget/function/u_serial.c
> index 7ee0579..42894f5 100644
> --- a/drivers/usb/gadget/function/u_serial.c
> +++ b/drivers/usb/gadget/function/u_serial.c
> @@ -114,6 +114,7 @@ struct gs_port {
>   struct gs_buf   port_write_buf;
>   wait_queue_head_t   drain_wait; /* wait while writes drain */
>   boolwrite_busy;
> + wait_queue_head_t   close_wait;
>  
>   /* REVISIT this state ... */
>   struct usb_cdc_line_coding port_line_coding;/* 8-N-1 etc */
> @@ -884,7 +885,7 @@ static void gs_close(struct tty_struct *tty, struct file 
> *file)
>   pr_debug("gs_close: ttyGS%d (%p,%p) done!\n",
>   port->port_num, tty, file);
>  
> - wake_up(>port.close_wait);
> + wake_up(>close_wait);
>  exit:
>   spin_unlock_irq(>port_lock);
>  }
> @@ -1044,6 +1045,7 @@ gs_port_alloc(unsigned port_num, struct 
> usb_cdc_line_coding *coding)
>   tty_port_init(>port);
>   spin_lock_init(>port_lock);
>   init_waitqueue_head(>drain_wait);
> + init_waitqueue_head(>close_wait);
>  
>   tasklet_init(>push, gs_rx_push, (unsigned long) port);
>  
> @@ -1074,7 +1076,7 @@ static void gserial_free_port(struct gs_port *port)
>  {
>   tasklet_kill(>push);
>   /* wait for old opens to finish */
> - wait_event(port->port.close_wait, gs_closed(port));
> + wait_event(port->close_wait, gs_closed(port));
>   WARN_ON(port->port_usb != NULL);
>   tty_port_destroy(>port);
>   kfree(port);
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Documentation: improve line discipline method descriptions

2015-09-30 Thread Peter Hurley
On 09/29/2015 07:45 PM, Tilman Schmidt wrote:
> Mention that the ldisc open method must set tty->receive_room, and
> that many methods are optional. Add description of receive_buf2 method.

Thanks, Tilman!

Reviewed-by: Peter Hurley <pe...@hurleysoftware.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] isdn/gigaset: reset tty-receive_room when attaching ser_gigaset

2015-07-13 Thread Peter Hurley
On 07/13/2015 06:37 PM, Tilman Schmidt wrote:
 Commit 79901317ce80 (n_tty: Don't flush buffer when closing ldisc),
 first merged in kernel release 3.10, caused the following regression
 in the Gigaset M101 driver:
 
 Before that commit, when closing the N_TTY line discipline in
 preparation to switching to N_GIGASET_M101, receive_room would be
 reset to a non-zero value by the call to n_tty_flush_buffer() in
 n_tty's close method. With the removal of that call, receive_room
 might be left at zero, blocking data reception on the serial line.

That commit didn't cause the problem; it was a bug all along.

For example, if the tty had first been hooked up to some other line
discipline which consumed most of tty-receive_room, _then_
switched to N_GIGASET_M101 line discipline, the same problem would
have occurred.

Non-flow controlling line disciplines _must_ set tty-receive_room
on line discipline open because they are declaring that every
input they can accept that much data.

Regards,
Peter Hurley

 The present patch fixes that regression by setting receive_room
 to an appropriate value in the ldisc open method.
 
 Fixes: 79901317ce80 (n_tty: Don't flush buffer when closing ldisc)
 Signed-off-by: Tilman Schmidt til...@imap.cc
 ---
  drivers/isdn/gigaset/ser-gigaset.c |   11 ++-
  1 files changed, 10 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/isdn/gigaset/ser-gigaset.c 
 b/drivers/isdn/gigaset/ser-gigaset.c
 index 8c91fd5..3ac9c41 100644
 --- a/drivers/isdn/gigaset/ser-gigaset.c
 +++ b/drivers/isdn/gigaset/ser-gigaset.c
 @@ -524,9 +524,18 @@ gigaset_tty_open(struct tty_struct *tty)
   cs-hw.ser-tty = tty;
   atomic_set(cs-hw.ser-refcnt, 1);
   init_completion(cs-hw.ser-dead_cmp);
 -
   tty-disc_data = cs;
  
 + /* Set the amount of data we're willing to receive per call
 +  * from the hardware driver to half of the input buffer size
 +  * to leave some reserve.
 +  * Note: We don't do flow control towards the hardware driver.
 +  * If more data is received than will fit into the input buffer,
 +  * it will be dropped and an error will be logged. This should
 +  * never happen as the device is slow and the buffer size ample.
 +  */
 + tty-receive_room = RBUFSIZE/2;
 +
   /* OK.. Initialization of the datastructures and the HW is done.. Now
* startup system and notify the LL that we are ready to run
*/
 

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html