Re: [PATCH 2/2] target: Fix target_wait_for_sess_cmds breakage with active signals

2018-10-11 Thread Peter Zijlstra
On Wed, Oct 10, 2018 at 10:40:12PM -0700, Nicholas A. Bellinger wrote:
> Hey Peter & Co,
> 
> On Wed, 2018-10-10 at 10:43 +0200, Peter Zijlstra wrote:
> > On Wed, Oct 10, 2018 at 03:23:10AM +, Nicholas A. Bellinger wrote:
> > > From: Nicholas Bellinger 
> > > 
> > > With the addition of commit 00d909a107 in v4.19-rc, it incorrectly 
> > > assumes no
> > > signals will be pending for task_struct executing the normal session 
> > > shutdown
> > > and I/O quiesce code-path.
> > > 
> > > For example, iscsi-target and iser-target issue SIGINT to all kthreads as
> > > part of session shutdown.  This has been the behaviour since day one.
> > 
> > Not knowing much context here; but does it make sense for those
> > kthreads to handle signals, ever? Most kthreads should be fine with
> > ignore_signals().
> > 
> 
> iscsi-target + ib-isert uses SIGINT amongst dedicated rx/tx connection
> kthreads to signal connection shutdown, requiring in-flight se_cmd I/O
> descriptors to be quiesced before making forward progress to release
> se_session.
> 
> By the point wait_event_lock_irq_timeout() is called in the example
> here, one of the two rx/tx connection kthreads has been stopped, and the
> other kthread is still processing shutdown.  So while historically the
> pending SIGINTs where not cleared (or ignored) during shutdown at this
> point, there is no reason why they could not be ignored for iscsi-target
> + ib-isert.
> 
> That said, pre commit 00d909a107 code always used wait_for_completion()
> and ignored pending signals.  As-is target_wait_for_sess_cmds() is
> called directly from fabric driver code and in one case also from
> user-space via configfs_write_file(), so AFAICT it does need
> TASK_UNINTERRUPTIBLE.
> 

Fair enough, thanks for the background. I'm always a bit wary when
kthreads need to deal with signals, but this seems OK.


Re: [PATCH 2/2] target: Fix target_wait_for_sess_cmds breakage with active signals

2018-10-10 Thread Peter Zijlstra
On Wed, Oct 10, 2018 at 03:23:10AM +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> With the addition of commit 00d909a107 in v4.19-rc, it incorrectly assumes no
> signals will be pending for task_struct executing the normal session shutdown
> and I/O quiesce code-path.
> 
> For example, iscsi-target and iser-target issue SIGINT to all kthreads as
> part of session shutdown.  This has been the behaviour since day one.

Not knowing much context here; but does it make sense for those
kthreads to handle signals, ever? Most kthreads should be fine with
ignore_signals().



Re: [PATCH 1/2] sched/wait: Add wait_event_lock_irq_timeout for TASK_UNINTERRUPTIBLE usage

2018-10-10 Thread Peter Zijlstra
On Wed, Oct 10, 2018 at 03:23:09AM +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> Short of reverting commit 00d909a107 for v4.19, target-core needs a
> wait_event_t marco can be executed using TASK_UNINTERRUPTIBLE to
> function correctly with existing fabric drivers that expect to run
> with signals pending during session shutdown and active se_cmd I/O
> quiesce.
> 
> The most notable is iscsi-target/iser-target, while ibmvscsi_tgt invokes
> session shutdown logic from userspace via configfs attribute that could
> also potentially have signals pending.
> 
> So go ahead and introduce wait_event_lock_irq_timeout() to achieve this,
> and update + rename __wait_event_lock_irq_timeout() to make it accept
> 'state' as a parameter.
> 
> Fixes: 00d909a107 ("scsi: target: Make the session shutdown code also wait 
> for commands that are being aborted")
> Cc: Bart Van Assche 
> Cc: Mike Christie 
> Cc: Hannes Reinecke 
> Cc: Christoph Hellwig 
> Cc: Sagi Grimberg 
> Cc: Bryant G. Ly 
> Cc: Peter Zijlstra (Intel) 
> Tested-by: Nicholas Bellinger 
> Signed-off-by: Nicholas Bellinger 

Acked-by: Peter Zijlstra (Intel) 


Re: [PATCH 00/18] prevent bounds-check bypass via speculative execution

2018-01-08 Thread Peter Zijlstra
On Mon, Jan 08, 2018 at 11:43:42AM +, Alan Cox wrote:
> On Mon, 8 Jan 2018 11:08:36 +0100
> Peter Zijlstra  wrote:
> 
> > On Fri, Jan 05, 2018 at 10:30:16PM -0800, Dan Williams wrote:
> > > On Fri, Jan 5, 2018 at 6:22 PM, Eric W. Biederman  
> > > wrote:  
> > > > In at least one place (mpls) you are patching a fast path.  Compile out
> > > > or don't load mpls by all means.  But it is not acceptable to change the
> > > > fast path without even considering performance.  
> > > 
> > > Performance matters greatly, but I need help to identify a workload
> > > that is representative for this fast path to see what, if any, impact
> > > is incurred. Even better is a review that says "nope, 'index' is not
> > > subject to arbitrary userspace control at this point, drop the patch."  
> > 
> > I think we're focussing a little too much on pure userspace. That is, we
> > should be saying under the attackers control. Inbound network packets
> > could equally be under the attackers control.
> 
> Inbound network packets don't come with a facility to read back and do
> cache timimg. 

But could they not be used in conjunction with a local task to prime the
stuff?


Re: [PATCH 00/18] prevent bounds-check bypass via speculative execution

2018-01-08 Thread Peter Zijlstra
On Fri, Jan 05, 2018 at 10:30:16PM -0800, Dan Williams wrote:
> On Fri, Jan 5, 2018 at 6:22 PM, Eric W. Biederman  
> wrote:
> > In at least one place (mpls) you are patching a fast path.  Compile out
> > or don't load mpls by all means.  But it is not acceptable to change the
> > fast path without even considering performance.
> 
> Performance matters greatly, but I need help to identify a workload
> that is representative for this fast path to see what, if any, impact
> is incurred. Even better is a review that says "nope, 'index' is not
> subject to arbitrary userspace control at this point, drop the patch."

I think we're focussing a little too much on pure userspace. That is, we
should be saying under the attackers control. Inbound network packets
could equally be under the attackers control.

Sure, userspace is the most direct and highest bandwidth one, but I
think we should treat all (kernel) external values with the same
paranoia.


Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

2017-08-30 Thread Peter Zijlstra
On Wed, Aug 30, 2017 at 10:42:07AM +0200, Peter Zijlstra wrote:
> 
> So the overhead looks to be spread out over all sorts, which makes it
> harder to find and fix.
> 
> stack unwinding is done lots and is fairly expensive, I've not yet
> checked if crossrelease does too much of that.

Aah, we do an unconditional stack unwind for every __lock_acquire() now.
It keeps a trace in the xhlocks[].

Does the below cure most of that overhead?

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 44c8d0d17170..7b872036b72e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4872,7 +4872,7 @@ static void add_xhlock(struct held_lock *hlock)
xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
xhlock->trace.entries = xhlock->trace_entries;
xhlock->trace.skip = 3;
-   save_stack_trace(&xhlock->trace);
+   /* save_stack_trace(&xhlock->trace); */
 }
 
 static inline int same_context_xhlock(struct hist_lock *xhlock)


Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

2017-08-30 Thread Peter Zijlstra
On Wed, Aug 30, 2017 at 03:15:11PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> On (08/30/17 14:43), Byungchul Park wrote:
> [..]
> > > notably slower than earlier 4.13 linux-next. (e.g. scrolling in vim
> > > is irritatingly slow)
> > 
> > To Ingo,
> > 
> > I cannot decide if we have to roll back CONFIG_LOCKDEP_CROSSRELEASE
> > dependency on CONFIG_PROVE_LOCKING in Kconfig. With them enabled,
> > lockdep detection becomes strong but has performance impact. But,
> > it's anyway a debug option so IMHO we don't have to take case of the
> > performance impact. Please let me know your decision.
> 
> well, I expected it :)
> 
> I've been running lockdep enabled kernels for years, and was OK with
> the performance. but now it's just too much and I'm looking at disabling
> lockdep.
> 
> a more relevant test -- compilation of a relatively small project
> 
>   LOCKDEP -CROSSRELEASE -COMPLETIONS LOCKDEP +CROSSRELEASE +COMPLETIONS
> 
>real1m23.722s  real2m9.969s
>user4m11.300s  user4m15.458s
>sys 0m49.386s  sys 2m3.594s
> 
> 
> you don't want to know how much time now it takes to recompile the
> kernel ;)

Right,.. so when I look at perf annotate for __lock_acquire and
lock_release (the two most expensive lockdep functions in a kernel
profile) I don't actually see much cross-release stuff.

So the overhead looks to be spread out over all sorts, which makes it
harder to find and fix.

stack unwinding is done lots and is fairly expensive, I've not yet
checked if crossrelease does too much of that.

The below saved about 50% of my __lock_acquire() time, not sure it made
a significant difference over all though.

---
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 44c8d0d17170..f8db1ead1c48 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3386,7 +3386,7 @@ static int __lock_acquire(struct lockdep_map *lock, 
unsigned int subclass,
if (!class)
return 0;
}
-   atomic_inc((atomic_t *)&class->ops);
+   /* atomic_inc((atomic_t *)&class->ops); */
if (very_verbose(class)) {
printk("\nacquire class [%p] %s", class->key, class->name);
if (class->name_version > 1)


Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

2017-08-23 Thread Peter Zijlstra
On Wed, Aug 23, 2017 at 09:03:04AM +0900, Byungchul Park wrote:
> On Tue, Aug 22, 2017 at 09:43:56PM +, Bart Van Assche wrote:

> The report is talking about the following lockup:
> 
> A work in a worker A task work on exit to user
> -- ---
> mutex_lock(&bdev->bd_mutex)
>mutext_lock(&bdev->bd_mutex)
> blk_execute_rq()
>wait_for_completion_io_timeout(&A)
>complete(&A)
> 
> Is this impossible?
> 
> To Peterz,
> 
> Anyway I wanted to avoid lockdep reports in the case using a timeout
> interface. Do you think it's still worth reporting the kind of lockup?

Yes, people might not have expected to hit the timeout on this. They
might think timeout means a dead device or something like that.

I'd like to heard from the block folks if this was constructed thus on
purpose though.


Re: [PATCH 0/5] v2: block subsystem refcounter conversions

2017-04-21 Thread Peter Zijlstra
On Fri, Apr 21, 2017 at 08:03:13AM -0600, Jens Axboe wrote:
> You have it so easy - the code is completely standalone, building a
> small test framework around it and measuring performance in _user space_
> is trivial.

Something like this you mean:

  
https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5...@hirez.programming.kicks-ass.net




Re: [PATCH 0/5] block subsystem refcounter conversions

2017-02-20 Thread Peter Zijlstra
On Mon, Feb 20, 2017 at 07:41:01AM -0800, James Bottomley wrote:
> On Mon, 2017-02-20 at 08:15 -0700, Jens Axboe wrote:
> > On 02/20/2017 04:16 AM, Elena Reshetova wrote:
> > > Now when new refcount_t type and API are finally merged
> > > (see include/linux/refcount.h), the following
> > > patches convert various refcounters in the block susystem from 
> > > atomic_t to refcount_t. By doing this we prevent intentional or 
> > > accidental underflows or overflows that can led to use-after-free
> > > vulnerabilities.
> 
> This description isn't right ... nothing is prevented; we get warnings
> on saturation and use after free with this.

The thing that is prevented is overflow and then a use-after-free by
making it a leak.

Modular stuff, you put and free at: (n+1) mod n, by saturating at n-1
we'll never get there.

So you loose use-after-free, you gain a resource leak. The general idea
being that use-after-free is a nice trampoline for exploits, leaks are
'only' a DoS.



Re: [BUG] hpsa: Controller lockup detected: 0x00150028

2015-05-22 Thread Peter Zijlstra
On Fri, May 22, 2015 at 05:10:44PM +0200, Tomas Henzl wrote:
> >> I've updated to 6.62 and it appears to be working now; or rather, it has

I've since gotten 6.64 from HP to test; which does not seem public yet.

6.64 actually fixes the issue for me.

> An older issue for mptsas seems to handle a similar case
> 2a1b7e575b [SCSI] mptsas: fix hangs caused by ATA pass-through
> that might be for hpsa -

> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -1067,6 +1067,8 @@ static int hpsa_slave_alloc(struct scsi_device *sdev)
> if (sd != NULL)
> sdev->hostdata = sd;
> spin_unlock_irqrestore(&h->devlock, flags);
> +
> +   blk_queue_dma_alignment (sdev->request_queue, 512 - 1);
> return 0;
>  }

That does indeed seem _very_ similar; I'll have to defer to Mark Oelke
and or Don Brace to say if the above is a useful alternative. Since they
seem to now know what was the root cause.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] hpsa: Controller lockup detected: 0x00150028

2015-05-18 Thread Peter Zijlstra
On Mon, May 18, 2015 at 06:03:45PM +0200, Peter Zijlstra wrote:
> On Mon, May 18, 2015 at 05:20:34PM +0200, Peter Zijlstra wrote:
> > On Mon, May 18, 2015 at 01:57:39PM +, Oelke, Mark wrote:
> > > The P212/P410/P411 firmware was recently spun to address an issue that 
> > > sounds exactly like this problem.
> > > Which version of controller firmware are you using?
> > 
> > Smart Array P212 in Slot 1
> > 
> >Hardware Revision: C
> >Firmware Version: 6.60
> 
> I've updated to 6.62 and it appears to be working now; or rather, it has
> not locked up yet where I think it would've locked up by now earlier.
> 
> I'll let it run for a few more hours before calling it fixed, I'll let
> you know.

And right after sending this email it went...

[ 1119.052144] hpsa :06:00.0: Controller lockup detected: 0x00150029

So sadly no dice.

Anything else I can do?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] kref: warn on uninitialized kref

2014-05-19 Thread Peter Zijlstra
On Sat, May 17, 2014 at 05:14:13PM -0400, Mikulas Patocka wrote:
> BTW. if we talk about performance - what about replacing:
> 
>   if (atomic_dec_and_test(&variable)) {
>   ... release(object);
>   }
> 
> with this:
> 
>   if (atomic_read(&variable) == 1 || atomic_dec_and_test(&variable)) {
>   barrier();
>   ... release(object);
>   }

That would completely wreck kref_get_unless_zero().


pgpVP1oQ2uYC6.pgp
Description: PGP signature