On 07.10.2013 19:09, John Baldwin wrote:
On Sunday, October 06, 2013 3:30:42 am Alexander Motin wrote:
On 02.10.2013 20:30, John Baldwin wrote:
On Saturday, September 07, 2013 2:32:45 am Alexander Motin wrote:
On 07.09.2013 02:02, Jeremie Le Hen wrote:
On Fri, Sep 06, 2013 at 11:29:11AM +0300, Alexander Motin wrote:
On 06.09.2013 11:06, Jeremie Le Hen wrote:
On Fri, Sep 06, 2013 at 12:46:27AM +0200, Olivier Cochard-Labbé wrote:
On Thu, Sep 5, 2013 at 11:38 PM, Alexander Motin <m...@freebsd.org>
wrote:
I've found and fixed possible double request completion, that could
cause
such symptoms if happened. Updated patch located as usual:

http://people.freebsd.org/~mav/camlock_patches/camlock_20130905.patch

With this new one I cannot boot any more (I also updated the source
tree).  This is a hand transcripted version:

Trying to mount root from zfs:zroot/root []...
panic: Batch flag already set
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper()
kdb_backtrace()
vpanic()
kassert_panic()
xpt_batch_start()
ata_interrupt()
softclock_call_cc()
softclock()
ithread_loop()
fork_exit()
fork_trampoline()

Thank you for the report. I see my fault. It is probably specific to
ata(4) driver only. I've workarounded that in new patch version, but
probably that area needs some rethinking.

http://people.freebsd.org/~mav/camlock_patches/camlock_20130906.patch

I'm not sure you needed a confirmation, but it boots.  Thanks :).

I didn't quite understand the thread; is direct dispatch enabled for
amd64?  ISTR you said only i386 but someone else posted the macro for
amd64.

Yes, it is enabled for amd64. I've said x86, meaning both i386 and amd64.

FYI, I tested mfi with this patch set and mfid worked fine for handling
g_up
directly:

Index: dev/mfi/mfi_disk.c
===================================================================
--- dev/mfi/mfi_disk.c  (revision 257407)
+++ dev/mfi/mfi_disk.c  (working copy)
@@ -162,6 +162,7 @@
          sc->ld_disk->d_unit = sc->ld_unit;
          sc->ld_disk->d_sectorsize = secsize;
          sc->ld_disk->d_mediasize = sectors * secsize;
+       sc->ld_disk->d_flags = DISKFLAG_DIRECT_COMPLETION;
          if (sc->ld_disk->d_mediasize >= (1 * 1024 * 1024)) {
                  sc->ld_disk->d_fwheads = 255;
                  sc->ld_disk->d_fwsectors = 63;


Thank you for the feedback. But looking on mfi driver sources I would
say that it calls biodone() from mfi_disk_complete() from cm_complete()
method, which is called while holding mfi_io_lock mutex. I guess that if
on top of mfi device would be some GEOM class, supporting direct
dispatch and sending new requests down on previous request completion
(or retrying requests), that could cause recursive mfi_io_lock
acquisition. That is exactly the cause why I've added this flag. May be
it is a bit paranoid, but it is better to be safe then sorry.

Another good reason to drop the lock before calling biodone() would be
reducing the lock hold time. Otherwise it may just increase lock
congestion there and destroy all benefits of the direct dispatch.

Ah, interesting.  What is your policy for such drivers?  Should they be
left using g_up, should they drop the lock around biodone when completeing
multiple requests in an interrupt?  Should they try to batch them by
waiting and doing biodone at the end after dropping the lock?

In CAM's da and ada drivers I've managed to drop the periph lock before calling biodone(). New CAM locking design allowed to do it safe. Possibility of dropping CAM SIM (HBA driver) locks on completion though depends on HBA design. For example, for ATA/SATA drivers I haven't found safe way to drop the lock in place, so I delay commands completions up to the moment when lock can be safely dropped and then process completions in a batch. For SCSI HBAs with more separated request and response queues I guess that may be easier. But now all SCSI drivers are still queuing requests completion to separate CAM threads to decouple the locks that way. I've just added more of them to spread the load between cores. But I hope that, for example, mps driver could be reworked to allow both using multiple MSI-X interrupt threads and lock-less request completion.

Whether it is possible to do it safe for mfi I don't know. For cases when dropping the locks is not possible, there is g_up thread to use. For setups with single controller in a system with single interrupt thread queuing to g_up thread may even be better, helping to split load between two CPUs instead of one. But that is in case if HBA is fast or ineffective enough to saturate one CPU.

--
Alexander Motin
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"

Reply via email to