The current UHCI driver constructs the bulk transfer queue as a simple list with a 'terminate' marker at the end. This means that the bulk queue is run only once per frame. That is fine for devices with large input buffers, but for a large transfer to a device with a small input buffer it limits throughput to one buffer per frame time (nominally 1 ms). The hardware I am using has a 128-byte buffer, so I get only 128000 bytes/sec with the UHCI driver, compared with over 200000 bytes/sec with OHCI.
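The 128000 bytes/sec figure follows directly from one buffer per frame. A minimal sketch of that arithmetic, assuming exactly 1000 frames per second; `bulk_ceiling` is a hypothetical helper used only for illustration and is not part of the driver:

```c
/*
 * Upper bound on bulk throughput when the device can accept at most
 * one buffer per USB frame.  Hypothetical helper for illustration.
 */
static unsigned long
bulk_ceiling(unsigned long buf_bytes, unsigned long frames_per_sec)
{
	return (buf_bytes * frames_per_sec);
}
```

With a 128-byte buffer and 1 ms frames, `bulk_ceiling(128, 1000)` gives 128000, which matches what I measure with the unpatched driver.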
If the UHCI driver instead arranges the bulk transfer queue as a circular list, transfers are retried repeatedly in what would otherwise be wasted time at the end of the frame; this is similar to what OHCI does. In fact, in my application the patched UHCI driver comes out slightly ahead of OHCI (though this may be due to other factors such as CPU speed). The patch to do this appears to be very simple (this diff is against -stable because my -current machine is OHCI, but the code is identical in -current).
Index: uhci.c
===================================================================
RCS file: /repository/src/sys/dev/usb/uhci.c,v
retrieving revision 1.40.2.7
diff -c -r1.40.2.7 uhci.c
*** uhci.c 31 Oct 2000 23:23:29 -0000 1.40.2.7
--- uhci.c 15 Dec 2001 23:19:17 -0000
***************
*** 371,377 ****
  	bsqh = uhci_alloc_sqh(sc);
  	if (bsqh == NULL)
  		return (USBD_NOMEM);
! 	bsqh->qh.qh_hlink = LE(UHCI_PTR_T);	/* end of QH chain */
  	bsqh->qh.qh_elink = LE(UHCI_PTR_T);
  	sc->sc_bulk_start = sc->sc_bulk_end = bsqh;
--- 371,378 ----
  	bsqh = uhci_alloc_sqh(sc);
  	if (bsqh == NULL)
  		return (USBD_NOMEM);
! 	bsqh->hlink = bsqh;			/* Circular QH chain */
! 	bsqh->qh.qh_hlink = LE(bsqh->physaddr | UHCI_PTR_Q);
  	bsqh->qh.qh_elink = LE(UHCI_PTR_T);
  	sc->sc_bulk_start = sc->sc_bulk_end = bsqh;
***************
*** 890,896 ****
  	DPRINTFN(10, ("uhci_remove_bulk: sqh=%p\n", sqh));
  	for (pqh = sc->sc_bulk_start; pqh->hlink != sqh; pqh = pqh->hlink)
  #if defined(DIAGNOSTIC) || defined(UHCI_DEBUG)
! 		if (LE(pqh->qh.qh_hlink) & UHCI_PTR_T) {
  			printf("uhci_remove_bulk: QH not found\n");
  			return;
  		}
--- 891,897 ----
  	DPRINTFN(10, ("uhci_remove_bulk: sqh=%p\n", sqh));
  	for (pqh = sc->sc_bulk_start; pqh->hlink != sqh; pqh = pqh->hlink)
  #if defined(DIAGNOSTIC) || defined(UHCI_DEBUG)
! 		if (pqh == sc->sc_bulk_end) {
  			printf("uhci_remove_bulk: QH not found\n");
  			return;
  		}
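To show why the 'not found' test has to change, here is a simplified userland model of the software side of the list. This is only a sketch: `struct sqh`, `struct bulkq` and the `bulkq_*` functions are hypothetical stand-ins, the hardware link words (`qh_hlink`) are left out, and I am assuming that insertion copies the old tail's link into the new QH, so that the tail always points back to the dummy head.

```c
#include <stddef.h>

/* Simplified stand-in for the driver's soft QH (software link only). */
struct sqh {
	struct sqh *hlink;
};

/* Stand-in for the sc_bulk_start / sc_bulk_end pair in the softc. */
struct bulkq {
	struct sqh *start;	/* dummy head, never removed */
	struct sqh *end;	/* tail; its hlink points back to start */
};

/* The dummy QH links to itself, so the chain is circular from the start. */
static void
bulkq_init(struct bulkq *q, struct sqh *dummy)
{
	dummy->hlink = dummy;
	q->start = q->end = dummy;
}

/* Append after the tail; the new QH inherits the link back to start. */
static void
bulkq_add(struct bulkq *q, struct sqh *sqh)
{
	struct sqh *eqh = q->end;

	sqh->hlink = eqh->hlink;
	eqh->hlink = sqh;
	q->end = sqh;
}

/*
 * Unlink sqh.  There is no terminator in a circular list, so the walk
 * stops when it reaches the tail without having found a predecessor.
 * Returns 0 on success, -1 if sqh is not on the list.
 */
static int
bulkq_remove(struct bulkq *q, struct sqh *sqh)
{
	struct sqh *pqh;

	for (pqh = q->start; pqh->hlink != sqh; pqh = pqh->hlink)
		if (pqh == q->end)
			return (-1);
	pqh->hlink = sqh->hlink;
	if (q->end == sqh)
		q->end = pqh;
	return (0);
}
```

In this model, testing for a terminate bit would never fire, and a search for a QH that is not queued would loop forever; comparing against the tail pointer bounds the walk to one trip around the list.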