Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-04 Thread Sowmini Varadhan
One last question before I spin out v9.. the dma_mask code is a bit confusing to me, so I want to make sure... the code is if (limit + tbl->it_offset > mask) { limit = mask - tbl->it_offset + 1; /* If we're constrained on address range, first try * at
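
To make the check being asked about easier to follow, here is a minimal standalone sketch of the clamp, not the kernel source. The helper name `clamp_limit_to_mask` and the plain `unsigned long` parameters are assumptions made for illustration; `mask` is assumed to already be expressed in table-entry units and to be at least `it_offset`.

```c
/*
 * Hypothetical helper, not a kernel API: clamp an exclusive allocation
 * limit so that the highest table entry, once it_offset is added, does
 * not exceed mask. Assumes mask >= it_offset (no underflow handling).
 */
static unsigned long clamp_limit_to_mask(unsigned long limit,
                                         unsigned long it_offset,
                                         unsigned long mask)
{
    /* Entries at or above (mask - it_offset + 1) would map past the mask. */
    if (limit + it_offset > mask)
        limit = mask - it_offset + 1;
    return limit;
}
```

With `limit = 100`, `it_offset = 10` and `mask = 50`, the limit is pulled down to 41, so the highest usable entry (40) plus the offset lands exactly on the mask.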

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-04 Thread Benjamin Herrenschmidt
On Sat, 2015-04-04 at 07:27 -0400, Sowmini Varadhan wrote: One last question before I spin out v9.. the dma_mask code is a bit confusing to me, so I want to make sure... the code is if (limit + tbl->it_offset > mask) { limit = mask - tbl->it_offset + 1; /* If

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-03 Thread Sowmini Varadhan
Just want to confirm: + again: + if (pass == 0 && handle && *handle && + (*handle >= pool->start) && (*handle < pool->end)) + start = *handle; + else + start = pool->hint; Now this means handle might be < pool->hint, in that case you also need a lazy flush. Or
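
The start-selection logic in the quoted hunk can be sketched as a small standalone function. This is an illustration only: it assumes the operators stripped by the archive are `&&`, `>=`, `<` and `->`, and the names `struct pool_sketch`, `pick_start` and `needs_lazy_flush` are hypothetical, not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the pool fields used in the quoted hunk. */
struct pool_sketch {
    unsigned long start; /* first entry owned by the pool */
    unsigned long end;   /* one past the last entry owned by the pool */
    unsigned long hint;  /* where the next search normally begins */
};

/*
 * On the first pass, honor a caller-supplied handle if it is non-NULL,
 * non-zero and falls inside [start, end); otherwise start at the hint.
 */
static unsigned long pick_start(const struct pool_sketch *pool,
                                const unsigned long *handle, int pass)
{
    if (pass == 0 && handle && *handle &&
        *handle >= pool->start && *handle < pool->end)
        return *handle;
    return pool->hint;
}

/*
 * Assumed reading of the review comment: an allocation that begins
 * below the current hint may reuse entries not yet flushed.
 */
static int needs_lazy_flush(unsigned long n, const struct pool_sketch *pool)
{
    return n < pool->hint;
}
```

Keeping the flush test separate from the start selection mirrors the point raised in the review: a valid handle can legitimately sit below the hint, so the flush decision has to look at where the block actually landed.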

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-03 Thread Sowmini Varadhan
On (04/04/15 08:06), Benjamin Herrenschmidt wrote: No, I meant n < pool->hint, ie, the start of the newly allocated block. ah, got it. I'll do my drill with patchset and get back, probably by Monday. --Sowmini

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-03 Thread Benjamin Herrenschmidt
On Fri, 2015-04-03 at 14:28 -0400, Sowmini Varadhan wrote: Just want to confirm: + again: + if (pass == 0 && handle && *handle && + (*handle >= pool->start) && (*handle < pool->end)) + start = *handle; + else + start = pool->hint; Now this means handle might be

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Benjamin Herrenschmidt
On Thu, 2015-04-02 at 17:54 -0400, Sowmini Varadhan wrote: the other question that comes to my mind is: the whole lazy_flush optimization probably works best when there is exactly one pool, and no large pools. In most other cases, we'd end up doing a lazy_flush when we wrap within our pool

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Sowmini Varadhan
On (04/03/15 08:57), Benjamin Herrenschmidt wrote: I only just noticed too, you completely dropped the code to honor the dma mask. Why that ? Some devices rely on this. /* Sowmini's comment about this coming from sparc origins.. */ Probably, not that many devices have limits on DMA

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Benjamin Herrenschmidt
On Tue, 2015-03-31 at 10:40 -0400, Sowmini Varadhan wrote: + if (largealloc) { + pool = &(iommu->large_pool); + spin_lock_irqsave(&pool->lock, flags); + pool_nr = 0; /* to keep compiler happy */ + } else { + /* pick out pool_nr */ +

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Benjamin Herrenschmidt
On Thu, 2015-04-02 at 18:15 -0400, Sowmini Varadhan wrote: On (04/03/15 08:57), Benjamin Herrenschmidt wrote: I only just noticed too, you completely dropped the code to honor the dma mask. Why that ? Some devices rely on this. /* Sowmini's comment about this coming from sparc

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Benjamin Herrenschmidt
On Fri, 2015-04-03 at 09:01 +1100, Benjamin Herrenschmidt wrote: On Fri, 2015-04-03 at 08:57 +1100, Benjamin Herrenschmidt wrote: No, the last argument to iommu_area_alloc() which is passed from the callers when doing consistent allocs. Basically, the DMA api mandates that consistent

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Benjamin Herrenschmidt
On Fri, 2015-04-03 at 08:57 +1100, Benjamin Herrenschmidt wrote: No, the last argument to iommu_area_alloc() which is passed from the callers when doing consistent allocs. Basically, the DMA api mandates that consistent allocs are naturally aligned (to their own size), we implement that on
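
As a rough illustration of what "naturally aligned (to their own size)" means for a power-of-two allocation, the sketch below rounds a candidate start up to a multiple of the allocation size. The helper names are hypothetical and this is not how the patch or `iommu_area_alloc()` implements it; it only shows the arithmetic, assuming `size` is a non-zero power of two.

```c
/*
 * Hypothetical helpers, for illustration only.
 * Both assume size is a non-zero power of two.
 */

/* Round n up to the next multiple of size. */
static unsigned long align_up_pow2(unsigned long n, unsigned long size)
{
    return (n + size - 1) & ~(size - 1);
}

/* A block is naturally aligned when its start is a multiple of its size. */
static int is_naturally_aligned(unsigned long start, unsigned long size)
{
    return (start & (size - 1)) == 0;
}
```

For example, a 4-entry allocation whose search begins at entry 5 would be placed at entry 8 under this rule.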

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Sowmini Varadhan
On (04/03/15 07:54), Benjamin Herrenschmidt wrote: + limit = pool->end; + + /* The case below can happen if we have a small segment appended + * to a large, or when the previous alloc was at the very end of + * the available space. If so, go back to the beginning and flush. +

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Benjamin Herrenschmidt
On Thu, 2015-04-02 at 17:43 -0400, Sowmini Varadhan wrote: On (04/03/15 07:54), Benjamin Herrenschmidt wrote: + limit = pool->end; + + /* The case below can happen if we have a small segment appended + * to a large, or when the previous alloc was at the very end of + * the

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-04-02 Thread Sowmini Varadhan
the other question that comes to my mind is: the whole lazy_flush optimization probably works best when there is exactly one pool, and no large pools. In most other cases, we'd end up doing a lazy_flush when we wrap within our pool itself, losing the benefit of that optimization. Given that the

RE: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-03-31 Thread David Laight
From: Sowmini Varadhan Investigation of multithreaded iperf experiments on an ethernet interface show the iommu->lock as the hottest lock identified by lockstat, with something of the order of 21M contentions out of 27M acquisitions, and an average wait time of 26 us for the lock. This is not

Re: [PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer graularity pools and lock

2015-03-31 Thread Sowmini Varadhan
On (03/31/15 15:15), David Laight wrote: I've wondered whether the iommu setup for ethernet receive (in particular) could be made much more efficient if there were a function that would unmap one buffer and map a second buffer? My thought is that iommu pte entry used by the old buffer could