At 05:10 PM 3/20/2006, Fabian Tillier wrote:
On 3/20/06, Talpey, Thomas <[EMAIL PROTECTED]> wrote:
> Ok, this is a longer answer.
>
> At 06:08 PM 3/20/2006, Fabian Tillier wrote:
> >As to using FMRs to create virtually contiguous regions, the last data
> >I saw about this related to SRP (not on OpenIB), and resulted in a
> >gain of ~25% in throughput when using FMRs vs the "full frontal" DMA
> >MR.  So there is definitely something to be gained by creating
> >virtually contiguous regions, especially if you're doing a lot of RDMA
> >reads for which there's a fairly low limit to how many can be in
> >flight (4 comes to mind).
>
> 25% throughput over what workload? And I assume, this was with the
> "lazy deregistration" method implemented with the current fmr pool?
> What was your analysis of the reason for the improvement - if it was
> merely reducing the op count on the wire, I think your issue lies elsewhere.

This was a large block "read" workload (since HDDs typically give
better read performance than write).  It was with lazy deregistration,
and the analysis was that the reduction of the op count on the wire
was the reason.  It may well have to do with how the target chose to
respond, though, and I have no idea how that side of things was
implemented.  It could well be that performance could be improved
without going with FMRs.
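
As a rough illustration of the op-count argument only, here is a minimal sketch. It assumes a 4 KiB page size, a worst case in which every page of the buffer is physically discontiguous and therefore needs its own RDMA operation, and the limit of 4 RDMA Reads in flight mentioned above. The function names are hypothetical and do not come from any SRP or verbs API.

```python
import math

PAGE_SIZE = 4096          # assumed page size in bytes
MAX_READS_IN_FLIGHT = 4   # the "4 comes to mind" figure from the thread


def rdma_ops(io_bytes, contiguous):
    """Number of RDMA operations needed to move one I/O.

    Assumes the worst case for the non-FMR path: every page is physically
    discontiguous, so each page needs its own descriptor and its own op.
    """
    if contiguous:
        return 1
    return math.ceil(io_bytes / PAGE_SIZE)


def read_waves(ops, in_flight=MAX_READS_IN_FLIGHT):
    """Number of back-to-back batches when only `in_flight` reads may be outstanding."""
    return math.ceil(ops / in_flight)


io = 1024 * 1024  # a 1 MiB "large block" transfer
for label, contiguous in (("per-page", False), ("virtually contiguous", True)):
    ops = rdma_ops(io, contiguous)
    print(f"{label}: {ops} op(s), {read_waves(ops)} wave(s) of RDMA Reads")
```

Under these assumptions a 1 MiB transfer needs 256 operations per page-sized segment versus 1 for a virtually contiguous region. The wave count only matters for RDMA Reads; it says nothing about RDMA Writes, target behaviour, or registration cost.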

Quite often performance is governed more by the target than by the initiator, since the target is in turn governed by its local cache and by the performance and capacity of its disk mechanisms.  Large data movements typically involve a low op count from the initiator's perspective, so it seems a bit odd to state that performance can be dramatically affected by the op count on the wire.


> Also, see previous paragraph - if your SRP is fast but not safe, then only
> fast but not safe applications will want to use it. Fibre channel adapters
> do not introduce this vulnerability, but they go fast. I can show you NFS
> running this fast too, by the way.

Why can't Fibre Channel adapters, or any locally attached hardware for
that matter, DMA anywhere in memory?  Unless the chipset somehow
protects against it, doesn't locally attached hardware have free rein
over DMA?

As a general practice, future volume I/O chipsets across multiple market segments will implement an IOMMU to restrict where DMA is allowed.  Both AMD and Intel have recently announced specifications to this effect, reflecting what has already been implemented in many non-x86 chipset offerings.  Whether a given OS always requires this protection to be enabled is implementation-specific, but it is something that many within the industry and customer base require.

Mike


Also, please don't take my anecdotal benchmark results as an
endorsement of the Mellanox FMR design - the data was presented to me
by Mellanox as a reason to add FMR support to the Windows stack (which
currently uses the "full frontal" approach due to limitations of the
verbs API and how it needs to be used for storage).  I never had a
chance to look into why the gains were so large, and it could be
either the SRP target implementation, a hardware limitation, or a
number of other issues, especially since a read workload results in
RDMA Writes from the target to the host which can be pipelined much
deeper than RDMA Reads.

- Fab
_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general