On 4/27/2015 9:22 PM, Or Gerlitz wrote:
On Mon, Apr 27, 2015 at 10:32 AM, Matan Barak <[email protected]> wrote:
On 4/26/2015 8:20 PM, Or Gerlitz wrote:
On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur
<[email protected]> wrote:
From: Matan Barak <[email protected]>
In order to manage multiple types, vlans and MACs per GID, we
need to store them along the GID itself. We store the net device
as well, as sometimes GIDs should be handled according to the
net device they came from. Since populating the GID table should
be identical for every RoCE provider, the GIDs table should be
handled in ib_core.
Add a GID cache table that supports lockless find, add and
delete of GIDs. The lockless nature comes from using a unique
sequence number per table entry and detecting, while reading or
writing, that this sequence hasn't changed.
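The per-entry sequence scheme described above can be sketched in userspace C. This is an illustration only, not the patch's code: the struct and function names are hypothetical, and C11 atomics stand in for the barriers the kernel versions would use (torn reads of the non-atomic payload are detected by the sequence check and discarded).

```c
#include <stdatomic.h>
#include <string.h>

/* Hypothetical GID table entry guarded by a per-entry sequence counter:
 * even = stable, odd = write in progress. */
struct gid_entry {
	atomic_uint seq;
	unsigned char gid[16];
};

/* Writer: bump to odd, update the payload, bump back to even. */
static void gid_entry_write(struct gid_entry *e, const unsigned char *gid)
{
	atomic_fetch_add_explicit(&e->seq, 1, memory_order_release); /* now odd */
	memcpy(e->gid, gid, 16);
	atomic_fetch_add_explicit(&e->seq, 1, memory_order_release); /* even again */
}

/* Reader: retry until the sequence was even and unchanged across the copy. */
static void gid_entry_read(struct gid_entry *e, unsigned char *out)
{
	unsigned int start;

	for (;;) {
		start = atomic_load_explicit(&e->seq, memory_order_acquire);
		if (start & 1)
			continue;	/* write in progress, retry */
		memcpy(out, e->gid, 16);
		if (atomic_load_explicit(&e->seq, memory_order_acquire) == start)
			return;		/* stable snapshot */
	}
}
```

Readers never block writers and take no lock; they only pay a retry when a write races with the copy.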
Matan, please use existing mechanism which fits the problem you are
trying to solve, I guess one of RCU or seqlock should do the job.
seqcount fits this problem better. If a write and a read are done in
parallel, there's a good chance we read an out-of-date entry and use a
GID that's going to change at T+epsilon anyway, so RCU doesn't really
have an advantage here.
So going back to the problem... we are talking about applications/drivers
that attempt to establish new connections (the reads) and about writes
done on behalf of IP stack changes; both are very much not critical path.
This is similar to the neighbour table maintained by the ND
subsystem, which is used by all IP-based networking applications, and
that code uses RCU. I don't see what's wrong with RCU for our
smaller-scale subsystem, or even with the simple rwlock
used today by the IB core GID cache. This is getting too complex,
for no reason that I can think of.
I think the real question is why deal with RCU, which requires
re-allocation of entries when that's not necessary, or why use
rwlock when the kernel provides a mechanism (seqcount) that
fits this problem better.
I disagree about seqcount being complex - if you look at its API you'll
find it's a lot simpler than RCU's.
The current implementation is a bit more efficient than seqcount, as it
allows early termination of a read racing with a write (the writer puts a
known "currently updating" value in the sequence, which the reader knows
to bail out on). AFAIK, this doesn't exist in the current seqcount
implementation. However, since this isn't a crucial data path, I'll
change the code to use seqcount.
seqcount is preferred over seqlock, as I don't need the spinlock in seqlock.
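The "currently updating" early-termination variant mentioned above could look roughly like the following. This is a hedged sketch, not the patch's actual code: the sentinel value, struct and function names are all hypothetical, and C11 atomics approximate the kernel barriers.

```c
#include <stdatomic.h>
#include <string.h>

/* Hypothetical sentinel the writer publishes while an update is in flight. */
#define GID_SEQ_UPDATING 0xffffffffu

struct gid_slot {
	atomic_uint seq;
	unsigned char gid[16];
};

/* Writer: publish the sentinel, update, then publish a new sequence value
 * (a real implementation would take care never to wrap into the sentinel). */
static void gid_slot_write(struct gid_slot *s, const unsigned char *gid)
{
	unsigned int old = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, GID_SEQ_UPDATING, memory_order_release);
	memcpy(s->gid, gid, 16);
	atomic_store_explicit(&s->seq, old + 1, memory_order_release);
}

/* Reader: instead of spinning like seqcount does, fail fast when a writer
 * is active or raced with the copy.  Returns 0 on success, -1 otherwise. */
static int gid_slot_try_read(struct gid_slot *s, unsigned char *out)
{
	unsigned int start = atomic_load_explicit(&s->seq, memory_order_acquire);

	if (start == GID_SEQ_UPDATING)
		return -1;	/* early termination: write in progress */
	memcpy(out, s->gid, 16);
	if (atomic_load_explicit(&s->seq, memory_order_acquire) != start)
		return -1;	/* raced with a writer */
	return 0;
}
```

The trade-off matches the thread: the sentinel lets a reader skip a known-stale entry immediately, whereas plain seqcount readers loop until the writer finishes; for a non-critical path the simpler, standard seqcount API wins.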