On 20 January 2015 at 13:32, Dan Berindei <dan.berin...@gmail.com> wrote:
> Adrian, I don't think that will work. The Hash doesn't know the number of
> segments, so it can't tell where a particular key will land - even assuming
> knowledge about how the ConsistentHash will map hash codes to segments.
>
> However, I'm all for replacing the current Hash interface with another
> interface that maps keys directly to segments.
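For illustration, an interface that maps keys directly to segments might look
roughly like the sketch below; the name and signature are hypothetical, not an
existing Infinispan API.

   public interface KeyToSegmentMapper {
      // Unlike Hash, the segment count is part of the contract, so an
      // implementation could deliberately place selected keys into a chosen
      // segment instead of only producing an opaque hash code.
      int getSegment(Object key, int numSegments);
   }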
Right, I'll eventually need a different abstraction, or a change to the Hash
interface. However my need seems highly specialized; I'm not sure whether
there would be general interest in such a capability from other Hash
implementors?

>
> Cheers
> Dan

Never ever sign off when you have more interesting comments below - I only
saw them by chance ;)

> On Tue, Jan 20, 2015 at 4:08 AM, Adrian Nistor <anis...@redhat.com> wrote:
>>
>> Hi Sanne,
>>
>> An alternative approach would be to implement an
>> org.infinispan.commons.hash.Hash which delegates to the stock
>> implementation for all keys except those that need to be assigned to a
>> specific segment. It should return the desired segment for those.
>>
>> Adrian
>>
>> On 01/20/2015 02:48 AM, Sanne Grinovero wrote:
>> > Hi all,
>> >
>> > I'm playing with an idea for some internal components to be able to
>> > "tag" the key for an entry to be stored into Infinispan in a very
>> > specific segment of the CH.
>> >
>> > Conceptually the plan is easy to understand by looking at this patch:
>> >
>> > https://github.com/Sanne/infinispan/commit/45a3d9e62318d5f5f950a60b5bb174d23037335f
>> >
>> > Hacking the change into ReplicatedConsistentHash is quite barbaric,
>> > please bear with me as I couldn't figure out a better way to
>> > experiment with this. I'll probably want to extend this class, but
>> > then I'm not sure how to plug it in?
>
> You would need to create your own ConsistentHashFactory, possibly extending
> ReplicatedConsistentHashFactory. You can then plug the factory in with
>
> configurationBuilder.clustering().hash().consistentHashFactory(yourFactory)
>
> However, this isn't a really good idea, because then you need a different
> implementation for distributed mode, and then another implementation for
> topology-aware clusters (with rack/machine/site ids). And your users would
> also need to select the proper factory for each cache.

Right, this is the complexity I was facing. I'll stick to my hack solution
for our little POC... but ultimately I'll need to plug this in if none of
the solutions below works out.

>> > What would you all think of such a "tagging" mechanism?
>> >
>> > # Why I didn't use the KeyAffinityService
>> >  - I need to use my own keys, not the meaningless stuff produced by
>> >    the service
>> >  - the extensive usage of Random in there doesn't seem suited for a
>> >    performance critical path
>
> You can plug in your own KeyGenerator to generate keys, and maybe replace
> the Random with a static/thread-local counter.

Thanks for the tip on KeyGenerator, I'll investigate that :)
But I'll never add a static/thread-local.. I'd rather commit my ugly code
from the commit linked above.

>> > # Why I didn't use the Grouping API
>> >  - I need to pick the specific storage segment, not just co-locate
>> >    with a different key
>
> This is actually a drawback of the KeyAffinityService more than Grouping.
> With grouping, you can actually follow the KeyAffinityService strategy and
> generate random strings until you get one in the proper segment, and then
> tag all your keys with that exact string.

Interesting! A bit convoluted, but it could spare me from plugging in the
HashFactory.
BTW I really dislike this idea of the KeyAffinityService generating random
keys until one works out.. I guess it might not be too bad if you just want
to pick a node out of ten, but I'm working at segment granularity and with
bad luck it could take a long time. It would be nice to have a function
like this which returns in a deterministic amount of time - essentially an
inverse of the hash.
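To make that concern concrete, the retry-based approach Dan describes would
look roughly like the sketch below; the unbounded loop is exactly the
non-deterministic part. Illustrative only: it assumes grouping is enabled on
the cache, that ConsistentHash.getSegment(Object) maps a group string the
same way it maps keys, and the helper name is made up.

   // Uses java.util.Random and org.infinispan.distribution.ch.ConsistentHash.
   // Find a group string that hashes to the wanted segment, then tag all keys
   // of one index with it (e.g. via a Grouper returning this string), so they
   // all land in that segment.
   String findGroupForSegment(ConsistentHash ch, int targetSegment) {
      Random rnd = new Random();
      String candidate;
      do {
         candidate = "index-group-" + rnd.nextLong();
      } while (ch.getSegment(candidate) != targetSegment);
      return candidate;
   }

Assuming the hash spreads uniformly, the loop needs on average as many
attempts as there are segments, with no hard upper bound.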
>> > The general goal is to make it possible to "tag" all entries of an
>> > index, and have an independent index for each segment of the CH. The
>> > resulting effect would be that when a primary owner for any key K is
>> > making an update, and this triggers an index update, that update is
>> > A) going to happen on the same node -> no need to forward to a
>> >    "master indexing node"
>> > B) each such write on the index happens on the node which is the
>> >    primary owner for all the written entries of the index.
>> >
>> > There are two additional nice consequences:
>> >  - there would be no need to perform a reliable "master election":
>> >    the ownership singleton is already guaranteed by Infinispan's
>> >    essential logic, so it would reuse that
>> >  - the propagation of writes on the index from the primary owner
>> >    (which is the local node by definition) to backup owners could use
>> >    REPL_ASYNC for most practical use cases.
>> >
>> > So the net result is that the overhead for indexing is reduced to 0
>> > (ZERO) blocking RPCs if async replication is acceptable, or to only
>> > one blocking roundtrip if very strict consistency is required.
>
> Sounds very interesting, but I think there may be a problem with your
> strategy: Infinispan doesn't guarantee you that one of the nodes executing
> the CommitCommand is the primary owner at the time the CommitCommand is
> executed. You could have something like this:
>
> Cluster [A, B, C, D], key k, owners(k) = [A, B] (A is primary)
> C initiates a tx that executes put(k, v)
> Tx prepare succeeds on A and B
> A crashes, but the other nodes don't detect the crash yet
> Tx commit succeeds on B, who still thinks it is a backup owner
> B detects the crash, installs a new cluster view and consistent hash with
> owners(k) = [B]
>
>> > Thanks,
>> > Sanne

Index storage is generally used without transactions, but even if we had
transactions enabled, or the "vanilla" put operation suffered from a similar
timing issue (we determine that this node is the owner higher up in the
search stack, before the actual put reaches the Infinispan core API), it's
not a problem: we'd simply lose locality for that write. It would be
slightly less efficient, but it would still write the "right thing". The
intention is to maximise locality with the hints, but when locality fails on
a write I expect it to be handled like any other put operation in
Infinispan.. with a couple more RPCs, and with any race conditions around
topology changes handled at a lower level.

Which is exactly why I'm now working on top of container segments: they are
a stable building block and allow us not to worry about how the data is
actually distributed or how update commands are re-routed.

Thanks all for the suggestions!
Sanne

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev