On Fri, Apr 27, 2012 at 9:08 AM, Emmanuel Lécharny <[email protected]> wrote: > Le 4/27/12 5:46 PM, Selcuk AYA a écrit : > >> On Fri, Apr 27, 2012 at 6:06 AM, Emmanuel Lécharny<[email protected]> >> wrote: >>> >>> Le 4/27/12 2:09 PM, Selcuk AYA a écrit : >>> >>>> Hi, >>>> On Thu, Apr 26, 2012 at 4:52 PM, Emmanuel Lécharny<[email protected]> >>>> wrote: >>>>> >>>>> Hi guys, >>>>> >>>>> currently, we can replace an existing value when its associated key >>>>> exists >>>>> in the BTree. What I'd like to do is to replace both key and value in >>>>> this >>>>> case. >>>>> >>>>> The rational is that the key mau contain some extra information that >>>>> are >>>>> not >>>>> used to find the key, like the nbChildren/nbDescendant for a >>>>> ParentIdAndrdn >>>>> key in the RdnIndex. >>>>> >>>>> Is there an easy way to do that with JDBM, or should we extend the API >>>>> it >>>>> offers ? >>>> >>>> as I understand you are trying to store some auxiliary information per >>>> key? >>> >>> Yep. >>> >>> >>>> With JDBM we can: >>>> - provide<K, aux> tuple as the key and pass the key comparator to >>>> be the comparator for the actual keys. >>>> - change JDBM code to update key inside when replacing an entry. >>>> >>>> however, in general, I do not think we can expect a key/value store to >>>> update the key when replacing an entry. So I am not sure it is a good >>>> idea to do this kind of thing. >>> >>> >>> The pb is that those auxiliary informations are not part of the compare >>> being done. Ie, two keys may be seen as equals, but the auxiliary >>> information might be different. In this very case, I'd like the new >>> version >>> of the key replaced. >>> It may be very specific to my need, and we can also consider that the >>> value >>> might cary those extra informations, too, instead of storing them in the >>> key. >>> >>> So for the<ParentIdAndRdn, ID> tuple, where the data structure is : >>> ParentIdAndRdn : >>> { >>> ID parentId, >>> Rdn[] rdns, >>> int nbChildren, >>> int nbDescendant >>> } >>> >>> we could inject the two ints after having read the value if this value >>> structure was : >>> >>> AugmentedID >>> { >>> ID id >>> int nbChildren, >>> int nbDescendant >>> } >> >> I think this would be better. However, note that in txns branch, we >> removed most of the generics and assumed that indices store UUID as >> their value. I hope we go with removal/add ( I guess this is what you >> are doing now) > > Should not be a big problem here. Generics are an issue anyway... > OK, on txn branch the assumption that we store UUID as the value of an index has to be removed then.
>> >>> and serializing the ParentIdAndRdn ID and Rdn[] only. >> >> I am kind of confused, I have a couple of questions: >> >> 1) Can you explain why we need to store nbChildren and nbDescedant but >> do not need to serialize it? > > We *do* serialize those elements. What I suggest is that we store those > information in the values, not in the keys, as in any case, we will get a > tuple <key, value>, so we will always have the extended informations. > > >> What if the page storing an entry is >> kicked out of cache and we loose nbChildren and nbDescendant? > > Then we will read them from disk, if the cache does not contain those > information, right ? > > >> >> 2)Connected with the above question, how do we maintain these >> auxiliary data? Are there maintained only in memory or on disk as >> well? Do they have to absolutely correct to get consistent data out of >> the server? > > On memory and on disk. They are absolutely critical as they are used to > browe the tree while doing a One/Subtree scope search. > > >> >> 3) Lets say I have<rdn1, rdn2,rdn3> and<rdn1,rdn2, rdn4> . Thread 1 >> adds<rdn1, rdn2,rdn3, rdn5> and thread 2 add<rdn1,rdn2, rdn4,rdn6> >> concurrently. >> These two threads run concurrently but they should not >> logically conflict. Now they have to read nbDescendants of<rdn1,rdn2> >> and increment it but they have to do it in such a way that the net >> effect of these two threads is serialized. > > yep. > > >> I am not sure how to do >> this. > > do you mean that the indexes aren't gloablly protected when we modify some > data ? I thought that a modification will protect all the index it will > update until they have all been updated, then the next modification can > operate. No. > > We don't really care if that serialize the modifications, because the server > does not have to be fast when we inject data, only when we read data. At > least, we could think about a serialization over the operations on one index > like the RdnIndex (as an entry modification may update more than one entry > in this index). > > Is that a problem ? Depends. What I have been describing in the emails and trying to implement is an optimistic locking scheme where modifications can go in parallel unless they conflict. It seems we could just get a big lock for write txns rather than dealing with optimistic concurrency control. > > > -- > Regards, > Cordialement, > Emmanuel Lécharny > www.iktek.com > thanks Selcuk
