Re: Partition and Backend confusion

Alex Karasulu Fri, 08 Jul 2011 02:40:20 -0700

FYI read, will read again. Under heavy load but I am not giving up on
this thread ... it's a good one :-). Just wanted to give you a heads
up. More to come ...


Thanks for understanding,
Alex

On Fri, Jul 8, 2011 at 9:23 AM, Emmanuel Lécharny <[email protected]> wrote:
> On 7/8/11 12:10 AM, Alex Karasulu wrote:
>>
>> On Wed, Jun 29, 2011 at 3:10 PM, Emmanuel Lecharny<[email protected]>
>>  wrote:
>>
>> SNIP ...
>>
>>> We currently have a common Partition interface, which is the base on
>>> which
>>> all the backend implementations are built. It's also used as an interface
>>> for the Nexus.
>>
>> Yes.
>>
>>> In fact, we can split the Partition implementations in two categories :
>>> 1) those which are manipulating an operation context (AddContext,
>>> DeleteOperationContext, etc)
>>> 2) those which are interacting with the underlying store
>>
>> This does not make any sense to me at all. I can't see these as being
>> two distinct categories. I must not be understanding you, can you
>> elaborate?
>
> Sure. What I'm saying is that we have one layer which takes methods with
> OperationContext parameters, and transforms them to what is expected by the
> underlying layer. To me, those two layers are two different things.
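
The two layers described above can be sketched roughly as follows. This is a minimal illustration only: AddContext, Store, InMemoryStore and ContextLayer here are hypothetical, heavily simplified stand-ins, not the real ApacheDS classes, and entries are reduced to plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified types for illustration; not the ApacheDS API.
class LayerDemo {
    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        ContextLayer partition = new ContextLayer(store);
        partition.add(new AddContext("ou=users,dc=example,dc=com",
                                     "objectClass: organizationalUnit"));
        System.out.println(store.lookup("ou=users,dc=example,dc=com"));
    }
}

/** Upper-layer argument: wraps the request the way the caller sees it. */
final class AddContext {
    final String dn;
    final String entry;

    AddContext(String dn, String entry) {
        this.dn = dn;
        this.entry = entry;
    }
}

/** Lower layer: knows nothing about operation contexts. */
interface Store {
    void put(String dn, String entry);

    String lookup(String dn);
}

final class InMemoryStore implements Store {
    private final Map<String, String> entries = new LinkedHashMap<>();

    public void put(String dn, String entry) {
        entries.put(dn, entry);
    }

    public String lookup(String dn) {
        return entries.get(dn);
    }
}

/** Upper layer: unwraps the context and translates it into store calls. */
final class ContextLayer {
    private final Store store;

    ContextLayer(Store store) {
        this.store = store;
    }

    void add(AddContext ctx) {
        store.put(ctx.dn, ctx.entry);
    }
}
```

The point of the split is that InMemoryStore could be swapped for any other Store implementation without ContextLayer changing.
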
>>>
>>> The current hierarchy is (<XXX>  : interface, [YYY] : abstract class) :
>>> <Partition>
>>>  [AbstractPartition]
>>>    [BTreePartition<ID>]
>>>      [AbstractLdifPartition]
>>>        LdifPartition
>>>        ReadOnlyConfigurationPartition
>>>        SingleFileLdifPartition
>>>      [AbstractXdbmPartition<ID>]
>>>        AvlPartition
>>>        JdbmPartition
>>>    DefaultPartitionNexus (also implements <PartitionNexus>)
>>>    NullPartition
>>>    SchemaPartition
>>>
>>> A few remarks :
>>> - the BTreePartition<ID>  should be renamed AbstractBTreePartition
>>> - we should have a BTreePartition interface
>>
>> Why?
>
> All the abstract classes we have in ADS are prefixed by Abstract. For
> consistency reasons, I do think that we should rename BTreePartition to
> AbstractBTreePartition.
>
> Also as it exposes methods which are specific to BTrees, an interface would
> be a good way to isolate the BTree behaviors.
>
> Nothing big here, just clarification.
>>>
>>> I'm also wondering if we should not make a better distinction between
>>> what
>>> is backed by a store (ie, BTreePartition and SchemaPartition) and what is
>>> not (ie PartitionNexus). Moreover, why should the PartitionNexus extend
>>> the
>>> Partition interface ? Does it make sense?
>>
>> The PartitionNexus is a proxy to partitions so it implements the
>> interface. It's a single point to apply operations and have the route
>> to the appropriate partition.
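
A rough sketch of the proxy idea: the nexus implements the same interface as the partitions it fronts and forwards each call to the partition whose suffix matches the DN. SimplePartition, MapPartition and SimpleNexus are hypothetical names, and the string-based suffix match is an assumption made for brevity; real DN comparison would work on normalized names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified types for illustration; not the ApacheDS API.
class NexusDemo {
    public static void main(String[] args) {
        MapPartition system = new MapPartition("ou=system");
        MapPartition example = new MapPartition("dc=example,dc=com");
        SimpleNexus nexus = new SimpleNexus();
        nexus.mount(system);
        nexus.mount(example);

        nexus.add("uid=admin,ou=system", "admin entry");
        System.out.println(system.lookup("uid=admin,ou=system"));
    }
}

interface SimplePartition {
    String suffix();

    void add(String dn, String entry);

    String lookup(String dn);
}

/** A partition that actually holds entries. */
final class MapPartition implements SimplePartition {
    private final String suffix;
    private final Map<String, String> entries = new HashMap<>();

    MapPartition(String suffix) {
        this.suffix = suffix;
    }

    public String suffix() {
        return suffix;
    }

    public void add(String dn, String entry) {
        entries.put(dn, entry);
    }

    public String lookup(String dn) {
        return entries.get(dn);
    }
}

/** Holds no entries itself: routes every call to a mounted partition. */
final class SimpleNexus implements SimplePartition {
    private final List<SimplePartition> mounted = new ArrayList<>();

    void mount(SimplePartition partition) {
        mounted.add(partition);
    }

    public String suffix() {
        return "";
    }

    public void add(String dn, String entry) {
        route(dn).add(dn, entry);
    }

    public String lookup(String dn) {
        return route(dn).lookup(dn);
    }

    /** Picks the mounted partition with the longest matching suffix. */
    private SimplePartition route(String dn) {
        SimplePartition best = null;
        for (SimplePartition p : mounted) {
            boolean matches = dn.equals(p.suffix()) || dn.endsWith("," + p.suffix());
            if (matches && (best == null || p.suffix().length() > best.suffix().length())) {
                best = p;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No partition for " + dn);
        }
        return best;
    }
}
```

Callers only ever hold a SimplePartition reference, so they cannot tell whether they are talking to the nexus or to a concrete partition.
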
>
> Makes sense.
>>
>> There's work to be done in this area for sure. First off I'd like to
>> see partitions that hash entries across other partitions and some that
>> contain entries and still can nest other partitions: acting both as
>> entry stores and routers of operations. For example I've wanted a root
>> partition that could also mount (nest) other partitions while still
>> storing entries so the root DSE can be mastered in it and we can
>> manage other subentries for the server in it instead of at the
>> namingContext level.
>
> With you.
>>
>> Incidentally the store interface might be able to be gotten rid of.
>
> Hmmm, can you elaborate ?
>>
>> The key to several things we're going to do down the line around
>> partitions rests around having entry ID be globally unique rather than
>> unique within just the partition. After this is done it opens the door
>> to several solutions ... including solutions to a couple recent
>> problems:
>>
>>   (1) aliases referring to entry targets across partitions
>>   (2) moddn operations across partitions
>>   (3) virtualization, via views, and other constructs need it
>
> Just wondering how badly we need to get rid of those IDs. They are not
> unique, each partition has its own, but AFAICT, if we move one entry from
> a partition to another one (moddn), we don't care too much about the ID.
>
> Regarding Aliases, I'm not sure (yet) we have to deal with them at this
> layer. Still have to think about it.
>
> Moddn ops can be handled across partitions even if we keep the ID around.
> One partition does not have to know anything about the other partition's ID.
> We are just moving full entries (and all the associated indexes) from one
> partition to the other, as if it was a delete on one side, and an add on the
> other.
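
The "delete on one side, add on the other" idea could look roughly like this. LocalIdPartition and CrossPartitionMove are hypothetical names, entries are plain strings, and there is no index handling or transaction here; the sketch only shows that each partition can keep its own local ID counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified types for illustration; not the ApacheDS API.
class MoveDemo {
    public static void main(String[] args) {
        LocalIdPartition source = new LocalIdPartition();
        LocalIdPartition target = new LocalIdPartition();
        source.add("cn=a,ou=people", "entry a");
        source.add("cn=b,ou=people", "entry b");

        long newId = CrossPartitionMove.move(source, "cn=b,ou=people",
                                             target, "cn=b,ou=archive");
        System.out.println("ID assigned by the target partition: " + newId);
    }
}

/** A partition whose entry IDs are only unique within itself. */
final class LocalIdPartition {
    private long nextId = 1; // partition-local counter
    private final Map<String, Long> dnToId = new HashMap<>();
    private final Map<Long, String> master = new HashMap<>();

    long add(String dn, String entry) {
        if (dnToId.containsKey(dn)) {
            throw new IllegalStateException("Entry already exists: " + dn);
        }
        long id = nextId++;
        dnToId.put(dn, id);
        master.put(id, entry);
        return id;
    }

    String delete(String dn) {
        Long id = dnToId.remove(dn);
        if (id == null) {
            throw new IllegalArgumentException("No such entry: " + dn);
        }
        return master.remove(id);
    }

    String lookup(String dn) {
        Long id = dnToId.get(dn);
        return id == null ? null : master.get(id);
    }
}

final class CrossPartitionMove {
    /** Moves an entry; the source ID is discarded, the target assigns its own. */
    static long move(LocalIdPartition source, String oldDn,
                     LocalIdPartition target, String newDn) {
        String entry = source.lookup(oldDn);
        if (entry == null) {
            throw new IllegalArgumentException("No such entry: " + oldDn);
        }
        // Add first, so a failed add leaves the source untouched.
        long newId = target.add(newDn, entry);
        source.delete(oldDn);
        return newId;
    }
}
```

Adding before deleting is a choice made for this sketch; the two steps are not atomic, so a real implementation would need to handle a failure between them.
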
>
> Virtualization is most certainly handled at an upper layer, and probably
> should not have to know anything about the storage.
>>
>> ....
>>
>> There's more. But first we need a globally unique UUID for entries as
>> the PK and we need to get rid of using long partition specific entry
>> IDs as the PK.
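
For comparison, a minimal sketch of what a globally unique primary key buys: the entry keeps the same ID when it changes partition, so a reference held elsewhere (an alias target, for instance) stays valid. UuidPartition is a hypothetical name and the use of java.util.UUID.randomUUID() is an assumption; the thread does not say how the IDs would be generated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical, simplified types for illustration; not the ApacheDS API.
class UuidDemo {
    public static void main(String[] args) {
        UuidPartition a = new UuidPartition();
        UuidPartition b = new UuidPartition();

        UUID id = a.add("entry a");
        // Move the entry: the same key is reused in the other partition.
        b.addWithId(id, a.remove(id));
        System.out.println(b.lookup(id));
    }
}

/** A partition whose master table is keyed by a globally unique ID. */
final class UuidPartition {
    private final Map<UUID, String> master = new HashMap<>();

    /** Stores a new entry under a freshly generated random UUID. */
    UUID add(String entry) {
        UUID id = UUID.randomUUID();
        master.put(id, entry);
        return id;
    }

    /** Stores an entry under an ID that was assigned elsewhere. */
    void addWithId(UUID id, String entry) {
        master.put(id, entry);
    }

    String remove(UUID id) {
        return master.remove(id);
    }

    String lookup(UUID id) {
        return master.get(id);
    }
}
```
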
>
> Ok. I'm not sure we need to get rid of IDs right now, but I may be missing
> some elements in the big picture atm. It needs some serious consideration
> anyway. This is not something we should do lightly, and certainly not for
> 2.0. However, if we need to make this move, and we may well have to, then
> we need a stabilized base to work on.
>>
>> I would not change around interfaces right now. It's just going to shift
>> things without a clear direction and as you said yourself you're new
>> to this code. Class renames and a few interface changes just to get
>> familiar and comfortable with the code base is not going to help down
>> the line.
>
> I don't think we need to change a hell of a lot of things ATM either. Far too
> dangerous, and probably overkill. As you said, this is a part of the code
> I don't know well, and I'm just pushing some ideas around to see where it's
> bringing me. I already paid the price once by killing one week on a reverse
> table removal for nothing, I certainly would like to avoid such waste of
> time again.
>>
>> Let's go global on the UUID and look at the big partition picture. We
>> can redesign things to best suit small steps to get to our ultimate
>> destination.
>
> Sure. Right now, I'm pushing ideas. I don't want them to be pushed into the
> server, it's way too far fetched, and I may miss the target at large. In any
> case, I don't want to jeopardize 2.0, when what we need to make it solid is
> just a couple of features (namely, replication and DSR).
>
> Atm, I'm just trying to get aliases working smoothly, but if it requires some
> huge refactoring, then I'll drop it for 2.0. We don't need aliases for
> 2.0, we just need replication.
>
> If we have to heavily refactor the backend to get aliases working fine, then
> I'm fine for a 2.1 or a 3.0. In any case, no urgency.
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>
