Re: Unstructured object format.

Sergi Vladykin Tue, 21 Jul 2015 10:02:13 -0700

I think O(N) reasoning does not make much sense here since N is always
small, let's not fool ourselves.
To my mind the cost of a cache access (with all the busy locks...),
hashCode/equals and the like has a much bigger impact here.
Do we still have a pluggable marshaller? Can my approach be implemented
separately?


Sergi




2015-07-21 9:14 GMT+03:00 Alexey Goncharuk <[email protected]>:

> Currently an index-enabled serialized object form has the following layout
> (simplified):
>
> [object fields][field1Offset,field1Length,
> field2Offset,field2Length,...,fieldNOffset,fieldNLength]
>
> where the field order is determined upon the first object serialization and
> stored in the metadata cache, which is available on all nodes. Thus, the
> field lookup is performed as follows:
>
> fieldName -> fieldIndex (metadata lookup, O(1)), fieldIndex->fieldOffset in
> footer (O(1)), fieldOffset->fieldValue (O(1)).
>
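A minimal sketch of the footer-based lookup described above, under stated assumptions: the class and method names are hypothetical, a plain Map stands in for the metadata cache, field values are UTF-8 strings, and offsets and lengths are 4-byte ints. It is not the actual marshaller code, only an illustration of the [object fields][offset,length,...] idea.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical illustration only; not Ignite's marshaller.
class FooterLayoutSketch {
    static final int ENTRY_SIZE = 8; // [fieldOffset (int), fieldLength (int)]

    /**
     * Writes [object fields][field1Offset,field1Length,...,fieldNOffset,fieldNLength].
     * fieldIndex stands in for the metadata cache (field name -> index 0..N-1).
     */
    static byte[] serialize(Map<String, Integer> fieldIndex, Map<String, String> values) {
        int n = fieldIndex.size();
        byte[][] raw = new byte[n][];
        for (Map.Entry<String, Integer> e : fieldIndex.entrySet())
            raw[e.getValue()] = values.getOrDefault(e.getKey(), "").getBytes(StandardCharsets.UTF_8);

        int bodyLen = 0;
        for (byte[] r : raw)
            bodyLen += r.length;

        ByteBuffer buf = ByteBuffer.allocate(bodyLen + n * ENTRY_SIZE);
        int[] offsets = new int[n];
        for (int i = 0; i < n; i++) {
            offsets[i] = buf.position(); // where this field's bytes start
            buf.put(raw[i]);
        }
        for (int i = 0; i < n; i++) {    // footer: one fixed-size entry per field
            buf.putInt(offsets[i]);
            buf.putInt(raw[i].length);
        }
        return buf.array();
    }

    /** fieldName -> fieldIndex -> footer entry -> value, each step a constant-time access. */
    static String readField(Map<String, Integer> fieldIndex, byte[] data, String fieldName) {
        Integer idx = fieldIndex.get(fieldName);          // metadata lookup
        if (idx == null)
            return null;
        int footerStart = data.length - fieldIndex.size() * ENTRY_SIZE;
        ByteBuffer buf = ByteBuffer.wrap(data);
        int off = buf.getInt(footerStart + idx * ENTRY_SIZE);     // fieldOffset
        int len = buf.getInt(footerStart + idx * ENTRY_SIZE + 4); // fieldLength
        return new String(data, off, len, StandardCharsets.UTF_8);
    }
}
```

The footer entries are fixed-size, so the entry for field i sits at a computable position and no scan over the other fields is needed.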
> BTW, I am finalizing the branch with marshaller changes and will send this
> for a preliminary review soon.
>
> 2015-07-16 0:55 GMT-07:00 Atri Sharma <[email protected]>:
>
> > Keep in mind that JSONB's performance comes from the fact that it uses
> > server encoding, has a binary representation and can have GIN indexes on
> > top of it. Not sure if Ignite's marshalling approach keeps those features
> > as well.
> >
> > On Thu, Jul 16, 2015 at 1:20 PM, Sergi Vladykin <[email protected]>
> > wrote:
> >
> > > > HSTORE and JSONB turned out to have a similar format in PostgreSQL
> > > > (because they were developed by the same people). They noted that
> > > > they switched away from key-length sorting because they sometimes
> > > > need lexicographical order, but this is irrelevant for us.
> > >
> > > Sergi
> > >
> > > 2015-07-16 10:43 GMT+03:00 Atri Sharma <[email protected]>:
> > >
> > > > Are you referring to JSONB here?
> > > >
> > > > > On Thu, Jul 16, 2015 at 1:10 PM, Sergi Vladykin <[email protected]>
> > > > > wrote:
> > > >
> > > > > > Guys, especially Alexey G.
> > > > > >
> > > > > > I've attended a PostgreSQL conference and there was a talk about
> > > > > > an unstructured data format.
> > > > > > They had an interesting idea of a serialized layout close enough
> > > > > > to ours. I'm not sure how much it differs from our approach or
> > > > > > whether we can use some ideas from it, but anyway it looks really
> > > > > > promising to me and I want to share it.
> > > > >
> > > > > The structure basically is the following:
> > > > >
> > > > > [key headers] [keys] [values]
> > > > >
> > > > > > Key headers are [key offset, key length], so they are of a
> > > > > > fixed length.
> > > > >
> > > > > > The cool idea here is that the keys, and correspondingly the key
> > > > > > headers, are sorted by (key length, key), so that you can do a
> > > > > > lookup by first quickly picking the keys of the needed length
> > > > > > without looking at the keys at all, and then picking the exact
> > > > > > key. Both searches can be done with a fast scan if there is a
> > > > > > small number of keys, and with binary search for a larger number
> > > > > > of keys.
> > > > >
> > > > > > Alexey G., could you please compare this to the new marshalling
> > > > > > approach you are about to merge?
> > > > > > BTW, it would be nice if you described it in detail here.
> > > > >
> > > > > Sergi
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > >
> > > > Atri
> > > > *l'apprenant*
> > > >
> > >
> >
> >
> >
> > --
> > Regards,
> >
> > Atri
> > *l'apprenant*
> >
>
