Re: Unstructured object format.

Alexey Goncharuk Mon, 20 Jul 2015 23:15:07 -0700

Currently, the index-enabled serialized object form has the following layout
(simplified):


[object fields][field1Offset,field1Length,
field2Offset,field2Length,...,fieldNOffset,fieldNLength]

where the field order is determined on the first serialization of the object
and stored in the metadata cache, which is available on all nodes. Thus, the
field lookup is performed as follows:

fieldName -> fieldIndex (metadata lookup, O(1)), fieldIndex -> fieldOffset in
the footer (O(1)), fieldOffset -> fieldValue (O(1)).
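
To make the lookup chain above concrete, here is a minimal Python sketch. It
is an illustration only, not the actual marshaller code: the METADATA dict
stands in for the metadata cache, the names serialize and read_field are
hypothetical, field values are assumed to be strings, the footer entries are
assumed to be two little-endian 32-bit integers each, and every object of a
type is assumed to carry the same set of fields.

```python
import struct

# Hypothetical stand-in for the metadata cache: type name -> {field name: index}.
METADATA = {}

def serialize(type_name, obj):
    """Write [object fields][offset/length footer] for a dict of string fields."""
    # The field order is fixed by the first object of this type that is serialized.
    order = METADATA.setdefault(
        type_name, {name: i for i, name in enumerate(obj)})
    body = bytearray()
    footer = [None] * len(order)
    for name, idx in order.items():
        value = obj[name].encode("utf-8")
        footer[idx] = (len(body), len(value))   # (fieldOffset, fieldLength)
        body += value
    for offset, length in footer:
        body += struct.pack("<II", offset, length)
    return bytes(body)

def read_field(type_name, data, field_name):
    """fieldName -> fieldIndex -> fieldOffset -> fieldValue, each step O(1)."""
    order = METADATA[type_name]
    idx = order[field_name]                      # metadata lookup
    footer_start = len(data) - 8 * len(order)    # footer has 8 bytes per field
    offset, length = struct.unpack_from("<II", data, footer_start + 8 * idx)
    return data[offset:offset + length].decode("utf-8")

data = serialize("Person", {"name": "Alice", "city": "Paris"})
print(read_field("Person", data, "city"))
```

Note that a second object of the same type is written in the stored order
regardless of the order in which its fields are supplied, which is what makes
the index-based footer lookup valid on every node.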

BTW, I am finalizing the branch with marshaller changes and will send this
for a preliminary review soon.

2015-07-16 0:55 GMT-07:00 Atri Sharma <atri.j...@gmail.com>:

> Keep in mind that JSONB's performance comes from the fact that it uses
> server encoding, has a binary representation, and can have GIN indexes on
> top of it. Not sure if Ignite's marshalling approach keeps those features as
> well.
>
> On Thu, Jul 16, 2015 at 1:20 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > HSTORE and JSONB appear to have a similar format in PostgreSQL (because
> > they were developed by the same people). They noted that they switched
> > away from key-length sorting because they sometimes need lexicographical
> > order, but this is irrelevant for us.
> >
> > Sergi
> >
> > 2015-07-16 10:43 GMT+03:00 Atri Sharma <atri.j...@gmail.com>:
> >
> > > Are you referring to JSONB here?
> > >
> > > > On Thu, Jul 16, 2015 at 1:10 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> > > > wrote:
> > >
> > > > > Guys, especially Alexey G.
> > > >
> > > > > I've attended a PostgreSQL conference and there was a talk about an
> > > > > unstructured data format.
> > > > > They had an interesting idea of a serialized layout close enough to
> > > > > ours. I'm not sure how much it differs from our approach or whether
> > > > > we can use some ideas from it, but anyway it looks really promising
> > > > > to me and I want to share it.
> > > >
> > > > The structure basically is the following:
> > > >
> > > > [key headers] [keys] [values]
> > > >
> > > > > Key headers are [key offset, key length], so they are of a fixed
> > > > > length.
> > > >
> > > > > The cool idea here is that the keys, and respectively the key
> > > > > headers, are sorted by (key length, key), so that you can do a lookup
> > > > > by first quickly picking the keys of the needed length without
> > > > > looking at the keys at all, and then picking the exact key. Both
> > > > > searches can be done with a fast scan if there is a small number of
> > > > > keys, and with binary search for a larger number of keys.
> > > >
> > > > > Alexey G., could you please compare this to the new marshalling
> > > > > approach you are about to merge?
> > > > > BTW, it would be nice if you described it in detail here.
> > > >
> > > > Sergi
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > Atri
> > > *l'apprenant*
> > >
> >
>
>
>
> --
> Regards,
>
> Atri
> *l'apprenant*
>
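
For comparison, the [key headers] [keys] [values] layout quoted above, with
headers sorted by (key length, key), could be sketched roughly as follows.
This is only a guess at the idea as described in the thread, not PostgreSQL's
actual HSTORE/JSONB format: the names build and lookup are hypothetical,
headers and values are kept as Python lists rather than packed bytes, and
only the binary-search variant is shown.

```python
def build(pairs):
    """Lay out a dict as ([key offset, key length] headers, key bytes, values)."""
    items = sorted(
        ((k.encode("utf-8"), v) for k, v in pairs.items()),
        key=lambda kv: (len(kv[0]), kv[0]))      # sort by (key length, key)
    headers, keys, values = [], bytearray(), []
    for key, value in items:
        headers.append((len(keys), len(key)))    # fixed-size header per key
        keys += key
        values.append(value)
    return headers, bytes(keys), values

def lookup(headers, keys, values, name):
    """Binary search that compares lengths first and key bytes only on a tie."""
    target = name.encode("utf-8")
    lo, hi = 0, len(headers)
    while lo < hi:
        mid = (lo + hi) // 2
        offset, length = headers[mid]
        if length != len(target):
            less = length < len(target)          # decided without reading the key
        else:
            less = keys[offset:offset + length] < target
        if less:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(headers):
        offset, length = headers[lo]
        if length == len(target) and keys[offset:offset + length] == target:
            return values[lo]
    return None

headers, keys, values = build({"id": "1", "name": "Bob", "city": "Oslo"})
print(lookup(headers, keys, values, "city"))
```

The difference from the footer-based layout is that this format is
self-describing (the keys travel with the object), at the cost of an
O(log N) search instead of an O(1) index lookup through shared metadata.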
