+1 as well. I wonder if it makes sense to further enhance the header fields to have more metadata readily available (so that we can support more JSONB like operators in a much efficient manner).
On Wed, Jul 22, 2015 at 3:37 PM, Atri Sharma <atri.j...@gmail.com> wrote: > +1 from me too > > On Wed, Jul 22, 2015 at 3:34 PM, Sergi Vladykin <sergi.vlady...@gmail.com> > wrote: > >> Ok, looks good to me. >> >> Sergi >> >> 2015-07-22 10:46 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>: >> >> > On Tue, Jul 21, 2015 at 10:15 PM, Atri Sharma <atri.j...@gmail.com> >> wrote: >> > >> > > So does that mean that local hashmap is not controlled with all the >> heavy >> > > locks that are present around the cache? >> > > >> > >> > Yes Atri, you are right. The data is stored in local hash map to avoid >> > touching a distributed cache whenever serializing objects. >> > >> > >> > > On 22 Jul 2015 07:31, "Alexey Goncharuk" <alexey.goncha...@gmail.com> >> > > wrote: >> > > >> > > > Metadata cache access is backed by a local hash map, so the real >> cost >> > is >> > > a >> > > > String object hashcode which is cached in the String object and a >> > hashmap >> > > > lookup by an integer key. >> > > > >> > > > On the other hand, the marshaller is still pluggable and after the >> > ticket >> > > > is completed, it should be fairly easy to implement this approach >> and >> > > > compare performance. >> > > > >> > > > --Alexey >> > > > >> > > > 2015-07-21 10:01 GMT-07:00 Sergi Vladykin <sergi.vlady...@gmail.com >> >: >> > > > >> > > > > I think O(N) reasoning does not make a real sense here since N is >> > > always >> > > > > small, lets not fool ourselves. >> > > > > To my mind operation cost of cache access (with all busy >> locks...), >> > > > > hashCode/equals and stuff like that has much bigger impact here. >> > > > > Do we still have a pluggable marshaller? Can my approach be >> > implemented >> > > > > separately? >> > > > > >> > > > > Sergi >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > 2015-07-21 9:14 GMT+03:00 Alexey Goncharuk < >> > alexey.goncha...@gmail.com >> > > >: >> > > > > >> > > > > > Currently an index-enabled serialized object form has the >> following >> > > > > layout >> > > > > > (simplified): >> > > > > > >> > > > > > [object fields][field1Offset,field1Length, >> > > > > > field2Offset,field2Length,...,fieldNOffset,fieldNLength] >> > > > > > >> > > > > > where fields order is determined upon the first object >> > serialization >> > > > and >> > > > > > stored in metadata cache which is available on all nodes. Thus, >> the >> > > > field >> > > > > > lookup is performed as follows: >> > > > > > >> > > > > > fieldName -> fieldIndex (metadata lookup, O(1)), >> > > > fieldIndex->fieldOffset >> > > > > in >> > > > > > footer (O(1)), fieldOffset->fieldValue (O(1)). >> > > > > > >> > > > > > BTW, I am finalizing the branch with marshaller changes and will >> > send >> > > > > this >> > > > > > for a preliminary review soon. >> > > > > > >> > > > > > 2015-07-16 0:55 GMT-07:00 Atri Sharma <atri.j...@gmail.com>: >> > > > > > >> > > > > > > Keep in mind that JSONB's performance comes from the fact >> that it >> > > > uses >> > > > > > > server encoding, is binary represented and can have GIN >> indexes >> > on >> > > > top >> > > > > of >> > > > > > > it. Not sure if Ignite's marshalling approach keeps those >> > features >> > > as >> > > > > > well. >> > > > > > > >> > > > > > > On Thu, Jul 16, 2015 at 1:20 PM, Sergi Vladykin < >> > > > > > sergi.vlady...@gmail.com> >> > > > > > > wrote: >> > > > > > > >> > > > > > > > HSTORE and JSONB appeared to have similar format in >> Postgresql >> > > > > (because >> > > > > > > > they was developed by the same people). They noticed that >> they >> > > > > switched >> > > > > > > off >> > > > > > > > of using key length sorting because they sometimes need >> > > > > lexicographical >> > > > > > > > order but this is irrelevant for us. >> > > > > > > > >> > > > > > > > Sergi >> > > > > > > > >> > > > > > > > 2015-07-16 10:43 GMT+03:00 Atri Sharma <atri.j...@gmail.com >> >: >> > > > > > > > >> > > > > > > > > Are you referring to JSONB here? >> > > > > > > > > >> > > > > > > > > On Thu, Jul 16, 2015 at 1:10 PM, Sergi Vladykin < >> > > > > > > > sergi.vlady...@gmail.com> >> >> > > > > > > > > wrote: >> > > > > > > > > >> > > > > > > > > > Guys, specially Alexey G. >> > > > > > > > > > >> > > > > > > > > > I've attended PostgreSQL conference and there was a talk >> > > about >> > > > > > > > > unstructured >> > > > > > > > > > data format. >> > > > > > > > > > They had an interesting idea of serialized layout close >> > > enough >> > > > to >> > > > > > > ours, >> > > > > > > > > I'm >> > > > > > > > > > not sure how much this is different from our approach >> and >> > if >> > > we >> > > > > can >> > > > > > > use >> > > > > > > > > > some ideas from it but anywaus it looks really >> promising to >> > > me >> > > > > and >> > > > > > I >> > > > > > > > want >> > > > > > > > > > to share. >> > > > > > > > > > >> > > > > > > > > > The structure basically is the following: >> > > > > > > > > > >> > > > > > > > > > [key headers] [keys] [values] >> > > > > > > > > > >> > > > > > > > > > Key headers are [key offset, key length] so they are of >> a >> > > fixed >> > > > > > > length. >> > > > > > > > > > >> > > > > > > > > > The cool idea here is that keys and respectively the key >> > > > headers >> > > > > > > sorted >> > > > > > > > > by >> > > > > > > > > > (key length, key) so that you can do a lookup first by >> fast >> > > > > picking >> > > > > > > key >> > > > > > > > > of >> > > > > > > > > > the needed length without looking at keys at all and >> then >> > > pick >> > > > an >> > > > > > > exact >> > > > > > > > > > key. Both searches can be done with fast scan if there >> are >> > > > small >> > > > > > > number >> > > > > > > > > of >> > > > > > > > > > keys and binary search for a larger number of keys. >> > > > > > > > > > >> > > > > > > > > > Alexey G., could you please compare this to our new >> > > marshalling >> > > > > > > > approach >> > > > > > > > > > you are about to merge? >> > > > > > > > > > BTW, it would be nice if you will describe it in details >> > > here. >> > > > > > > > > > >> > > > > > > > > > Sergi >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > -- >> > > > > > > > > Regards, >> > > > > > > > > >> > > > > > > > > Atri >> > > > > > > > > *l'apprenant* >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > -- >> > > > > > > Regards, >> > > > > > > >> > > > > > > Atri >> > > > > > > *l'apprenant* >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> > > > > -- > Regards, > > Atri > *l'apprenant* > -- Regards, Atri *l'apprenant*