Hello!

Of course, this setting will be configurable.

Regards,
-- 
Ilya Kasnacheev


On Wed, Sep 5, 2018 at 3:21 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:

> In my view, a dictionary of 1024 bytes is not going to be nearly enough.
>
> On Tue, Sep 4, 2018 at 8:06 AM, Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> wrote:
>
> > Hello!
> >
> > In the case of Apache Ignite, most of the savings come from the
> > BinaryObject format, which encodes types and fields as byte sequences.
> > Any enum/string flags will also end up in the dictionary. Then, as the
> > compressor processes a record, it fills up its own dictionary for that
> > record.
> >
> > But within one cache, most if not all entries have an identical
> > BinaryObject layout, so a tiny dictionary covers that case. Compression
> > algorithms are not very keen on large dictionaries, preferring to work
> > with local regularities in the byte stream.
> >
> > E.g. if we have large entries in a cache with low BinaryObject overhead,
> > they're served just fine by "generic" compression.
> >
> > All of the above is speculation on my part, actually. I just observe
> > that on a large data set, the compression ratio is around 0.4 (2.5x)
> > with a dictionary of 1024 bytes. The rest is a black box.
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > On Tue, Sep 4, 2018 at 5:16 PM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >
> > > On Tue, Sep 4, 2018 at 2:55 AM, Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> > > wrote:
> > >
> > > > Hello!
> > > >
> > > > Each node has a local dictionary (currently per node, per cache is
> > > > planned). The dictionary is never shared between nodes. Dictionary
> > > > rotation is also planned, to handle shifting data patterns.
> > > >
> > > > With Zstd, the best dictionary size seems to be 1024 bytes. I imagine
> > > > it is enough to store the common BinaryObject boilerplate, and
> > > > everything else is compressed on the fly. The source sample is 16k
> > > > records.
> > > >
> > > >
> > > Thanks, Ilya, understood. I think per-cache is a better idea. However,
> > > I have a question about the dictionary size. Ignite stores TBs of data.
> > > How do you plan to fit the dictionary in 1K bytes?
> > >
> > > D.
> > >
> >
>
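
For readers following the dictionary-size discussion above, here is a minimal sketch of why a small preset dictionary can help when every record repeats the same layout. It is an illustration under stated assumptions, not the Ignite implementation: it uses zlib's preset-dictionary support from the Python standard library rather than Zstd, the record layout is a hypothetical JSON stand-in for the BinaryObject format, and the "dictionary" is just one sample record rather than a trained one. The ratios it prints will not match the 0.4 figure quoted in the thread.

```python
import json
import zlib

# Hypothetical records sharing one layout, standing in for cache entries
# with an identical BinaryObject structure (this is not the real format).
def make_record(i):
    return json.dumps({
        "typeName": "org.example.Person",
        "status": "ACTIVE",
        "id": i,
        "name": "user%d" % i,
    }).encode("utf-8")

records = [make_record(i) for i in range(1000)]

# Assumption: the dictionary is simply one sample record, capped at 1024
# bytes. A real dictionary trainer would select substrings from many samples.
dictionary = make_record(0)[:1024]

def compress(data, zdict=None):
    # Each record is compressed independently, as a cache entry would be.
    if zdict is None:
        c = zlib.compressobj(level=6)
    else:
        c = zlib.compressobj(level=6, zdict=zdict)
    return c.compress(data) + c.flush()

def decompress(blob, zdict=None):
    # The same dictionary must be supplied to read the data back.
    if zdict is None:
        d = zlib.decompressobj()
    else:
        d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()

raw = sum(len(r) for r in records)
plain = sum(len(compress(r)) for r in records)
with_dict = sum(len(compress(r, dictionary)) for r in records)

print("ratio without dictionary: %.2f" % (plain / raw))
print("ratio with dictionary:    %.2f" % (with_dict / raw))
```

Because each record is short, the compressor has almost nothing to refer back to on its own; the preset dictionary supplies the shared field names and constant values up front, which is the "common boilerplate" effect described in the thread. Whether 1024 bytes is enough for real data remains an empirical question, as the thread itself notes.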
