Hello! Of course, this setting will be configurable.
Regards,
--
Ilya Kasnacheev


On Wed, Sep 5, 2018 at 3:21, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:

> In my view, a dictionary of 1024 bytes is not going to be nearly enough.
>
> On Tue, Sep 4, 2018 at 8:06 AM, Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> wrote:
>
> > Hello!
> >
> > In the case of Apache Ignite, most of the savings are due to the
> > BinaryObject format, which encodes types and fields with byte sequences.
> > Any enum/string flags will also get into the dictionary. And then, as it
> > processes a record, it fills up its individual dictionary.
> >
> > But, in one cache, most if not all entries have an identical BinaryObject
> > layout, so a tiny dictionary covers that case. Compression algorithms are
> > not very keen on large dictionaries, preferring to work with local
> > regularities in the byte stream.
> >
> > E.g. if we have large entries in a cache with low BinaryObject overhead,
> > they're served just fine by "generic" compression.
> >
> > All of the above is speculation on my part, actually. I just observe that
> > on a large data set, the compression ratio is around 0.4 (2.5x) with a
> > dictionary of 1024 bytes. The rest is a black box.
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > On Tue, Sep 4, 2018 at 17:16, Dmitriy Setrakyan <dsetrak...@apache.org>
> > wrote:
> >
> > > On Tue, Sep 4, 2018 at 2:55 AM, Ilya Kasnacheev
> > > <ilya.kasnach...@gmail.com> wrote:
> > >
> > > > Hello!
> > > >
> > > > Each node has a local dictionary (per node currently, per cache
> > > > planned). The dictionary is never shared between nodes. As data
> > > > patterns shift, dictionary rotation is also planned.
> > > >
> > > > With Zstd, the best dictionary size seems to be 1024 bytes. I imagine
> > > > it is enough to store common BinaryObject boilerplate, and everything
> > > > else is compressed on the fly. The source sample is 16k records.
> > > >
> > >
> > > Thanks, Ilya, understood. I think per-cache is a better idea. However, I
> > > have a question about dictionary size. Ignite stores TBs of data. How do
> > > you plan the dictionary to fit in 1K bytes?
> > >
> > > D.
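
For readers following the dictionary-size question, here is a minimal sketch of why a small dictionary can go a long way when every entry shares the same layout. It is not the Ignite implementation: it uses zlib's preset dictionary from the Python standard library as a stand-in for Zstd, the record fields are hypothetical, and the dictionary is built by naively concatenating a few sample records and capping the result at 1024 bytes rather than by real dictionary training.

```python
import json
import zlib
from typing import Optional

DICT_SIZE = 1024  # cap taken from the thread; everything else is assumed


def make_record(i: int) -> bytes:
    # Hypothetical entry: identical field names in every record, loosely
    # mimicking entries of one cache that share a BinaryObject layout.
    fields = {
        "typeName": "org.example.Person",
        "firstName": f"name{i}",
        "lastName": f"surname{i}",
        "status": "ACTIVE",
        "balance": (i * 17) % 1000,
    }
    return json.dumps(fields).encode("utf-8")


def build_dictionary(samples: list, size: int = DICT_SIZE) -> bytes:
    # Naive stand-in for training: keep the last `size` bytes of the
    # concatenated samples.
    return b"".join(samples)[-size:]


def compress(record: bytes, zdict: Optional[bytes] = None) -> bytes:
    # Each record is compressed independently, with or without the dictionary.
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(record) + comp.flush()


def decompress(blob: bytes, zdict: Optional[bytes] = None) -> bytes:
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


records = [make_record(i) for i in range(1000)]
dictionary = build_dictionary(records[:16])

raw = sum(len(r) for r in records)
plain = sum(len(compress(r)) for r in records)
with_dict = sum(len(compress(r, dictionary)) for r in records)

print(f"raw bytes:             {raw}")
print(f"compressed, no dict:   {plain}")
print(f"compressed, with dict: {with_dict}")
```

The dictionary only has to hold the boilerplate that repeats across entries, so its useful size depends on the layout of one entry rather than on the total volume of data. The numbers this prints say nothing about Zstd or real BinaryObject data.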