Hi Igniters,

Actually, I do not understand either point of view: that we need to keep
cache groups, or that we need to remove them.

The only reason for refactoring that I see is 'too many fsyncs', but that
could be solved at the level of FilePageStoreV2 with a new virtual FS for
partition/index data, without any other changes.

Sincerely,
Dmitriy Pavlov

Wed, 11 Apr 2018 at 13:30, Vladimir Ozerov <voze...@gridgain.com>:

> Anton,
>
> I do not see the point. What is the problem with creation or removal of a
> real cache?
>
> On Wed, Apr 11, 2018 at 1:05 PM, Anton Vinogradov <a...@apache.org> wrote:
>
> > Vova,
> >
> > Cache groups are very useful.
> >
> > For example, you can develop multi-tenant applications using cache
> > groups as templates.
> > In case you have some cache groups, e.g. Users, Loans, Deposits, you can
> > keep records for Organisation_A, Organisation_B and Organisation_C in
> > the same data structures, but logically separated.
> > Addition/removal of an organisation will not cause creation or removal
> > of real caches.
> >
> > AFAIK, you can use GridSecurity [1] over caches inside cache groups, and
> > gain a secured multi-tenant environment as a result.
> >
> > Can you propose a better solution without using cache groups?
> >
> > [1] https://docs.gridgain.com/docs/security-concepts
> >
> > 2018-04-11 0:24 GMT+03:00 Denis Magda <dma...@apache.org>:
> >
> > > Vladimir,
> > >
> > > - Data size per-cache
> > >
> > > Could you elaborate on how the data size per-cache/table task will be
> > > addressed with the proposed architecture? Are you going to store data
> > > of a specific cache in dedicated pages/segments? What about index size?
> > >
> > > --
> > > Denis
> > >
> > > On Tue, Apr 10, 2018 at 2:31 AM, Vladimir Ozerov <voze...@gridgain.com>
> > > wrote:
> > >
> > > > Dima,
> > > >
> > > > 1) Easy to understand for users
> > > > AI 2.x: cluster -> cache group -> cache -> table
> > > > AI 3.x: cluster -> cache(==table)
> > > >
> > > > 2) Fine grained cache management
> > > > - MVCC on/off per-cache
> > > > - WAL mode on/off per-cache
> > > > - Data size per-cache
> > > >
> > > > 3) Performance:
> > > > - Efficient scans are not possible with cache groups
> > > > - Efficient destroy/DROP - O(N) now, O(1) afterwards
> > > >
> > > > "Huge refactoring" is not a precise estimate. Let's think about how
> > > > to do that instead of how not to do it :-)
> > > >
> > > > On Tue, Apr 10, 2018 at 11:41 AM, Dmitriy Setrakyan
> > > > <dsetrak...@apache.org> wrote:
> > > >
> > > > > Vladimir, sounds like a huge refactoring. Other than "cache groups
> > > > > are confusing", are we solving any other big issues with the newly
> > > > > proposed approach?
> > > > >
> > > > > (every time we try to refactor rebalancing, I get goose bumps)
> > > > >
> > > > > D.
> > > > >
> > > > > On Tue, Apr 10, 2018 at 1:32 AM, Vladimir Ozerov
> > > > > <voze...@gridgain.com> wrote:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > Cache groups were implemented for a sole purpose - to hide
> > > > > > internal inefficiencies. Namely (add more if I missed something):
> > > > > > 1) Excessive heap usage for affinity/partition data
> > > > > > 2) Too many data files as we employ file-per-partition approach.
> > > > > >
> > > > > > These problems were resolved, but now cache groups are a great
> > > > > > source of confusion both for users and us - hard to understand,
> > > > > > no way to configure them in a deterministic way. Had we resolved
> > > > > > the mentioned performance issues, we would never have had cache
> > > > > > groups.
> > > > > > I propose to think about what it would take for us to get rid
> > > > > > of cache groups.
> > > > > >
> > > > > > Please provide your input on the suggestions below.
> > > > > >
> > > > > > 1) "Merge" partition data from different caches
> > > > > > Consider that we start a new cache with the same affinity
> > > > > > configuration (cache mode, partition number, affinity function)
> > > > > > as some of the already existing caches. Is it possible to re-use
> > > > > > the partition distribution and history of the existing cache for
> > > > > > the new cache? Think of it as a kind of automatic cache grouping
> > > > > > which is transparent to the user. This would remove heap
> > > > > > pressure. It could also resolve our long-standing issue with
> > > > > > FairAffinityFunction, when two caches with the same affinity
> > > > > > configuration are not co-located when started on different
> > > > > > topology versions.
> > > > > >
> > > > > > 2) Employ segment-extent based approach instead of
> > > > > > file-per-partition
> > > > > > - Every object (cache, index) resides in a dedicated segment
> > > > > > - A segment consists of extents (minimal allocation units)
> > > > > > - Extents are allocated and deallocated as needed
> > > > > > - *Ignite specific*: a particular extent can be used by only one
> > > > > >   partition
> > > > > > - Segments may be located in any number of data files we find
> > > > > >   convenient
> > > > > > With this approach the "too many fsyncs" problem goes away
> > > > > > automatically. At the same time it would still be possible to
> > > > > > implement efficient rebalance, as partition data will be split
> > > > > > across a moderate number of extents, not chaotically.
> > > > > >
> > > > > > Once we have p.1 and p.2 ready cache groups could be removed,
> > > > > > couldn't they?
> > > > > >
> > > > > > Vladimir.
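
For illustration, the segment-extent layout from point 2 of the original
proposal could be sketched roughly as follows. This is a minimal in-memory
sketch and not Ignite code: ExtentStore, Extent, allocate, extentsOf and
dropSegment are hypothetical names, and the round-robin placement across
data files and the simple free list are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of segment/extent bookkeeping; not an Ignite API. */
final class ExtentStore {
    /** Minimal allocation unit: a fixed-size region inside one data file. */
    static final class Extent {
        final int fileIdx;
        final long offset;

        Extent(int fileIdx, long offset) {
            this.fileIdx = fileIdx;
            this.offset = offset;
        }
    }

    private final int fileCount;
    private final long extentSize;
    /** Next never-used offset in each data file. */
    private final long[] nextOffset;
    /** Extents released by dropped segments, available for reuse. */
    private final List<Extent> freeList = new ArrayList<>();
    /** segment (cache or index name) -> partition -> extents owned by it. */
    private final Map<String, Map<Integer, List<Extent>>> segments = new HashMap<>();
    private int nextFile;

    ExtentStore(int fileCount, long extentSize) {
        this.fileCount = fileCount;
        this.extentSize = extentSize;
        this.nextOffset = new long[fileCount];
    }

    /** Allocates one extent; each extent belongs to exactly one partition. */
    Extent allocate(String segment, int partition) {
        Extent e;
        if (!freeList.isEmpty()) {
            e = freeList.remove(freeList.size() - 1);
        } else {
            // Assumption: spread new extents over the files round-robin.
            int f = nextFile;
            nextFile = (nextFile + 1) % fileCount;
            e = new Extent(f, nextOffset[f]);
            nextOffset[f] += extentSize;
        }
        segments.computeIfAbsent(segment, k -> new HashMap<>())
                .computeIfAbsent(partition, k -> new ArrayList<>())
                .add(e);
        return e;
    }

    /** Extents holding one partition of one object, e.g. for rebalancing. */
    List<Extent> extentsOf(String segment, int partition) {
        return segments
                .getOrDefault(segment, Collections.emptyMap())
                .getOrDefault(partition, Collections.emptyList());
    }

    /**
     * Drops a whole object by releasing its extents. The cost depends on the
     * number of extents, not on the number of entries stored in them.
     */
    int dropSegment(String segment) {
        Map<Integer, List<Extent>> parts = segments.remove(segment);
        if (parts == null)
            return 0;
        int released = 0;
        for (List<Extent> extents : parts.values()) {
            freeList.addAll(extents);
            released += extents.size();
        }
        return released;
    }
}
```

In this sketch the number of data files is fixed up front and independent of
the number of caches and partitions, which is the property the proposal
relies on for reducing fsyncs; dropping a cache only touches its extent
lists.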