Hello Igniters. As discussed, the IgniteSet implementation relied on on-heap data duplication (setDataMap). As a result, the data was not recovered after a cluster restart, and for large data sets this caused significant heap growth and GC pressure.
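For anyone who missed the earlier discussion, the problem can be pictured with a toy model. The sketch below is plain Java, not Ignite code: SetIndexSketch, store and heapIndex are hypothetical names, with store standing in for a cache that survives a restart and heapIndex for an on-heap duplicate such as setDataMap.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: none of these names exist in Ignite. */
class SetIndexSketch {
    /** Stands in for the persistent cache: it survives a simulated restart. */
    static final Map<String, Boolean> store = new ConcurrentHashMap<>();

    /** Stands in for the on-heap duplicate of the keys: it is lost on restart. */
    static Set<String> heapIndex = new HashSet<>();

    /** Adds an item to the store and keeps a second copy of the key on the heap. */
    static void add(String item) {
        store.put(item, Boolean.TRUE);
        heapIndex.add(item);
    }

    /** Simulated node restart: heap state is dropped, the store is untouched. */
    static void restart() {
        heapIndex = new HashSet<>();
    }

    /** Size answered from the heap copy: fast, but wrong after a restart. */
    static int sizeFromHeapIndex() {
        return heapIndex.size();
    }

    /** Size answered by scanning the store: slower, but needs no heap copy. */
    static int sizeByScan() {
        int n = 0;
        for (String ignored : store.keySet())
            n++;
        return n;
    }

    public static void main(String[] args) {
        add("a");
        add("b");
        restart();
        System.out.println("heap index size after restart: " + sizeFromHeapIndex());
        System.out.println("store scan size after restart: " + sizeByScan());
    }
}
```

The scan-based path trades some speed on size() and iteration for correctness after a restart and a smaller heap footprint.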
We changed the implementation so that the structure works without duplicating the data [1]. To limit the performance drop and speed up operations on large data sets, the non-collocated version of IgniteSet now uses a separate cache [2].

[1] https://issues.apache.org/jira/browse/IGNITE-5553
[2] https://issues.apache.org/jira/browse/IGNITE-7823

Wed, Jun 27, 2018 at 23:26, Amir Akhmedov <amir.akhme...@gmail.com>:

> Yes, you are right.
>
> Thanks,
> Amir
>
>
> On Wed, Jun 27, 2018 at 1:15 PM Denis Magda <dma...@apache.org> wrote:
>
> > Got you. If it's about redundant data duplication in onheap region then no
> > any concerns from my side.
> >
> > Anyway, considering that the data structure will be interacting with the
> > page memory directly then its entries can be stored in Ignite persistence
> > automatically (if the latter is on). Does it mean that the data structure
> > will be fully recovered after a restart and its entries can be pulled from
> > disk on demand?
> >
> > --
> > Denis
> >
> >
> > On Tue, Jun 26, 2018 at 1:49 PM Amir Akhmedov <amir.akhme...@gmail.com>
> > wrote:
> >
> > > I also think it will better to remove setDataMap support cause
> > > 1. It's making extra pressure on GC by keeping entries on heap
> > > 2. It has difficult logic to support with lots of nuances
> > > 3. To maintain setDataMap today GridCacheMapEntry calls
> > > cctx.dataStructures().onEntryUpdated() on each entry mutation. I think it's
> > > unnecessary cohesion.
> > > 4. For the case with single Ignite cache for all collocated datastructure,
> > > an iterator creation will not be much slower than current implementation
> > > since we can run affinity call on the node where all entries reside. Also,
> > > we can create a better affinity mapper to fairly distribute datastructures
> > > across a cluster rather than mapping by datastructure's name.
> > >
> > > Thanks,
> > > Amir
> > >
> > >
> > > On Tue, Jun 26, 2018 at 8:10 AM Anton Vinogradov <a...@apache.org> wrote:
> > >
> > > > Denis,
> > > >
> > > > I think that better case is to remove onheap optimisation/duplication.
> > > > This brings no drop to frequently used operations (put/remove), but even
> > > > will make it slightly faster.
> > > >
> > > > The only one question we have here is "is it possible to restore onheap map
> > > > in easy way?".
> > > > Seems that answer is no, so, I vote for setDataMap removal.
> > > >
> > > > Tue, Jun 26, 2018 at 15:00, Denis Magda <dma...@apache.org>:
> > > >
> > > > > Anton,
> > > > >
> > > > > Will it be possible to reuse such a functionality for the rest of data
> > > > > structures? I would invest our time in this if all data structures would be
> > > > > able to work with Ignite persistence this way.
> > > > >
> > > > > --
> > > > > Denis
> > > > >
> > > > > On Tue, Jun 26, 2018 at 1:53 AM Anton Vinogradov <a...@apache.org> wrote:
> > > > >
> > > > > > >> Why don't we read data straight from the persistence layer warming RAM up
> > > > > > >> in the background?
> > > > > > Because it's not a trivial task to finish such loading on unstable
> > > > > > topology.
> > > > > > That's possible, ofcourse, but solution and complexity will be almost
> > > > > > equals to WAL enable/disable.
> > > > > >
> > > > > > Mon, Jun 25, 2018 at 22:13, Denis Magda <dma...@apache.org>:
> > > > > >
> > > > > > > Folks,
> > > > > > >
> > > > > > > Why don't we read data straight from the persistence layer warming RAM up
> > > > > > > in the background? (like we do for SQL and other APIs). If it's a question
> > > > > > > of time, then I would suggest us not to hurry up and do it in a right
> > > > > > > way.
> > > > > > >
> > > > > > > --
> > > > > > > Denis
> > > > > > >
> > > > > > > On Mon, Jun 25, 2018 at 6:20 AM Anton Vinogradov <a...@apache.org> wrote:
> > > > > > >
> > > > > > > > +1 to removal in case there is no easy, fast and consistent way to restore
> > > > > > > > setDataMap on node restart.
> > > > > > > > I see that we'll gain some performance drop on size() or keys(), but these
> > > > > > > > methods are rarely used.
> > > > > > > >
> > > > > > > > Mon, Jun 25, 2018 at 16:07, Pavel Pereslegin <xxt...@gmail.com>:
> > > > > > > >
> > > > > > > > > Hello, Igniters.
> > > > > > > > >
> > > > > > > > > I tried to implement IgniteSet data recovery when persistence enabled
> > > > > > > > > [1] using trivial cache scanning, however I cannot find optimal way to
> > > > > > > > > do that because of the following reasons:
> > > > > > > > > - Performing operations on IgniteSet requires completion of data
> > > > > > > > > loading (restoring of setDataMap) on all nodes. Do this during
> > > > > > > > > partition map exchange is too long.
> > > > > > > > > - The prohibition of operations on IgniteSet before the completion of
> > > > > > > > > asynchronous cache scanning on all nodes looks rather complicated,
> > > > > > > > > because It is necessary to support all situations of unstable
> > > > > > > > > topology.
> > > > > > > > >
> > > > > > > > > So I see one option to fix data loss on node restart - remove the
> > > > > > > > > entire optimization (setDataMap) and rework the iterator
> > > > > > > > > implementation to perform cache scanning.
> > > > > > > > >
> > > > > > > > > Thoughts?
> > > > > > > > >
> > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-5553
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2018-03-17 8:20 GMT+03:00 Andrey Kuznetsov <stku...@gmail.com>:
> > > > > > > > > > Thanks, Dmitry. I agree ultimately, DS API uniformity is a weighty reason.
> > > > > > > > > >
> > > > > > > > > > 2018-03-17 3:54 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> > > > > > > > > >
> > > > > > > > > >> On Fri, Mar 16, 2018 at 7:39 AM, Andrey Kuznetsov <stku...@gmail.com>
> > > > > > > > > >> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > Dmitry, your way allows to reuse existing {{Ignite.set()}} API to create
> > > > > > > > > >> > both set flavors. We can adopt it unless somebody in the community objects.
> > > > > > > > > >> > Personally, I like {{IgniteCache.asSet()}} approach proposed by Vladimir O.
> > > > > > > > > >> > more, since it emphasizes the difference between sets being created, but
> > > > > > > > > >> > this will require API extension.
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> Andrey, I am suggesting that Ignite.set(...) in non-collocated mode behaves
> > > > > > > > > >> exactly the same as the proposed IgniteCache.asSet() method. I do not like
> > > > > > > > > >> the IgniteCache.asSet() API because it is inconsistent with Ignite data
> > > > > > > > > >> structure design. All data structures are provided on Ignite API directly
> > > > > > > > > >> and we should not change that.
> > > > > > > > > >>
> > > > > > > > > >> D.
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Best regards,
> > > > > > > > > > Andrey Kuznetsov.