Re: MVCC and IgniteDataStreamer
Hi,

It is time to continue the discussion of DataStreamer for MVCC caches. The main focus is on allowOverwrite=false mode. Currently there is a problem related to partition update counters.

BACKGROUND

As you might know, MVCC transactions update partition counters during the transaction finish phase, and on backups the counters are applied as intervals (low, high). Basically, a transaction on a primary partition counts the number of updates it has made and increments the partition counter locally by that number during the finish stage. So the primary transaction moves the counter from _low_ to _high_, and that (low, high) interval is sent to the backups. If the counter on a particular backup is equal to _low_, then the counter is set to _high_. Otherwise, if the current value is less than _low_, the interval is put into a queue and applied once the current value becomes equal to _low_. With this technique, partition counters are incremented on backups in the same order as on the primary.

Let's consider a simple example. Assume that the partition counter is 10 at some point and 2 transactions finish concurrently. Each has made, e.g., 5 updates. The partition counter is updated in some order on the primary, and the backup receives the messages from the primary in the reverse order.

    Primary [10]          | Backup [10]
    Tx1 (10 -> 15) [15]   |
    Tx2 (15 -> 20) [20]   |
                          | Receives (15, 20) [10]   // (15, 20) enqueued
                          | Receives (10, 15) [20]   // (10, 15) applied, (15, 20) dequeued and applied

    (Partition counter value in square brackets)

In contrast, the data streamer updates counters right at the time an entry is inserted into the cache, which totally breaks the idea of applying counters by intervals. If a data streamer and a transaction modify the same partition counter concurrently, we can get the following situation (initially the counter is 10):

1. Tx updates the counter (10 -> 15) on the primary and sends (10, 15) to the backup.
2. Streamer inserts an entry and updates the counter (15 -> 16) on the primary.
3. Streamer inserts an entry and updates the counter (10 -> 11) on the backup.
4. The backup receives (10, 15) from the tx and does not know what to do with it, as its counter is now 11.

There can be other unexpected effects as well, e.g. with 2 transactions and a streamer, the order in which the transactions' counters are applied might change.

PROPOSAL

It looks like the streamer should apply counters by intervals to resolve the inconsistency. To do so we need to stream through the primary partition, because counters should be "reserved" in a single place. So, the following could be done when we are working with an MVCC cache:

1. Send batches from the streamer only to primary partition owners.
2. Remember the partition counter updates made on the primary.
3. Forward batches to backups along with the counter intervals.

What do you think?

--
Best regards,
Ivan Pavlukhin
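The interval scheme from this message can be modelled in a few lines. What follows is a minimal Python sketch under stated assumptions, not Ignite code: `BackupCounter`, `PrimaryCounter`, `apply` and `reserve` are hypothetical names, and an interval that starts behind the backup counter is simply rejected, since the message only says the backup "does not know what to do" in that case.

```python
class PrimaryCounter:
    """Hypothetical model of the primary partition counter (not Ignite code)."""

    def __init__(self, value):
        self.value = value

    def reserve(self, updates):
        """Reserve `updates` counter values and return the (low, high) interval."""
        low = self.value
        self.value += updates
        return low, self.value


class BackupCounter:
    """Hypothetical model of a backup partition counter (not Ignite code)."""

    def __init__(self, value):
        self.value = value
        self.pending = {}  # low -> high for intervals that arrived too early

    def apply(self, low, high):
        """Apply interval (low, high); return True if the counter advanced."""
        if low < self.value:
            # The interval starts behind the current value and cannot be placed.
            raise ValueError(f"interval ({low}, {high}) is behind counter {self.value}")
        self.pending[low] = high
        advanced = False
        # Drain every queued interval that now starts at the current value.
        while self.value in self.pending:
            self.value = self.pending.pop(self.value)
            advanced = True
        return advanced


# Reordered delivery from the example: (15, 20) arrives before (10, 15).
backup = BackupCounter(10)
first = backup.apply(15, 20)    # queued, counter stays at 10
after_first = backup.value
second = backup.apply(10, 15)   # applied, then (15, 20) is dequeued and applied
after_second = backup.value

# Current streamer behaviour: a direct increment on the backup breaks (10, 15).
broken = BackupCounter(10)
broken.value += 1               # streamer bumps 10 -> 11 locally
try:
    broken.apply(10, 15)
    conflict = False
except ValueError:
    conflict = True

# Proposed behaviour: the streamer batch reserves its interval on the primary.
primary = PrimaryCounter(10)
tx_interval = primary.reserve(5)      # transaction with 5 updates
batch_interval = primary.reserve(1)   # streamer batch with 1 entry
fixed = BackupCounter(10)
fixed.apply(*batch_interval)          # arrives first, gets queued
fixed.apply(*tx_interval)             # unblocks both intervals
```

With the streamer reserving its interval on the primary, the backup converges to the primary's counter value regardless of the delivery order, which is the property the proposal is after.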
Re: MVCC and IgniteDataStreamer
On Tue, Aug 14, 2018 at 4:30 AM, Vladimir Ozerov wrote:

> Bypassing WAL will make the whole cache data vulnerable to complete loss in
> case of node failure. I would not do this automatically.

Well, in this case I would expect a log message suggesting that there is an option to turn off WAL, which could significantly improve performance. We could just print it out once.
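The "print it out once" idea can be sketched as a small guard around the log call. This is an illustrative Python sketch only; `OneTimeHint`, the logger name and the message wording are assumptions and do not come from Ignite.

```python
import logging
import threading

log = logging.getLogger("streamer")  # hypothetical logger name


class OneTimeHint:
    """Logs a hint the first time show() is called and stays silent afterwards."""

    def __init__(self, message):
        self._message = message
        self._lock = threading.Lock()
        self._shown = False

    def show(self):
        """Return True if the hint was logged by this call, False otherwise."""
        with self._lock:
            if self._shown:
                return False
            self._shown = True
        log.info(self._message)
        return True


# Hypothetical wording of the hint discussed in the thread.
wal_hint = OneTimeHint(
    "Data is being loaded with WAL enabled; turning WAL off for the load "
    "could significantly improve performance, at the cost of durability."
)
results = [wal_hint.show() for _ in range(3)]
```

The lock makes the check-and-set atomic, so concurrent streamer threads would still produce a single message.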
Re: MVCC and IgniteDataStreamer
Bypassing WAL will make the whole cache data vulnerable to complete loss in case of node failure. I would not do this automatically.

On Mon, Jul 16, 2018 at 12:28 PM Ilya Kasnacheev wrote:

> Can we also bypass WAL for such mode automatically?
Re: MVCC and IgniteDataStreamer
Hello!

Can we also bypass WAL for such a mode automatically?

However, we will definitely need a 'normal' mode of DataStreamer operation for people who use DataStreamer with custom stream transformers on existing data that is in use.

Regards,

--
Ilya Kasnacheev
Re: MVCC and IgniteDataStreamer
Igniters,

Denis is right: please pay attention to IEP-22, as this is how we are going to load data into the grid in the future. Note that the current data streamer internals are not efficient enough, primarily because the streamer has to interact with page memory, free lists and various B-trees in the regular manner. I think that when IEP-22 is implemented, it will be tightly integrated with the data streamer, and the default way to load data will be:

1) Obtain an exclusive table lock
2) Load data bypassing almost all Ignite internals
3) Rebuild indexes
4) Release the lock

Normally all types of data load should obey transactional semantics if MVCC is enabled, and we should think separately about how to do that for the continuous-streaming case.

For now let's focus on the immediate goal for the MVCC release: the data streamer should work, and no new abstractions or APIs should be introduced. The easiest way to do this is to agree that the streamer is not transactional and to use a special version, as Igor proposed. In future releases, when IEP-22 is implemented, it will become transactional with the help of an exclusive table lock. In more distant releases we will think about separate optimizations for continuous streaming and possibly other cases.

Makes sense?

Vladimir.
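The four-step load Vladimir outlines (lock, load, rebuild indexes, unlock) can be illustrated with a toy in-memory table. This is a sketch under assumptions, not the IEP-22 design: `Table`, `rebuild_index` and `bulk_load` are hypothetical names, and a plain mutex stands in for the exclusive table lock.

```python
import threading


class Table:
    """Toy in-memory table; the names are illustrative, not Ignite internals."""

    def __init__(self):
        self.rows = {}                 # key -> value
        self.index = {}                # value -> sorted list of keys
        self.lock = threading.Lock()   # stands in for the exclusive table lock

    def rebuild_index(self):
        """Recompute the secondary index from scratch."""
        index = {}
        for key, value in self.rows.items():
            index.setdefault(value, []).append(key)
        self.index = {value: sorted(keys) for value, keys in index.items()}


def bulk_load(table, entries):
    """Load entries in one pass and rebuild the index once at the end."""
    with table.lock:                  # 1) obtain the lock; 4) released on exit
        table.rows.update(entries)    # 2) write rows without per-row index upkeep
        table.rebuild_index()         # 3) rebuild indexes


table = Table()
bulk_load(table, {1: "a", 2: "b", 3: "a"})
```

The `with` block releases the lock even if the load raises, so step 4 cannot be skipped by an error in steps 2 or 3.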
Re: MVCC and IgniteDataStreamer
Agree that initial loading and real-time streaming should be seen as different use cases.

For the loading part, I would borrow ideas from the direct data load IEP [1]. Ignite should assume that no app works with the cluster until it's preloaded. So, no global locks or things like that. Just fasten your seat belt and feed data to your nodes.

For the streaming part, I would consider mode 2 or 3 proposed by Igor.

--
Denis

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-22%3A+Direct+Data+Load
Re: MVCC and IgniteDataStreamer
Ivan,

Anyway, DataStreamer is the fastest way to deliver data to a data node; the question is how to apply it correctly.

I don't think we need one more tool that is 90% the same as DataStreamer.

All we need is to implement a couple of new stream receivers.

Regards,
Igor
Re: MVCC and IgniteDataStreamer
Hi Igniters,

I had a look into IgniteDataStreamer. As far as I understand, it currently just works incorrectly for MVCC tables. That looks like a blocker for releasing MVCC. The simplest thing is to refuse to create a streamer for MVCC tables.

The next step could be splitting hairs over the related use cases. To me, initial load and continuous streaming look like quite different cases, and it is better to keep them separate, at least at the API level. Perhaps it is better to separate the API based on user experience. For example, DataStreamer could be considered a tool without surprises (which means it always leaves data consistent and respects transactions). And, let's say, BulkLoader is a beast for the fastest data loading but full of surprises. Such surprises could be locking tables, rolling back user transactions and so on, so it is of very limited use (like initial load). Keeping API entities separate looks better to me than introducing multiple modes, because separate entities are easier to understand and so less prone to user mistakes.

--
Best regards,
Ivan Pavlukhin
Re: MVCC and IgniteDataStreamer
Yakov,

We can introduce several modes:

1) Initial loading, which replaces data (allowOverwrite=true) with the initial version or leaves it as is (allowOverwrite=false) and requires an exclusive table lock (the fastest one).
2) Continuous loading, which has its own version and links the data as a regular transaction does (allowOverwrite=true) or leaves it as is (allowOverwrite=false); it doesn't affect concurrent readers but still requires a write lock on the table (slower than the previous one).
3) Batch loading, which acts as a sequence of regular transactions with all possible optimizations; it doesn't affect concurrent readers and writers, but may cause lock conflicts with subsequent retries; it links the data as a regular transaction does (allowOverwrite=true) or leaves it as is (allowOverwrite=false) and doesn't cause write conflicts (like READ_COMMITTED txs) (the slowest one).

All the modes require table locks.

Your thoughts?
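The three modes Igor lists share one rule for allowOverwrite: with `true` the incoming value replaces what is there, with `false` an existing entry is left as is. A minimal Python sketch of that rule follows; `LoadMode` and `put` are hypothetical names used only for illustration, and the locking and versioning differences between the modes are deliberately not modelled.

```python
from enum import Enum


class LoadMode(Enum):
    """The three proposed modes, ordered from fastest to slowest."""

    INITIAL = 1      # exclusive table lock
    CONTINUOUS = 2   # write lock on the table, concurrent readers unaffected
    BATCH = 3        # sequence of regular transactions


def put(store, key, value, allow_overwrite):
    """Write key -> value; return True if the store was changed.

    With allow_overwrite=False an existing entry is left as is.
    """
    if key in store and not allow_overwrite:
        return False
    store[key] = value
    return True


store = {"k": 1}
kept = put(store, "k", 2, allow_overwrite=False)      # existing entry survives
value_after_keep = store["k"]
replaced = put(store, "k", 2, allow_overwrite=True)   # existing entry replaced
inserted = put(store, "n", 3, allow_overwrite=False)  # new key is always written
fastest = min(LoadMode, key=lambda mode: mode.value)
```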
Re: MVCC and IgniteDataStreamer
Igor,

I can't say if I agree with any of the suggestions. I would like us to start by answering the question: what is the data streamer used for?

First of all, for initial data loading. This should be a super fast mode, probably ignoring all transactional semantics, but providing certain guarantees that data passed into the streamer will be loaded.

Second, for continuously streaming updates to some tables (from more than 1 streamer) and running some analytics over the data, probably with some modifications from the non-streamer side (user transactions). In this case streamers should not roll back user txs or do any kind of unexpected visibility tricks. I think we can think of a proper streamer tx at the batch or key level.

The third case I see is a combination of the above: we stream portions of data to an existing table, let's say once a day (which may be some market data after closing or an offloaded operations data set), with or without any other concurrent non-streamer operations. This mode may involve table locks or do the same as the 2nd mode, which should be up to the user to decide.

So, planned changes to the streamer should support at least these 3 scenarios. What do you think?

Igniters, feel free to share your thoughts on this. The question is pretty important for us.

--Yakov