Igor,

I can't say whether I agree with any of the suggestions. I would like us to
start by answering the question: what is the data streamer used for?

First of all, for initial data loading. This should be a super-fast mode,
probably ignoring all transactional semantics but still providing certain
guarantees that data passed into the streamer gets loaded.

Second, for continuously streaming updates to some tables (from more than
one streamer) and running some analytics over the data, probably with some
modifications from the non-streamer side (user transactions). In this mode
streamers should not roll back user txs or do any kind of unexpected
visibility tricks. I think we can think of a proper streamer tx at the
batch or key level.

The third case I see is a combination of the above: we stream portions of
data to an existing table, let's say once a day (which may be some market
data after closing or an offloaded operations data set), with or without
any other concurrent non-streamer operations. This mode may involve table
locks or do the same as the 2nd mode; it should be up to the user to decide.

So, the planned changes to the streamer should support at least these 3
scenarios.
What do you think?

Igniters, feel free to share your thoughts on this. The question is pretty
important for us.

--Yakov
