Hi Dave,

Thanks for your support. I also think this should only be for the master branch.

Thanks,
Xiangying

On Tue, Sep 26, 2023 at 9:34 AM Dave Fisher <wave4d...@comcast.net> wrote:
>
> Hi -
>
> OK. I’ll agree, but I think the PIP ought to include documentation. There
> should also be clear communication about this use case and how to use it.
>
> Sent from my iPhone
>
> > On Sep 25, 2023, at 6:23 PM, Xiangying Meng <xiangy...@apache.org> wrote:
> >
> > Hi Dave,
> > The uncommitted transactions do not impact actual users' bank accounts.
> > Business Processing System E only reads committed transactional
> > messages and operates users' accounts. It needs Exactly-once semantic.
> > Real-time Monitoring System D reads uncommitted transactional
> > messages. It does not need Exactly-once semantic.
> >
> > They use different subscriptions and choose different isolation
> > levels. One needs transaction, one does not.
> > In general, multiple subscriptions of the same topic do not all
> > require transaction guarantees.
> > Some want low latency without the exactly-once semantic guarantee, and
> > some must require the exactly-once guarantee.
> > We just provide a new option for different subscriptions. This should
> > not be a breaking change, right?
>
> Not a breaking change, but it does add to the API.
>
> It should be discussed if this PIP is only for master - 3.2, or if it may be
> cherry picked to current versions.
>
> >
> > Looking forward to your reply.
>
> Thank you,
> Dave
>
> >
> > Thanks,
> > Xiangying
> >
> >> On Tue, Sep 26, 2023 at 4:09 AM Dave Fisher <w...@apache.org> wrote:
> >>
> >>
> >>
> >>>> On Sep 20, 2023, at 12:50 AM, Xiangying Meng <xiangy...@apache.org>
> >>>> wrote:
> >>>
> >>> Hi, all,
> >>>
> >>> Let's consider another example:
> >>>
> >>> **System**: Financial Transaction System
> >>>
> >>> **Operations**: Large volume of deposit and withdrawal operations, a
> >>> small number of transfer operations.
> >>>
> >>> **Roles**:
> >>>
> >>> - **Client A1**
> >>> - **Client A2**
> >>> - **User Account B1**
> >>> - **User Account B2**
> >>> - **Request Topic C**
> >>> - **Real-time Monitoring System D**
> >>> - **Business Processing System E**
> >>>
> >>> **Client Operations**:
> >>>
> >>> - **Withdrawal**: Client A1 decreases the deposit amount from User
> >>> Account B1 or B2.
> >>> - **Deposit**: Client A1 increases the deposit amount in User Account B1
> >>> or B2.
> >>> - **Transfer**: Client A2 decreases the deposit amount from User
> >>> Account B1 and increases it in User Account B2. Or vice versa.
> >>>
> >>> **Real-time Monitoring System D**: Obtains the latest data from
> >>> Request Topic C as quickly as possible to monitor transaction data and
> >>> changes in bank reserves in real-time. This is necessary for the
> >>> timely detection of anomalies and real-time decision-making.
> >>>
> >>> **Business Processing System E**: Reads data from Request Topic C,
> >>> then actually operates User Accounts B1, B2.
> >>>
> >>> **User Scenario**: Client A1 sends a large number of deposit and
> >>> withdrawal requests to Request Topic C. Client A2 writes a small
> >>> number of transfer requests to Request Topic C.
> >>>
> >>> In this case, Business Processing System E needs a read-committed
> >>> isolation level to ensure operation consistency and Exactly Once
> >>> semantics. The real-time monitoring system does not care if a small
> >>> number of transfer requests are incomplete (dirty data). What it
> >>> cannot tolerate is a situation where a large number of deposit and
> >>> withdrawal requests cannot be presented in real time due to a small
> >>> number of transfer requests (the current situation is that uncommitted
> >>> transaction messages can block the reading of committed transaction
> >>> messages).
> >>
> >> So you are willing to let uncommitted transactions impact actual users
> >> bank accounts?
> >> Are you sure that there is not another way to bypass
> >> uncommitted records? Letting uncommitted records through is not Exactly
> >> once.
> >>
> >> Are you ready to rewrite Pulsar’s documentation to explain how normal
> >> users can avoid allowing this?
> >>
> >> Best,
> >> Dave
> >>
> >>
> >>>
> >>> In this case, it is necessary to set different isolation levels for
> >>> different consumers/subscriptions.
> >>>
> >>> Thanks,
> >>> Xiangying
> >>>
> >>> On Tue, Sep 19, 2023 at 11:35 PM 杨国栋 <yangguodong1...@gmail.com> wrote:
> >>>>
> >>>> Hi Dave and Xiangying,
> >>>> Thanks for all your support.
> >>>>
> >>>> Let me add some background.
> >>>>
> >>>> Apache Paimon takes a message queue as its External Log System, and the
> >>>> changelog of Paimon can also be consumed from the message queue.
> >>>> By default, the changelog in the message queue is visible to consumers
> >>>> only after a snapshot. A snapshot has the same life cycle as a message
> >>>> queue transaction.
> >>>> However, users can consume the changelog immediately by reading
> >>>> uncommitted messages, without waiting for the next snapshot.
> >>>> This behavior reduces the latency of the changelog, but it relies on
> >>>> reading uncommitted messages in Kafka or another message queue.
> >>>> So we hope Pulsar can support the Read Uncommitted isolation level.
> >>>>
> >>>> Put aside the application scenarios of Paimon. Let's discuss the Read
> >>>> Uncommitted isolation level itself.
> >>>>
> >>>> Read Uncommitted isolation will bring certain security risks, but will
> >>>> also make messages immediately readable.
> >>>> Reading committed data ensures accuracy, and reading uncommitted data
> >>>> ensures real-time performance (there may be some repeated or dirty
> >>>> messages).
> >>>> Real-time performance is what users need. How to handle dirty messages
> >>>> should be considered by the application side.
> >>>>
> >>>> We can still get complete and accurate data from Read Committed isolation
> >>>> level.
> >>>>
> >>>> Sincerely yours.
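
To make the blocking behaviour discussed in this thread concrete, here is a toy Python model of Request Topic C. It is only a sketch: the `Entry` class, the `read` function and the isolation-level strings are made up for illustration and are not Pulsar APIs. It also assumes, as a simplification of what the thread describes, that a read-committed consumer stops at the first message of a still-open transaction. Aborted transactions are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Entry:
    payload: str
    txn_id: Optional[str] = None  # None: message was not sent in a transaction


def read(log: List[Entry], committed: Set[str], isolation: str) -> List[str]:
    """Return the payloads a consumer could see right now (toy model)."""
    if isolation not in ("read_committed", "read_uncommitted"):
        raise ValueError(f"unknown isolation level: {isolation}")
    visible = []
    for entry in log:
        is_open = entry.txn_id is not None and entry.txn_id not in committed
        if is_open and isolation == "read_committed":
            # Assumed behaviour: stop at the first message of an open
            # transaction, so later committed messages are held back too.
            break
        visible.append(entry.payload)
    return visible


# Request Topic C: deposits and withdrawals, plus one transfer whose
# transaction (t2) is still open.
topic_c = [
    Entry("deposit B1 +100", txn_id="t1"),
    Entry("transfer B1 -> B2 50", txn_id="t2"),
    Entry("withdrawal B2 -20", txn_id="t3"),
    Entry("deposit B2 +70", txn_id="t4"),
]
committed_txns = {"t1", "t3", "t4"}

system_e = read(topic_c, committed_txns, "read_committed")
system_d = read(topic_c, committed_txns, "read_uncommitted")
print("E (read committed):  ", system_e)
print("D (read uncommitted):", system_d)
```

In this model Business Processing System E sees only the first deposit until the transfer commits, while Real-time Monitoring System D sees all four requests, including the possibly dirty transfer.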