Hi Dave,

Thanks for your support.
We have completed the relevant documents: 
https://github.com/apache/pulsar-site/pull/712

PIP: https://github.com/apache/pulsar/pull/21114
Please help me take a look when you have time.


Thanks,
zhangheng
---- Replied Message ----
| From | Xiangying Meng<xiangy...@apache.org> |
| Date | 9/26/2023 09:40 |
| To | <dev@pulsar.apache.org> |
| Subject | Re: [DISSCUSS] PIP-298: Consumer supports specifying consumption 
isolation level |
Hi Dave,

Thanks for your support.
I also think this should only be for the master branch.

Thanks,
Xiangying

On Tue, Sep 26, 2023 at 9:34 AM Dave Fisher <wave4d...@comcast.net> wrote:

Hi -

OK. I’ll agree, but I think the PIP ought to include documentation. There 
should also be clear communication about this use case and how to use it.

Sent from my iPhone

On Sep 25, 2023, at 6:23 PM, Xiangying Meng <xiangy...@apache.org> wrote:

Hi Dave,
The uncommitted transactions do not impact actual users' bank accounts.
Business Processing System E reads only committed transactional
messages and operates on users' accounts. It needs exactly-once semantics.
Real-time Monitoring System D reads uncommitted transactional
messages. It does not need exactly-once semantics.

They use different subscriptions and choose different isolation
levels. One needs transaction guarantees, the other does not.
In general, multiple subscriptions of the same topic do not all
require transaction guarantees.
Some want low latency without the exactly-once guarantee, and
some require the exactly-once guarantee.
We just provide a new option for different subscriptions. This should
not be a breaking change, right?

Not a breaking change, but it does add to the API.

It should be discussed whether this PIP is only for master (3.2), or
whether it may be cherry-picked to current versions.


Looking forward to your reply.

Thank you,
Dave

Thanks,
Xiangying

On Tue, Sep 26, 2023 at 4:09 AM Dave Fisher <w...@apache.org> wrote:



On Sep 20, 2023, at 12:50 AM, Xiangying Meng <xiangy...@apache.org> wrote:

Hi, all,

Let's consider another example:

**System**: Financial Transaction System

**Operations**: Large volume of deposit and withdrawal operations, a
small number of transfer operations.

**Roles**:

- **Client A1**
- **Client A2**
- **User Account B1**
- **User Account B2**
- **Request Topic C**
- **Real-time Monitoring System D**
- **Business Processing System E**

**Client Operations**:

- **Withdrawal**: Client A1 decreases the deposit amount from User
Account B1 or B2.
- **Deposit**: Client A1 increases the deposit amount in User Account B1 or B2.
- **Transfer**: Client A2 decreases the deposit amount from User
Account B1 and increases it in User Account B2. Or vice versa.

**Real-time Monitoring System D**: Obtains the latest data from
Request Topic C as quickly as possible to monitor transaction data and
changes in bank reserves in real-time. This is necessary for the
timely detection of anomalies and real-time decision-making.

**Business Processing System E**: Reads data from Request Topic C,
then actually operates User Accounts B1, B2.

**User Scenario**: Client A1 sends a large number of deposit and
withdrawal requests to Request Topic C. Client A2 writes a small
number of transfer requests to Request Topic C.

In this case, Business Processing System E needs a read-committed
isolation level to ensure operation consistency and Exactly Once
semantics. The real-time monitoring system does not care if a small
number of transfer requests are incomplete (dirty data). What it
cannot tolerate is a situation where a large number of deposit and
withdrawal requests cannot be presented in real time due to a small
number of transfer requests (the current situation is that uncommitted
transaction messages can block the reading of committed transaction
messages).
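
To make the blocking concrete, here is a minimal toy model in Python. It
is only a sketch of the behaviour described above, not Pulsar client or
broker code: the names Message, visible_payloads, request_topic_c and the
"open"/"committed"/"aborted" states are hypothetical, and it assumes that
a read-uncommitted consumer sees every message in the topic, including
messages of aborted transactions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Message:
    payload: str
    txn_id: Optional[str] = None  # None: sent outside any transaction


def visible_payloads(log: List[Message],
                     txn_state: Dict[str, str],
                     isolation: str) -> List[str]:
    """Payloads a subscription can read under the given isolation level.

    txn_state maps a transaction id to "open", "committed" or "aborted".
    """
    if isolation == "read_uncommitted":
        # Assumption: everything in the log is dispatched, dirty data included.
        return [m.payload for m in log]
    if isolation != "read_committed":
        raise ValueError(f"unknown isolation level: {isolation}")

    visible = []
    for m in log:
        state = "committed" if m.txn_id is None else txn_state[m.txn_id]
        if state == "open":
            # The first message of a still-open transaction holds back
            # everything behind it, committed or not.
            break
        if state == "committed":
            visible.append(m.payload)
        # Messages of aborted transactions are skipped.
    return visible


# Request Topic C: many deposits/withdrawals from Client A1, plus one
# transfer from Client A2 written inside transaction "t1".
request_topic_c = [
    Message("deposit B1 +100"),
    Message("transfer B1 -50", txn_id="t1"),
    Message("withdraw B2 -20"),
    Message("transfer B2 +50", txn_id="t1"),
    Message("deposit B2 +10"),
]
```

While t1 is still open, visible_payloads(request_topic_c, {"t1": "open"},
"read_committed") returns only the first deposit, whereas the same call
with "read_uncommitted" returns all five requests. That is the difference
between what System E and System D would see in this model.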

So you are willing to let uncommitted transactions impact actual users' bank 
accounts? Are you sure that there is not another way to bypass uncommitted 
records? Letting uncommitted records through is not exactly-once.

Are you ready to rewrite Pulsar’s documentation to explain how normal users can 
avoid allowing this?

Best,
Dave



In this case, it is necessary to set different isolation levels for
different consumers/subscriptions.

Thanks,
Xiangying

On Tue, Sep 19, 2023 at 11:35 PM 杨国栋 <yangguodong1...@gmail.com> wrote:

Hi Dave and Xiangying,
Thanks for all your support.

Let me add some background.

Apache Paimon uses message queues as external log systems, and Paimon's
changelog can also be consumed from the message queue.
By default, the changelog in the message queue becomes visible to consumers
only after a snapshot. A snapshot has the same life cycle as a message queue
transaction.
However, users can consume the changelog immediately by reading uncommitted
messages, without waiting for the next snapshot.
This behavior reduces changelog latency, but it relies on reading
uncommitted messages in Kafka or another message queue.
So we hope Pulsar can support the Read Uncommitted isolation level.

Putting aside the application scenarios of Paimon, let's discuss the Read
Uncommitted isolation level itself.

Read Uncommitted isolation brings certain security risks, but it also
makes messages immediately readable.
Reading committed data ensures accuracy, and reading uncommitted data
ensures real-time performance (there may be some repeated or dirty
messages).
Real-time performance is what users need. How to handle dirty messages
should be decided on the application side.

We can still get complete and accurate data with the Read Committed
isolation level.

Sincerely yours.

