shibd opened a new pull request, #24495:
URL: https://github.com/apache/pulsar/pull/24495
### Motivation
The Kinesis source connector's handling of metadata properties was rigid and
contained a critical bug.
- Properties were not configurable: Users could not select which Kinesis
metadata properties to include in Pulsar messages. Every property was always
attached, which could inflate message size unnecessarily.
- Key Collision Bug: A bug was discovered where all property keys were
defined as an empty string (""), causing a key collision that resulted in the
loss of all metadata except for the last one set (sequenceNumber).
This PR fixes these issues by introducing a configuration to control which
properties are included, which also resolves the underlying bug.
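The key collision described above can be reproduced with a plain `HashMap`. The following is a minimal sketch, not the connector's actual code: the class name, the constant names, and the partition key property are assumptions made for illustration. Only the empty-string keys and the fact that `sequenceNumber` is set last come from the description.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyCollisionDemo {
    // Hypothetical stand-ins for the buggy constants: every key is the empty string.
    static final String ARRIVAL_TIMESTAMP = "";
    static final String PARTITION_KEY = "";   // assumed property, for illustration only
    static final String SEQUENCE_NUMBER = "";

    // Builds the property map the way the buggy code effectively did.
    static Map<String, String> buildProperties(String arrivalTimestamp,
                                               String partitionKey,
                                               String sequenceNumber) {
        Map<String, String> props = new HashMap<>();
        props.put(ARRIVAL_TIMESTAMP, arrivalTimestamp);
        props.put(PARTITION_KEY, partitionKey);       // overwrites the timestamp
        props.put(SEQUENCE_NUMBER, sequenceNumber);   // overwrites the partition key
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = buildProperties("2025-07-01T00:00:00Z", "pk-1", "seq-42");
        System.out.println(props.size());   // prints 1
        System.out.println(props.get(""));  // prints seq-42
    }
}
```

Because `Map.put` replaces the value for an existing key, only the last write survives, which matches the reported loss of all metadata except the sequence number.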
### Modifications
- Added a new configuration, kinesisRecordProperties, to
KinesisSourceConfig.java. This allows users to provide a comma-separated list
of properties to include.
- The default value retains all previously available properties to ensure
backward compatibility.
- As part of making the properties configurable, the empty string constants
in KinesisRecord.java were replaced with unique, descriptive keys (e.g.,
"kinesis.arrival.timestamp"), fixing the data loss bug.
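One way the comma-separated `kinesisRecordProperties` value could be applied is sketched below. This is an illustration under stated assumptions rather than the code in this PR: the class and method names are hypothetical, and of the three keys only `"kinesis.arrival.timestamp"` appears in the description; the other two are made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class RecordPropertyFilter {
    static final String ARRIVAL_TIMESTAMP = "kinesis.arrival.timestamp";
    static final String PARTITION_KEY = "kinesis.partition.key";      // hypothetical key
    static final String SEQUENCE_NUMBER = "kinesis.sequence.number";  // hypothetical key

    // Splits a comma-separated config value into a set of trimmed, non-empty names.
    static Set<String> parseAllowed(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    // Keeps only the properties whose keys are listed in the config value.
    static Map<String, String> select(Map<String, String> all, String csv) {
        Set<String> allowed = parseAllowed(csv);
        Map<String, String> selected = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : all.entrySet()) {
            if (allowed.contains(entry.getKey())) {
                selected.put(entry.getKey(), entry.getValue());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, String> all = new LinkedHashMap<>();
        all.put(ARRIVAL_TIMESTAMP, "2025-07-01T00:00:00Z");
        all.put(PARTITION_KEY, "pk-1");
        all.put(SEQUENCE_NUMBER, "seq-42");

        // Only the two listed properties are kept; the partition key is dropped.
        System.out.println(select(all, "kinesis.arrival.timestamp, kinesis.sequence.number"));
    }
}
```

With unique keys, each property occupies its own map entry, so filtering by name is possible at all; with the former empty-string keys there would have been nothing to select between.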
### Verifying this change
- Added a unit test to cover this change
### Does this pull request potentially affect one of the following parts:
<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
*If the box was checked, please highlight the changes*
- [ ] Dependencies (add or upgrade a dependency)
- [ ] The public API
- [ ] The schema
- [ ] The default values of configurations
- [ ] The threading model
- [ ] The binary protocol
- [ ] The REST endpoints
- [ ] The admin CLI options
- [ ] The metrics
- [ ] Anything that affects deployment
### Documentation
<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
- [ ] `doc` <!-- Your PR contains doc changes. -->
- [x] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->
### Matching PR in forked repository
PR in forked repository: <!-- ENTER URL HERE -->
<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.
apache/pulsar pull requests should be first tested in your own fork since the
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in
a forked repository.
The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->