Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.4 #153

2023-08-07 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 495740 lines...]

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testControllerManagementMethods() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testTopicAssignmentMethods() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testTopicAssignmentMethods() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testConnectionViaNettyClient() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testConnectionViaNettyClient() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testPropagateIsrChanges() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testPropagateIsrChanges() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testControllerEpochMethods() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testControllerEpochMethods() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testDeleteRecursive() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testDeleteRecursive() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testGetTopicPartitionStates() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testGetTopicPartitionStates() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testCreateConfigChangeNotification() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testCreateConfigChangeNotification() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testDelegationTokenMethods() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
KafkaZkClientTest > testDelegationTokenMethods() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testUpdateExistingPartitions() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testUpdateExistingPartitions() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testEmptyWrite() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testEmptyWrite() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testExistingKRaftControllerClaim() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testExistingKRaftControllerClaim() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testReadAndWriteProducerId() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testReadAndWriteProducerId() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testMigrateTopicConfigs() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testMigrateTopicConfigs() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testNonIncreasingKRaftEpoch() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testNonIncreasingKRaftEpoch() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testMigrateEmptyZk() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testMigrateEmptyZk() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testWriteNewTopicConfigs() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testWriteNewTopicConfigs() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testClaimAndReleaseExistingController() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testClaimAndReleaseExistingController() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testWriteNewClientQuotas() STARTED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testWriteNewClientQuotas() PASSED

Gradle Test Run :core:integrationTest > Gradle Test Executor 168 > 
ZkMigrationClientTest > testClaimAbsentCont

Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2074

2023-08-07 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.5 #53

2023-08-07 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 281624 lines...]
Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
VersionedKeyValueStoreIntegrationTest > shouldRestore STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
VersionedKeyValueStoreIntegrationTest > shouldRestore PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
VersionedKeyValueStoreIntegrationTest > shouldPutGetAndDelete STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
VersionedKeyValueStoreIntegrationTest > shouldPutGetAndDelete PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
VersionedKeyValueStoreIntegrationTest > 
shouldManualUpgradeFromNonVersionedTimestampedToVersioned STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
VersionedKeyValueStoreIntegrationTest > 
shouldManualUpgradeFromNonVersionedTimestampedToVersioned PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
HandlingSourceTopicDeletionIntegrationTest > 
shouldThrowErrorAfterSourceTopicDeleted STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
HandlingSourceTopicDeletionIntegrationTest > 
shouldThrowErrorAfterSourceTopicDeleted PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorLargeNumConsumers 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorLargeNumConsumers 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargePartitionCount STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargePartitionCount PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyThreadsPerClient STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyThreadsPerClient PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyStandbys STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyStandbys PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyThreadsPerClient STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyThreadsPerClient PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyThreadsPerClient 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyThreadsPerClient 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargePartitionCount 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargePartitionCount 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorLargePartitionCount STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorLargePartitionCount PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyStandbys STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyStandbys PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorManyStandbys 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorManyStandbys PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargeNumConsumers 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargeNumConsumers 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 177 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorLargeNumConsumers STARTED

Gradle Test Run :streams:i

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.3 #184

2023-08-07 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 253727 lines...]

org.apache.kafka.streams.integration.StandbyTaskCreationIntegrationTest > 
shouldCreateStandByTasksForMaterializedAndOptimizedSourceTables STARTED

org.apache.kafka.streams.integration.StandbyTaskCreationIntegrationTest > 
shouldCreateStandByTasksForMaterializedAndOptimizedSourceTables PASSED

org.apache.kafka.streams.integration.StateDirectoryIntegrationTest > 
testNotCleanUpStateDirIfNotEmpty STARTED

org.apache.kafka.streams.integration.StateDirectoryIntegrationTest > 
testNotCleanUpStateDirIfNotEmpty PASSED

org.apache.kafka.streams.integration.StateDirectoryIntegrationTest > 
testCleanUpStateDirIfEmpty STARTED

org.apache.kafka.streams.integration.StateDirectoryIntegrationTest > 
testCleanUpStateDirIfEmpty PASSED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificActivePartitionStores STARTED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificActivePartitionStores PASSED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQueryAllStalePartitionStores STARTED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQueryAllStalePartitionStores PASSED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificStalePartitionStoresMultiStreamThreads STARTED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificStalePartitionStoresMultiStreamThreads PASSED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificStalePartitionStores STARTED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificStalePartitionStores PASSED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQueryOnlyActivePartitionStoresByDefault STARTED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQueryOnlyActivePartitionStoresByDefault PASSED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQueryStoresAfterAddingAndRemovingStreamThread STARTED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQueryStoresAfterAddingAndRemovingStreamThread PASSED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificStalePartitionStoresMultiStreamThreadsNamedTopology STARTED

org.apache.kafka.streams.integration.StoreQueryIntegrationTest > 
shouldQuerySpecificStalePartitionStoresMultiStreamThreadsNamedTopology PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuterRepartitioned[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuterRepartitioned[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuter[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuter[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeft[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeft[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testMultiInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testMultiInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeftRepartitioned[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeftRepartitioned[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testSelfJoin[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testSelfJoin[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuterRepartitioned[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrat

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.1 #149

2023-08-07 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 439977 lines...]

FetchRequestTestDowngrade > 
testTopicIdIsRemovedFromFetcherWhenControllerDowngrades() STARTED

SslAdminIntegrationTest > testAclDescribe() PASSED

SslAdminIntegrationTest > testLegacyAclOpsNeverAffectOrReturnPrefixed() STARTED

FetchRequestTestDowngrade > 
testTopicIdIsRemovedFromFetcherWhenControllerDowngrades() PASSED

SslAdminIntegrationTest > testLegacyAclOpsNeverAffectOrReturnPrefixed() PASSED

SslAdminIntegrationTest > testCreateTopicsResponseMetadataAndConfig() STARTED

SslAdminIntegrationTest > testCreateTopicsResponseMetadataAndConfig() PASSED

SslAdminIntegrationTest > testAttemptToCreateInvalidAcls() STARTED

SslAdminIntegrationTest > testAttemptToCreateInvalidAcls() PASSED

SslAdminIntegrationTest > testAclAuthorizationDenied() STARTED

SslAdminIntegrationTest > testAclAuthorizationDenied() PASSED

SslAdminIntegrationTest > testAclOperations() STARTED

SslAdminIntegrationTest > testAclOperations() PASSED

SslAdminIntegrationTest > testAclOperations2() STARTED

SslAdminIntegrationTest > testAclOperations2() PASSED

SslAdminIntegrationTest > testAclDelete() STARTED

> Task :streams:integrationTest

org.apache.kafka.streams.integration.EosV2UpgradeIntegrationTest > 
shouldUpgradeFromEosAlphaToEosV2[true] PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargeNumConsumers STARTED

> Task :core:integrationTest

SslAdminIntegrationTest > testAclDelete() PASSED

SslAdminIntegrationTest > 
testAsynchronousAuthorizerAclUpdatesDontBlockRequestThreads() STARTED

SslAdminIntegrationTest > 
testAsynchronousAuthorizerAclUpdatesDontBlockRequestThreads() PASSED

SslAdminIntegrationTest > 
testSynchronousAuthorizerAclUpdatesBlockRequestThreads() STARTED

SslAdminIntegrationTest > 
testSynchronousAuthorizerAclUpdatesBlockRequestThreads() PASSED

SslAdminIntegrationTest > testAclUpdatesUsingAsynchronousAuthorizer() STARTED

> Task :streams:integrationTest

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargeNumConsumers PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargePartitionCount STARTED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargePartitionCount PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyThreadsPerClient STARTED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyThreadsPerClient PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyStandbys STARTED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyStandbys PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyThreadsPerClient STARTED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyThreadsPerClient PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyThreadsPerClient STARTED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyThreadsPerClient PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorLargePartitionCount STARTED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorLargePartitionCount PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testStickyTaskAssignorLargePartitionCount STARTED

> Task :core:integrationTest

SslAdminIntegrationTest > testAclUpdatesUsingAsynchronousAuthorizer() PASSED

SslAdminIntegrationTest > testAclUpdatesUsingSynchronousAuthorizer() STARTED

SslAdminIntegrationTest > testAclUpdatesUsingSynchronousAuthorizer() PASSED

TransactionsExpirationTest > 
testBumpTransactionalEpochAfterInvalidProducerIdMapping() STARTED

TransactionsExpirationTest > 
testBumpTransactionalEpochAfterInvalidProducerIdMapping() PASSED

DescribeAuthorizedOperationsTest > testClusterAuthorizedOperations() STARTED

> Task :streams:integrationTest

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testStickyTaskAssignorLargePartitionCount PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyStandbys STARTED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyStandbys PASSED

org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorMany

Re: [DISCUSS] KIP-955: Add stream-table join on foreign key

2023-08-07 Thread Igor Fomenko
Hi Matthias,



Thanks for your comments.



I would like to clarify the use case a little more, to show why the existing
table-table foreign key join will not work for the use case I am trying to
address.

Let’s consider a very simple use case with parent messages in one
Kafka topic (‘order event’ messages that also contain some key order info)
and child messages in another topic (‘order items’ messages with
additional info for the order). The relationship between parent and child
messages is 1:1. Also, the ‘order items’ message has OrderID as one of its
fields (the foreign key).



The requirement is to combine info of the parent ‘order event’ message with
the child ‘order items’ message using the foreign key, and to send it only once
to the target system as one ‘complete order’ message for each new ‘order
event’ message.

Please note that new messages related to order items (create, update,
delete) should not trigger the resulting ‘complete order’ message.



From the above requirements we can state the following:

1. Order events are unique and never updated or deleted; they can only
be replayed if we need to recover the event stream. For our order example I
would use OrderID as the event key, but if we use a KTable to represent
events then events with the same OrderID will overwrite each other. This
may or may not cause issues, but modelling events as a stream seems to be
the more correct approach, at least from a performance point of view.

2. We do not want updates from the “order items” table on the right
side of the join to generate an output, since only events should be the
trigger for output messages in our scenario. This is aligned with
stream-table join behavior rather than table-table join, where updates
come from both sides.

3. A stream-table join will give us a resulting stream, which is more
aligned with our output requirements than the table that would result
from a table-table join.



Requirement #2 above is the most important one, and it cannot be achieved
with the existing table-table join on foreign key.



I also stated that the foreign key table in the table-table join is on the
‘wrong’ side for our order management use case. By this I just meant that in
the stream-table join I am proposing, the foreign key table needs to be on
the right side, whereas in the existing table-table join it is on the left. This
is however irrelevant, since we cannot use a table-table join anyway, for
reason #2 above.



You made a good point about aggregation of child messages for the more
complex use case of a 1:n relation between parent and children. Initially I
was thinking that aggregation would just be a separate operation that could
be added after we performed the foreign key join. Now I realize that it will
not be possible to do it afterwards.

Maybe there could be a flag on the stream-table foreign key join that
indicates whether we want the join to aggregate children or not?



What do you think?

Regards,



Igor


On Fri, Aug 4, 2023 at 10:01 PM Matthias J. Sax  wrote:

> Thanks a lot for providing more background. It's getting much clearer to
> me now.
>
> Couple of follow up questions:
>
>
> > It is not possible to use table-table join in this case because
> triggering
> > events are supplied separately from the actual data entity that needs to
> be
> > "assembled" and these events could only be presented as KStream due to
> > their nature.
>
> Not sure if I understand this part? Why can't those events be
> represented as a KTable? You say "could only be presented as KStream due
> to their nature" -- what do you mean by this?
>
> In the end, my understanding is the following (using the example for the
> KIP):
>
> For the shipments <-> orders and order-details <-> orders join, shipment
> and order-details are the fact tables, which is "reverse" to what you
> want? Using existing FK join, it would mean you get two enriched tables,
> that you cannot join to each other any further (because we don't support
> n:m join): in the end, shipmentId+orderDetailId would be the PK of such
> a n:m join?
>
> If that's correct, (just for the purpose to make sure I understand
> correctly), if we would add an n:m join, you could join shipment <->
> order-details first, and use a FK join to enrich the result with orders.
> -- In addition, you could also do a FK join to event if you represent
> events as a table (this relates to my question from above, why events
> cannot be represented as a KTable).
>
>
> As for the KIP itself, I am still wondering about details: if we get an event
> in, and we do a lookup into the "FK table" and find multiple matches,
> would we emit multiple results? This would kinda defeat the purpose to
> re-assemble everything into a single entity? (And it might require an
> additional aggregation downstream to put the entity together.) -- Or
> would we join the single event with all found table rows, and emit a
> single "enriched" event?
>
>
> Thus, I am actually wondering, if you would not pre-proces
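
To picture the API shape under discussion: a stream-table foreign-key join
could look roughly like the sketch below. This API does not exist in Kafka
Streams today; the foreign-key extractor parameter and the OrderEvent,
OrderItems, and CompleteOrder types are hypothetical, made up for
illustration.

// Hypothetical sketch -- Kafka Streams has no stream-table FK join today.
StreamsBuilder builder = new StreamsBuilder();
KStream<String, OrderEvent> orderEvents = builder.stream("order-events"); // keyed by OrderID
KTable<String, OrderItems> orderItems = builder.table("order-items");     // carries OrderID as a field

// Only stream-side events trigger output; table updates only serve lookups.
// The extractor pulls the foreign key (OrderID) out of the table value,
// mirroring the extractor of the existing KTable-KTable foreign-key join.
KStream<String, CompleteOrder> completeOrders = orderEvents.join(
    orderItems,
    items -> items.orderId(),                          // hypothetical FK extractor
    (event, items) -> new CompleteOrder(event, items)  // joiner producing the "complete order"
);
completeOrders.to("complete-orders");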

[jira] [Created] (KAFKA-15312) FileRawSnapshotWriter must flush before atomic move

2023-08-07 Thread Jira
José Armando García Sancio created KAFKA-15312:
--

 Summary: FileRawSnapshotWriter must flush before atomic move
 Key: KAFKA-15312
 URL: https://issues.apache.org/jira/browse/KAFKA-15312
 Project: Kafka
  Issue Type: Bug
  Components: kraft
Reporter: José Armando García Sancio
Assignee: José Armando García Sancio
 Fix For: 3.6.0


Not all file systems fsync to disk on close. For KRaft to guarantee that the 
data has made it to disk before calling rename, it needs to make sure that the 
file has been fsynced.

We have seen cases where the snapshot file has zero-length data on the ext4 
file system.
{quote} "Delayed allocation" means that the filesystem tries to delay the 
allocation of physical disk blocks for written data for as long as possible. 
This policy brings some important performance benefits. Many files are 
short-lived; delayed allocation can keep the system from writing fleeting 
temporary files to disk at all. And, for longer-lived files, delayed allocation 
allows the kernel to accumulate more data and to allocate the blocks for data 
contiguously, speeding up both the write and any subsequent reads of that data. 
It's an important optimization which is found in most contemporary filesystems.

But, if blocks have not been allocated for a file, there is no need to write 
them quickly as a security measure. Since the blocks do not yet exist, it is 
not possible to read somebody else's data from them. So ext4 will not (cannot) 
write out unallocated blocks as part of the next journal commit cycle. Those 
blocks will, instead, wait until the kernel decides to flush them out; at that 
point, physical blocks will be allocated on disk and the data will be made 
persistent. The kernel doesn't like to let file data sit unwritten for too 
long, but it can still take a minute or so (with the default settings) for that 
data to be flushed - far longer than the five seconds normally seen with ext3. 
And that is why a crash can cause the loss of quite a bit more data when ext4 
is being used. 
{quote}
from: [https://lwn.net/Articles/322823/]
{quote}auto_da_alloc(*), noauto_da_alloc

Many broken applications don't use fsync() when replacing existing files via 
patterns such as fd = open("foo.new")/write(fd,..)/close(fd)/ rename("foo.new", 
"foo"), or worse yet, fd = open("foo", O_TRUNC)/write(fd,..)/close(fd). If 
auto_da_alloc is enabled, ext4 will detect the replace-via-rename and 
replace-via-truncate patterns and force that any delayed allocation blocks are 
allocated such that at the next journal commit, in the default data=ordered 
mode, the data blocks of the new file are forced to disk before the rename() 
operation is committed. This provides roughly the same level of guarantees as 
ext3, and avoids the "zero-length" problem that can happen when a system 
crashes before the delayed allocation blocks are forced to disk.
{quote}
from: [https://www.kernel.org/doc/html/latest/admin-guide/ext4.html]

 





Re: [QUESTION] What is the difference between sequence and offset for a Record?

2023-08-07 Thread Justine Olshan
The sequence summary looks right to me.
For log normalization, are you referring to compaction? The segment's first
and last offsets might change, but a batch keeps its offsets when
compaction occurs.

Hope that helps.
Justine

On Mon, Aug 7, 2023 at 8:59 AM Matthias J. Sax  wrote:

> > but the base offset may change during log normalizing.
>
> Not sure what you mean by "normalization" but offsets are immutable, so
> they don't change. (To be fair, I am not an expert on brokers, so not
> sure how this works in detail when log compaction kicks in).
>
> > This field is given by the producer and the broker should only read it.
>
> Sounds right. The point is that the broker has an "expected"
> value for it, and if the provided value does not match the expected one,
> the write is rejected to begin with.
>
>
> -Matthias
>
> On 8/7/23 6:35 AM, tison wrote:
> > Hi Matthias and Justine,
> >
> > Thanks for your reply!
> >
> > I can summarize the answer as -
> >
> > Record offset = base offset + offset delta. This field is calculated by
> the
> > broker and the delta won't change but the base offset may change during
> log
> > normalizing.
> > Record sequence = base sequence + (offset) delta. This field is given by
> > the producer and the broker should only read it.
> >
> > Is it correct?
> >
> > I implement the manipulation part of base offset following this
> > understanding at [1].
> >
> > Best,
> > tison.
> >
> > [1]
> >
> https://github.com/tisonkun/kafka-api/blob/d080ab7e4b57c0ab0182e0b254333f400e616cd2/simplesrv/src/lib.rs#L391-L394
> >
> >
> > Justine Olshan  于2023年8月2日周三 04:19写道:
> >
> >> For what it's worth -- the sequence number is not calculated
> >> "baseOffset/baseSequence + offset delta" but rather by monotonically
> >> increasing for a given epoch. If the epoch is bumped, we reset back to
> >> zero.
> >> This may mean that the offset and sequence may match, but do not
> strictly
> >> need to be the same. The sequence number will also always come from the
> >> client and be in the produce records sent to the Kafka broker.
> >>
> >> As for offsets, there is some code in the log layer that maintains the
> log
> >> end offset and assigns offsets to the records. The produce handling on
> the
> >> leader should typically assign the offset.
> >> I believe you can find that code here:
> >>
> >>
> https://github.com/apache/kafka/blob/b9a45546a7918799b6fb3c0fe63b56f47d8fcba9/core/src/main/scala/kafka/log/UnifiedLog.scala#L766
> >>
> >> Justine
> >>
> >> On Tue, Aug 1, 2023 at 11:38 AM Matthias J. Sax 
> wrote:
> >>
> >>> The _offset_ is the position of the record in the partition.
> >>>
> >>> The _sequence number_ is a unique ID that allows broker to de-duplicate
> >>> messages. It requires the producer to implement the idempotency
> protocol
> >>> (part of Kafka transactions); thus, sequence numbers are optional and
> as
> >>> long as you don't want to support idempotent writes, you don't need to
> >>> worry about them. (If you want to dig into details, checkout KIP-98
> that
> >>> is the original KIP about Kafka TX).
> >>>
> >>> HTH,
> >>> -Matthias
> >>>
> >>> On 8/1/23 2:19 AM, tison wrote:
>  Hi,
> 
>  I'm writing a Kafka API Rust codec library[1] to understand how Kafka
>  models its concepts and how the core business logic works.
> 
>  While implementing the codec for Records[2], I saw a pair of fields
>  "sequence" and "offset". Both of them are calculated by
>  baseOffset/baseSequence + offset delta. Then I'm a bit confused how to
> >>> deal
>  with them properly - what's the difference between these two concepts
>  logically?
> 
>  Also, to understand how the core business logic works, I write a
> simple
>  server based on my codec library, and observe that the server may need
> >> to
>  update offset for records produced. How does Kafka set the correct
> >> offset
>  for each produced records? And how does Kafka maintain the calculation
> >>> for
>  offset and sequence during these modifications?
> 
>  I'll appreciate if anyone can answer the question or give some
> insights
> >>> :D
> 
>  Best,
>  tison.
> 
>  [1] https://github.com/tisonkun/kafka-api
>  [2] https://kafka.apache.org/documentation/#messageformat
> 
> >>>
> >>
> >
>
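
Putting the arithmetic from this thread into code, a rough sketch (the class
and field names are illustrative, not Kafka's actual record classes; real
Kafka also wraps sequences around Integer.MAX_VALUE, which is omitted here):

// Illustrative sketch of batch-level fields; names are hypothetical.
final class BatchHeader {
    final long baseOffset;   // assigned by the broker when the batch is appended
    final int baseSequence;  // assigned by the producer; -1 when idempotence is off

    BatchHeader(long baseOffset, int baseSequence) {
        this.baseOffset = baseOffset;
        this.baseSequence = baseSequence;
    }

    // Record offset = base offset + per-record delta:
    // the broker-assigned position of the record in the partition.
    long recordOffset(int offsetDelta) {
        return baseOffset + offsetDelta;
    }

    // Record sequence = base sequence + the same delta: producer-assigned,
    // used only for de-duplication. It increases monotonically within a
    // producer epoch and resets to zero when the epoch is bumped.
    int recordSequence(int offsetDelta) {
        return baseSequence < 0 ? -1 : baseSequence + offsetDelta;
    }
}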


[jira] [Created] (KAFKA-15311) Docs incorrectly state that reverting to ZooKeeper mode during the migration is not possible

2023-08-07 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-15311:


 Summary: Docs incorrectly state that reverting to ZooKeeper mode 
during the migration is not possible
 Key: KAFKA-15311
 URL: https://issues.apache.org/jira/browse/KAFKA-15311
 Project: Kafka
  Issue Type: Bug
Reporter: Colin McCabe








Re: [QUESTION] What is the difference between sequence and offset for a Record?

2023-08-07 Thread Matthias J. Sax

but the base offset may change during log normalizing.


Not sure what you mean by "normalization" but offsets are immutable, so 
they don't change. (To be fair, I am not an expert on brokers, so not 
sure how this works in detail when log compaction kicks in).



This field is given by the producer and the broker should only read it.


Sounds right. The point is that the broker has an "expected" 
value for it, and if the provided value does not match the expected one, 
the write is rejected to begin with.



-Matthias

On 8/7/23 6:35 AM, tison wrote:

Hi Matthias and Justine,

Thanks for your reply!

I can summarize the answer as -

Record offset = base offset + offset delta. This field is calculated by the
broker and the delta won't change but the base offset may change during log
normalizing.
Record sequence = base sequence + (offset) delta. This field is given by
the producer and the broker should only read it.

Is it correct?

I implement the manipulation part of base offset following this
understanding at [1].

Best,
tison.

[1]
https://github.com/tisonkun/kafka-api/blob/d080ab7e4b57c0ab0182e0b254333f400e616cd2/simplesrv/src/lib.rs#L391-L394


Justine Olshan  于2023年8月2日周三 04:19写道:


For what it's worth -- the sequence number is not calculated
"baseOffset/baseSequence + offset delta" but rather by monotonically
increasing for a given epoch. If the epoch is bumped, we reset back to
zero.
This may mean that the offset and sequence may match, but do not strictly
need to be the same. The sequence number will also always come from the
client and be in the produce records sent to the Kafka broker.

As for offsets, there is some code in the log layer that maintains the log
end offset and assigns offsets to the records. The produce handling on the
leader should typically assign the offset.
I believe you can find that code here:

https://github.com/apache/kafka/blob/b9a45546a7918799b6fb3c0fe63b56f47d8fcba9/core/src/main/scala/kafka/log/UnifiedLog.scala#L766

Justine

On Tue, Aug 1, 2023 at 11:38 AM Matthias J. Sax  wrote:


The _offset_ is the position of the record in the partition.

The _sequence number_ is a unique ID that allows broker to de-duplicate
messages. It requires the producer to implement the idempotency protocol
(part of Kafka transactions); thus, sequence numbers are optional and as
long as you don't want to support idempotent writes, you don't need to
worry about them. (If you want to dig into details, checkout KIP-98 that
is the original KIP about Kafka TX).

HTH,
-Matthias

On 8/1/23 2:19 AM, tison wrote:

Hi,

I'm writing a Kafka API Rust codec library[1] to understand how Kafka
models its concepts and how the core business logic works.

While implementing the codec for Records[2], I saw a pair of fields
"sequence" and "offset". Both of them are calculated by
baseOffset/baseSequence + offset delta. Then I'm a bit confused how to

deal

with them properly - what's the difference between these two concepts
logically?

Also, to understand how the core business logic works, I write a simple
server based on my codec library, and observe that the server may need

to

update offset for records produced. How does Kafka set the correct

offset

for each produced records? And how does Kafka maintain the calculation

for

offset and sequence during these modifications?

I'll appreciate if anyone can answer the question or give some insights

:D


Best,
tison.

[1] https://github.com/tisonkun/kafka-api
[2] https://kafka.apache.org/documentation/#messageformat









Re: [QUESTION] What is the difference between sequence and offset for a Record?

2023-08-07 Thread tison
Hi Matthias and Justine,

Thanks for your reply!

I can summarize the answer as -

Record offset = base offset + offset delta. This field is calculated by the
broker and the delta won't change but the base offset may change during log
normalizing.
Record sequence = base sequence + (offset) delta. This field is given by
the producer and the broker should only read it.

Is it correct?

I implement the manipulation part of base offset following this
understanding at [1].

Best,
tison.

[1]
https://github.com/tisonkun/kafka-api/blob/d080ab7e4b57c0ab0182e0b254333f400e616cd2/simplesrv/src/lib.rs#L391-L394


Justine Olshan  于2023年8月2日周三 04:19写道:

> For what it's worth -- the sequence number is not calculated
> "baseOffset/baseSequence + offset delta" but rather by monotonically
> increasing for a given epoch. If the epoch is bumped, we reset back to
> zero.
> This may mean that the offset and sequence may match, but do not strictly
> need to be the same. The sequence number will also always come from the
> client and be in the produce records sent to the Kafka broker.
>
> As for offsets, there is some code in the log layer that maintains the log
> end offset and assigns offsets to the records. The produce handling on the
> leader should typically assign the offset.
> I believe you can find that code here:
>
> https://github.com/apache/kafka/blob/b9a45546a7918799b6fb3c0fe63b56f47d8fcba9/core/src/main/scala/kafka/log/UnifiedLog.scala#L766
>
> Justine
>
> On Tue, Aug 1, 2023 at 11:38 AM Matthias J. Sax  wrote:
>
> > The _offset_ is the position of the record in the partition.
> >
> > The _sequence number_ is a unique ID that allows broker to de-duplicate
> > messages. It requires the producer to implement the idempotency protocol
> > (part of Kafka transactions); thus, sequence numbers are optional and as
> > long as you don't want to support idempotent writes, you don't need to
> > worry about them. (If you want to dig into details, checkout KIP-98 that
> > is the original KIP about Kafka TX).
> >
> > HTH,
> >-Matthias
> >
> > On 8/1/23 2:19 AM, tison wrote:
> > > Hi,
> > >
> > > I'm writing a Kafka API Rust codec library[1] to understand how Kafka
> > > models its concepts and how the core business logic works.
> > >
> > > While implementing the codec for Records[2], I saw a pair of fields
> > > "sequence" and "offset". Both of them are calculated by
> > > baseOffset/baseSequence + offset delta. Then I'm a bit confused how to
> > deal
> > > with them properly - what's the difference between these two concepts
> > > logically?
> > >
> > > Also, to understand how the core business logic works, I write a simple
> > > server based on my codec library, and observe that the server may need
> to
> > > update offset for records produced. How does Kafka set the correct
> offset
> > > for each produced records? And how does Kafka maintain the calculation
> > for
> > > offset and sequence during these modifications?
> > >
> > > I'll appreciate if anyone can answer the question or give some insights
> > :D
> > >
> > > Best,
> > > tison.
> > >
> > > [1] https://github.com/tisonkun/kafka-api
> > > [2] https://kafka.apache.org/documentation/#messageformat
> > >
> >
>


Re: [VOTE] KIP-959 Add BooleanConverter to Kafka Connect

2023-08-07 Thread Hector Geraldino (BLOOMBERG/ 919 3RD A)
Hello,

I still need help from a committer to review/approve this (small) KIP, which 
adds a new BooleanConverter to the list of converters in Kafka Connect.

The KIP has a companion PR implementing the feature as well. 

Thanks again!
Sent from Bloomberg Professional for iPhone

- Original Message -
From: Hector Geraldino 
To: dev@kafka.apache.org
At: 08/01/23 11:48:23 UTC-04:00


Hi,

Still missing one binding vote for this (very small) KIP to pass :)

From: dev@kafka.apache.org At: 07/28/23 09:37:45 UTC-4:00 To:
dev@kafka.apache.org
Subject: Re: [VOTE] KIP-959 Add BooleanConverter to Kafka Connect

Hi everyone,

Thanks to everyone who has reviewed and voted for this KIP.

So far it has received 3 non-binding votes (Andrew Schofield, Yash Mayya, Kamal
Chandraprakash) and 2 binding votes (Chris Egerton, Greg Harris) - still shy of
one binding vote to pass.

Can we get help from a committer to push it through?

Thank you!
Hector

Sent from Bloomberg Professional for iPhone

- Original Message -
From: Greg Harris 
To: dev@kafka.apache.org
At: 07/26/23 12:23:20 UTC-04:00


Hey Hector,

Thanks for the straightforward and clear KIP!
+1 (binding)

Thanks,
Greg

On Wed, Jul 26, 2023 at 5:16 AM Chris Egerton  wrote:
>
> +1 (binding)
>
> Thanks Hector!
>
> On Wed, Jul 26, 2023 at 3:18 AM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > +1 (non-binding). Thanks for the KIP!
> >
> > On Tue, Jul 25, 2023 at 11:12 PM Yash Mayya  wrote:
> >
> > > Hi Hector,
> > >
> > > Thanks for the KIP!
> > >
> > > +1 (non-binding)
> > >
> > > Thanks,
> > > Yash
> > >
> > > On Tue, Jul 25, 2023 at 11:01 PM Andrew Schofield <
> > > andrew_schofield_j...@outlook.com> wrote:
> > >
> > > > Thanks for the KIP. As you say, not that controversial.
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Thanks,
> > > > Andrew
> > > >
> > > > > On 25 Jul 2023, at 18:22, Hector Geraldino (BLOOMBERG/ 919 3RD A) <
> > > > hgerald...@bloomberg.net> wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > The changes proposed by KIP-959 (Add BooleanConverter to Kafka
> > Connect)
> > > > have a limited scope and shouldn't be controversial. I'm opening a
> > voting
> > > > thread with the hope that it can be included in the next upcoming 3.6
> > > > release.
> > > > >
> > > > > Here are some links:
> > > > >
> > > > > KIP:
> > > >
> > >
> >
https://cwiki.apache.org/confluence/display/KAFKA/KIP-959%3A+Add+BooleanConverter+to+Kafka+Connect
> > > > > JIRA: https://issues.apache.org/jira/browse/KAFKA-15248
> > > > > Discussion thread:
> > > > https://lists.apache.org/thread/15c2t0kl9bozmzjxmkl5n57kv4l4o1dt
> > > > > Pull Request: https://github.com/apache/kafka/pull/14093
> > > > >
> > > > > Thanks!
> > > >
> > > >
> > > >
> > >
> >
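
For context on the size of the change: a Connect Converter implements three
methods. A rough sketch of what such a converter could look like, assuming a
one-byte wire format (an illustration only; the actual PR may differ in wire
format, schema handling, or configuration):

import java.util.Map;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaAndValue;
import org.apache.kafka.connect.errors.DataException;
import org.apache.kafka.connect.storage.Converter;

// Sketch: serialize Boolean values as a single byte (1 = true, 0 = false).
public class BooleanConverter implements Converter {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // no configuration needed for this sketch
    }

    @Override
    public byte[] fromConnectData(String topic, Schema schema, Object value) {
        if (value == null)
            return null;
        if (!(value instanceof Boolean))
            throw new DataException("BooleanConverter can only handle Boolean values");
        return new byte[] { (Boolean) value ? (byte) 1 : (byte) 0 };
    }

    @Override
    public SchemaAndValue toConnectData(String topic, byte[] value) {
        if (value == null)
            return new SchemaAndValue(Schema.OPTIONAL_BOOLEAN_SCHEMA, null);
        if (value.length != 1)
            throw new DataException("Expected exactly one byte, got " + value.length);
        return new SchemaAndValue(Schema.OPTIONAL_BOOLEAN_SCHEMA, value[0] != 0);
    }
}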


[DISCUSS] KIP-964: Have visibility when produce requests become "async".

2023-08-07 Thread Sergio Daniel Troiano
Hi everyone!

I would like to start a discuss thread for this KIP



Thanks


Re: new KIP doubt

2023-08-07 Thread Josep Prat
Hi Sergio,

Thanks for contributing to Apache Kafka!
You don't need any credentials to start a discussion thread. The discussion
thread is started by sending an email to this very same mailing list with a
subject like [DISCUSS] KIP-964: Have visibility when produce requests become
"async".
And then in the body state that you'd like to start a discuss thread for
this KIP and link to the KIP page. That should be it! Here you can find one
of the latest DISCUSS threads created:
https://lists.apache.org/thread/0f20kvfqkmhdqrwcb8vqgqn80szcrcdd

Let me know if you have any questions!

Best,


On Mon, Aug 7, 2023 at 2:20 PM Sergio Daniel Troiano
 wrote:

> hey everyone,
>
> Sorry for bothering you, I created a KIP a long time ago and now I am
> creating a new one here
> <
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263426647
> >
> The problem is I lost my apache credentials for starting a discussion
> thread, could you please tell me how I can recover them?
>
> Thanks in advance.
> Sergio Troiano
>


-- 
*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


[VOTE] KIP-953: partition method to be overloaded to accept headers as well.

2023-08-07 Thread Jack Tomy
Hey everyone.

I would like to call for a vote on KIP-953: partition method to be
overloaded to accept headers as well.

KIP :
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937
Discussion thread :
https://lists.apache.org/thread/0f20kvfqkmhdqrwcb8vqgqn80szcrcdd

Thanks
-- 
Best Regards
*Jack*
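
For readers following the vote, the gist of the proposal is an overload of
the Partitioner callback that also receives the record headers. A
hypothetical sketch of a partitioner using such an overload (the overload's
exact signature is defined in the KIP page above; this sketch is an
assumption for illustration, and the "priority" header is made up):

import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.utils.Utils;

// Routes records carrying a "priority: high" header to partition 0 and
// hashes the key for everything else.
public class PriorityPartitioner implements Partitioner {

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        return hashOfKey(topic, keyBytes, cluster);
    }

    // Headers-aware overload sketched per the KIP: same arguments plus headers.
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster,
                         Headers headers) {
        Header priority = headers.lastHeader("priority");
        if (priority != null
                && "high".equals(new String(priority.value(), StandardCharsets.UTF_8))) {
            return 0; // partition reserved for high-priority records
        }
        return hashOfKey(topic, keyBytes, cluster);
    }

    private int hashOfKey(String topic, byte[] keyBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        return keyBytes == null ? 0 : Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}

As the discussion thread below notes, a producer that already knows the
target partition can also set it directly with the existing API, e.g.
new ProducerRecord<>(topic, partition, key, value, headers).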


new KIP doubt

2023-08-07 Thread Sergio Daniel Troiano
hey everyone,

Sorry for bothering you, I created a KIP a long time ago and now I am
creating a new one here

The problem is I lost my apache credentials for starting a discussion
thread, could you please tell me how I can recover them?

Thanks in advance.
Sergio Troiano


Re: [DISCUSS] KIP-953: partition method to be overloaded to accept headers as well.

2023-08-07 Thread Jack Tomy
Hey everyone

I'm closing the discussion as I haven't heard back in 3 days.

Before I close the thread I would like to thank Sagar and Andrew for their
suggestions and feedback. I believe it has helped to improve the KIP.

Thank you all.

On Fri, Aug 4, 2023 at 10:26 AM Jack Tomy  wrote:

> All right, Thanks Andrew.
>
> Hey everyone,
> Please share your thoughts and feedback on the KIP :
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937
>
>
>
>
> On Fri, Aug 4, 2023 at 2:50 AM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
>> Hi Jack,
>> I do understand the idea of extending the Partitioner interface so that
>> people are now able to use headers in the partitioning decision, and I see
>> that it’s filling in gap in the interface which was left when headers were
>> originally added.
>>
>> Experience with non-default partitioning schemes in the past makes me
>> unlikely to use anything other than the default partitioning scheme.
>> But I wouldn’t let that dissuade you.
>>
>> Thanks,
>> Andrew
>>
>> > On 3 Aug 2023, at 13:43, Jack Tomy  wrote:
>> >
>> > Hey Andrew, Sagar
>> >
>> > Please share your thoughts. Thanks.
>> >
>> >
>> >
>> > On Mon, Jul 31, 2023 at 5:58 PM Jack Tomy 
>> wrote:
>> >
>> >> Hey Andrew, Sagar
>> >>
>> >> Thanks. I'm travelling so sorry for being brief and getting back late.
>> >>
>> >> 1. For the first concern, that is moving in a direction of server side
>> >> partitioner, the idea seems very much promising but I believe we still
>> have
>> >> a long way to go. Since the proposal/design for the same is still not
>> >> available, it's hard for me to defend my proposal.
>> >> 2.  For the second concern:
>> >> 2.1 Loss of order in messages, I believe the ordering of messages is
>> >> never promised and the partitioner has no requirement to ensure the
>> same.
>> >> It is up to the user to implement/use a partitioner which ensures
>> ordering
>> >> based on keys.
>> >> 2.2 Key deciding the partition: it is totally up to the user to
>> decide
>> >> the partition regardless of the key, we are also passing the value to
>> the
>> >> partitioner. Even the existing implementation receives the value which
>> lets
>> >> the user decide the partition based on value.
>> >> 2.3 Sending to a specific partition, for this, I need to be aware of
>> the
>> >> total number of partitions, but if I can do the same in the partitioner,
>> the
>> >> cluster param gives me all the information I want.
>> >>
>> >> I would also quote a line from KIP-82 where headers were added to the
>> >> serializer : The payload is traditionally for the business object, and
>> *headers
>> >> are traditionally used for transport routing*, filtering etc. So I
>> >> believe when a user wants to add some routing information (in this case
>> >> which set of partitions to go for), headers seem to be the right place.
>> >>
>> >>
>> >>
>> >> On Sat, Jul 29, 2023 at 8:48 PM Sagar 
>> wrote:
>> >>
>> >>> Hi Andrew,
>> >>>
>> >>> Thanks for your comments.
>> >>>
>> >>> 1) Yes that makes sense and that's what even would expect to see as
>> well.
>> >>> I
>> >>> just wanted to highlight that we might still need a way to let client
>> side
>> >>> partitioning logic be present as well. Anyways, I am good on this
>> point.
>> >>> 2) The example provided does seem achievable by simply attaching the
>> >>> partition number in the ProducerRecord. I guess if we can't find any
>> >>> further examples which strengthen the case of this partitioner, it
>> might
>> >>> be
>> >>> harder to justify adding it.
>> >>>
>> >>>
>> >>> Thanks!
>> >>> Sagar.
>> >>>
>> >>> On Fri, Jul 28, 2023 at 2:05 PM Andrew Schofield <
>> >>> andrew_schofield_j...@outlook.com> wrote:
>> >>>
>>  Hi Sagar,
>>  Thanks for your comments.
>> 
>>  1) Server-side partitioning doesn’t necessarily mean that there’s
>> only
>> >>> one
>>  way to do it. It just means that the partitioning logic runs on the
>> >>> broker
>>  and
>>  any configuration of partitioning applies to the broker’s
>> partitioner.
>> >>> If
>>  we ever
>>  see a KIP for this, that’s the kind of thing I would expect to see.
>> 
>>  2) In the priority example in the KIP, there is a kind of contract
>> >>> between
>>  the
>>  producers and consumers so that some records can be processed before
>>  others regardless of the order in which they were sent. The producer
>>  wants to apply special significance to a particular header to control
>> >>> which
>>  partition is used. I would simply achieve this by setting the
>> partition
>>  number
>>  in the ProducerRecord at the time of sending.
>> 
>>  I don’t think the KIP proposes adjusting the built-in partitioner or
>>  adding to AK
>>  a new one that uses headers in the partitioning decision. So, any
>>  configuration
>>  for a partitioner that does support headers would be up to the
>>  implementation
>>  of that specific par

Re: [DISCUSS] KIP-962 Relax non-null key requirement in Kafka Streams

2023-08-07 Thread Florin Akermann
Hi Lucas,

Thanks. I added the point about the upgrade guide as well.

Florin

On Mon, 7 Aug 2023 at 11:06, Lucas Brutschy 
wrote:

> Hi Florin,
>
> thanks for the KIP! This looks good to me. I agree that the precise
> Java doc wording doesn't have to be discussed as part of the KIP.
>
> I would also suggest to include an update to
> https://kafka.apache.org/documentation/streams/upgrade-guide
>
> Cheers,
> Lucas
>
> On Mon, Aug 7, 2023 at 10:51 AM Florin Akermann
>  wrote:
> >
> > Hi Both,
> >
> > Thanks.
> > I added remarks to account for this.
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams#KIP962:RelaxnonnullkeyrequirementinKafkaStreams-Remarks
> >
> > In short, let's add a note in the Java docs? The exact wording of the
> note
> > can be scrutinized in the pull request?
> >
> > What do you think?
> >
> >
> > On Sun, 6 Aug 2023 at 19:41, Guozhang Wang 
> > wrote:
> >
> > > I'm just thinking we can try to encourage users to migrate from XX to
> > > XXWithKey in the docs, giving this as one good example that the latter
> > > can help you distinguish different scenarios whereas the former
> > > cannot.
> > >
> > > On Fri, Aug 4, 2023 at 6:32 PM Matthias J. Sax 
> wrote:
> > > >
> > > > Guozhang,
> > > >
> > > > thanks for pointing out ValueJoinerWithKey. In the end, it's just a
> > > > documentation change, i.e., point out that the passed-in key could be
> > > > `null` and similar?
> > > >
> > > > -Matthias
> > > >
> > > >
> > > > On 8/2/23 3:20 PM, Guozhang Wang wrote:
> > > > > Thanks Florin for the writeup,
> > > > >
> > > > > One quick thing I'd like to bring up is that in KIP-149
> > > > > (
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> > > )
> > > > > we introduced ValueJoinerWithKey which is aimed to enhance
> > > > > ValueJoiner. It would have a benefit for this KIP such that
> > > > > implementers can distinguish "null-key" v.s. "not-null-key but
> > > > > null-value" scenarios.
> > > > >
> > > > > Hence I'd suggest we also include the semantic changes with
> > > > > ValueJoinerWithKey, which can help distinguish these two scenarios,
> > > > > and also document that if users apply ValueJoiner only, they may
> not
> > > > > have this benefit, and hence we suggest users to use the former.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Mon, Jul 31, 2023 at 12:11 PM Florin Akermann
> > > > >  wrote:
> > > > >>
> > > > >>
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams
> > >
>
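
To make the ValueJoinerWithKey point from this thread concrete: because the
joiner sees the key, a left-join result can distinguish a null-key record
from a missing right-side match. A minimal sketch under the KIP's proposed
semantics (topic names and the enrichment format are made up; today,
null-key stream records are dropped before the joiner runs):

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.ValueJoinerWithKey;

public class LeftJoinNullKeyExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> orders = builder.stream("orders");
        KTable<String, String> customers = builder.table("customers");

        // The key parameter lets the joiner tell "record had no key" apart from
        // "key present but no matching customer" -- a plain ValueJoiner cannot.
        ValueJoinerWithKey<String, String, String, String> joiner =
            (key, order, customer) -> {
                if (key == null) return order + " [null key: never joinable]";
                if (customer == null) return order + " [no matching customer]";
                return order + " -> " + customer;
            };

        orders.leftJoin(customers, joiner).to("enriched-orders");
    }
}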


Re: [DISCUSS] KIP-962 Relax non-null key requirement in Kafka Streams

2023-08-07 Thread Lucas Brutschy
Hi Florin,

thanks for the KIP! This looks good to me. I agree that the precise
Java doc wording doesn't have to be discussed as part of the KIP.

I would also suggest to include an update to
https://kafka.apache.org/documentation/streams/upgrade-guide

Cheers,
Lucas

On Mon, Aug 7, 2023 at 10:51 AM Florin Akermann
 wrote:
>
> Hi Both,
>
> Thanks.
> I added remarks to account for this.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams#KIP962:RelaxnonnullkeyrequirementinKafkaStreams-Remarks
>
> In short, let's add a note in the Java docs? The exact wording of the note
> can be scrutinized in the pull request?
>
> What do you think?
>
>
> On Sun, 6 Aug 2023 at 19:41, Guozhang Wang 
> wrote:
>
> > I'm just thinking we can try to encourage users to migrate from XX to
> > XXWithKey in the docs, giving this as one good example that the latter
> > can help you distinguish different scenarios whereas the former
> > cannot.
> >
> > On Fri, Aug 4, 2023 at 6:32 PM Matthias J. Sax  wrote:
> > >
> > > Guozhang,
> > >
> > > thanks for pointing out ValueJoinerWithKey. In the end, it's just a
> > > documentation change, i.e., point out that the passed-in key could be
> > > `null` and similar?
> > >
> > > -Matthias
> > >
> > >
> > > On 8/2/23 3:20 PM, Guozhang Wang wrote:
> > > > Thanks Florin for the writeup,
> > > >
> > > > One quick thing I'd like to bring up is that in KIP-149
> > > > (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> > )
> > > > we introduced ValueJoinerWithKey which is aimed to enhance
> > > > ValueJoiner. It would have a benefit for this KIP such that
> > > > implementers can distinguish "null-key" v.s. "not-null-key but
> > > > null-value" scenarios.
> > > >
> > > > Hence I'd suggest we also include the semantic changes with
> > > > ValueJoinerWithKey, which can help distinguish these two scenarios,
> > > > and also document that if users apply ValueJoiner only, they may not
> > > > have this benefit, and hence we suggest users to use the former.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Mon, Jul 31, 2023 at 12:11 PM Florin Akermann
> > > >  wrote:
> > > >>
> > > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams
> >


Re: [DISCUSS] KIP-962 Relax non-null key requirement in Kafka Streams

2023-08-07 Thread Florin Akermann
Hi Both,

Thanks.
I added remarks to account for this.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams#KIP962:RelaxnonnullkeyrequirementinKafkaStreams-Remarks

In short, let's add a note in the Java docs? The exact wording of the note
can be scrutinized in the pull request?

What do you think?


On Sun, 6 Aug 2023 at 19:41, Guozhang Wang 
wrote:

> I'm just thinking we can try to encourage users to migrate from XX to
> XXWithKey in the docs, giving this as one good example that the latter
> can help you distinguish different scenarios whereas the former
> cannot.
>
> On Fri, Aug 4, 2023 at 6:32 PM Matthias J. Sax  wrote:
> >
> > Guozhang,
> >
> > thanks for pointing out ValueJoinerWithKey. In the end, it's just a
> > documentation change, i.e., point out that the passed-in key could be
> > `null` and similar?
> >
> > -Matthias
> >
> >
> > On 8/2/23 3:20 PM, Guozhang Wang wrote:
> > > Thanks Florin for the writeup,
> > >
> > > One quick thing I'd like to bring up is that in KIP-149
> > > (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> )
> > > we introduced ValueJoinerWithKey which is aimed to enhance
> > > ValueJoiner. It would have a benefit for this KIP such that
> > > implementers can distinguish "null-key" v.s. "not-null-key but
> > > null-value" scenarios.
> > >
> > > Hence I'd suggest we also include the semantic changes with
> > > ValueJoinerWithKey, which can help distinguish these two scenarios,
> > > and also document that if users apply ValueJoiner only, they may not
> > > have this benefit, and hence we suggest users to use the former.
> > >
> > >
> > > Guozhang
> > >
> > > On Mon, Jul 31, 2023 at 12:11 PM Florin Akermann
> > >  wrote:
> > >>
> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams
>


Re: [VOTE] KIP-940: Broker extension point for validating record contents at produce time

2023-08-07 Thread Andrew Schofield
Hi Adrian,
Thanks for the KIP. Looks like a nice improvement.

+1 (non-binding)

Thanks,
Andrew

> On 2 Aug 2023, at 12:33, Adrian Preston  wrote:
>
> Hello all,
>
>
>
> Edo and I would like to call for a vote on KIP-940:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-940%3A+Broker+extension+point+for+validating+record+contents+at+produce+time
>
>
>
> Thanks,
>
> Adrian
>