ad1happy2go commented on issue #10995:
URL: https://github.com/apache/hudi/issues/10995#issuecomment-2058447334
@brightwon Thanks for the update.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
brightwon closed issue #10995: [SUPPORT] Issue with Repartition on Kafka Input
DataFrame and Same Precombine Value Rows In One Batch
URL: https://github.com/apache/hudi/issues/10995
ad1happy2go commented on issue #10995:
URL: https://github.com/apache/hudi/issues/10995#issuecomment-2056550465
@brightwon Do you have any more questions? Feel free to close the issue if you
are all set. Thanks.
ad1happy2go commented on issue #10995:
URL: https://github.com/apache/hudi/issues/10995#issuecomment-2051561910
@brightwon Yes, changing the precombine key is not allowed. I understand you
are trying to repartition to scale the tagging stage. You can try
repartitioning on the record key and see if that helps.
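To illustrate why repartitioning on the record key is safe here: hash-partitioning routes every row with the same key to the same partition, so rows that must be precombined against each other still land together. This is a plain-Python sketch of that property, not Spark or Hudi code; all names are hypothetical.

```python
# Hash-partition rows by a key function, the same basic scheme Spark's
# repartition(col) uses. Rows sharing a record key always end up in the
# same partition, so per-key precombine still sees them side by side.
def partition_by(rows, key_fn, num_partitions):
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(key_fn(row)) % num_partitions].append(row)
    return partitions

rows = [
    {"record_key": "a", "ts": 1},
    {"record_key": "b", "ts": 1},
    {"record_key": "a", "ts": 2},  # duplicate record key within the batch
]

parts = partition_by(rows, lambda r: r["record_key"], num_partitions=4)
# Both "a" rows land in exactly one (the same) partition.
a_parts = {i for i, p in enumerate(parts)
           if any(r["record_key"] == "a" for r in p)}
assert len(a_parts) == 1
```

Repartitioning on an unrelated column gives no such guarantee: duplicate keys can scatter across partitions, which is what breaks precombine within a batch.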
brightwon commented on issue #10995:
URL: https://github.com/apache/hudi/issues/10995#issuecomment-2050814895
@ad1happy2go Thank you for your reply.
What I want is to speed up the tagging stage. Could you suggest a solution?
I can achieve this by using repartition with a completely unrelated column.
ad1happy2go commented on issue #10995:
URL: https://github.com/apache/hudi/issues/10995#issuecomment-2050027649
@brightwon I do understand your issue. The precombine key should be more of an
ordering field and ideally should contain different values for the same record
key. In your case, if the precombine
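A plain-Python sketch of what precombine does within a single batch (not Hudi internals; field names `id` and `ts` are hypothetical): for each record key, only the row with the largest precombine value survives, which is why a precombine field that never varies between duplicates cannot pick a winner deterministically.

```python
# Per-batch precombine: keep, for each record key, the row with the
# greatest value in the precombine (ordering) field.
def precombine(rows, record_key, precombine_field):
    latest = {}
    for row in rows:
        k = row[record_key]
        if k not in latest or row[precombine_field] > latest[k][precombine_field]:
            latest[k] = row
    return list(latest.values())

batch = [
    {"id": "u1", "ts": 10, "name": "old"},
    {"id": "u1", "ts": 20, "name": "new"},   # same key, later ts wins
    {"id": "u2", "ts": 5,  "name": "only"},
]
deduped = precombine(batch, "id", "ts")
assert sorted(r["name"] for r in deduped) == ["new", "only"]
```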
brightwon opened a new issue, #10995:
URL: https://github.com/apache/hudi/issues/10995
**Describe the problem you faced**
I'm operating a typical Hudi workload that uses Spark Structured Streaming to
read CDC events from Kafka and perform upserts into S3.
I've encountered
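For context, a job like the one described typically configures the Hudi writer with a record key and a precombine field. This is a minimal sketch of those write options (option names are from the Hudi documentation; the table and field names here are hypothetical):

```python
# Assumed table/field names for illustration only.
hudi_options = {
    "hoodie.table.name": "cdc_table",
    "hoodie.datasource.write.recordkey.field": "id",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.operation": "upsert",
}
# In the streaming job these would be passed along the lines of:
#   df.writeStream.format("hudi").options(**hudi_options)...
assert hudi_options["hoodie.datasource.write.operation"] == "upsert"
```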