SimonChou12138 commented on issue #6457: URL: https://github.com/apache/seatunnel/issues/6457#issuecomment-1980245511
Should we consider letting the Paimon connector support concurrent writes? At present each Writer writes to multiple buckets at the same time, so parallel Writers can touch the same bucket, which causes write conflicts during the final commit. The official Paimon API makes it possible to determine which bucket a row belongs to, build a Writer for that bucket, and write the row through it, so that no write conflicts occur at commit time. The `write-only` table option of Paimon can also be used to work around the conflict, but it has drawbacks: compaction and snapshot expiration are skipped, so small files accumulate and expired data is not deleted. I have confirmed that the per-bucket approach can be implemented with the Java API, and I tried applying it to the existing Paimon connector. The test in the example passes, but problems remain when testing against actual tables in a production environment. My conclusion is that changing the write process alone cannot completely avoid the problem; the bucket has to be computed and the rows routed at an upstream level of the data flow.
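
To make the routing idea concrete, here is a minimal, self-contained sketch. It is not the Paimon or SeaTunnel API: `BucketRouter`, `bucketOf`, `writerOf`, and `route` are hypothetical names, and `String.hashCode()` is only a stand-in for however Paimon actually derives the bucket from the bucket key. The point it illustrates is that if every bucket is owned by exactly one writer, no two writers commit files for the same bucket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these names come from Paimon or SeaTunnel. */
final class BucketRouter {
    private final int numBuckets;
    private final int numWriters;

    BucketRouter(int numBuckets, int numWriters) {
        if (numBuckets <= 0 || numWriters <= 0) {
            throw new IllegalArgumentException("numBuckets and numWriters must be positive");
        }
        this.numBuckets = numBuckets;
        this.numWriters = numWriters;
    }

    /** Assumption: stand-in hash, not the hash Paimon applies to the bucket key. */
    int bucketOf(String bucketKey) {
        return Math.floorMod(bucketKey.hashCode(), numBuckets);
    }

    /** Every bucket maps to exactly one writer, so writers never share a bucket. */
    int writerOf(int bucket) {
        return bucket % numWriters;
    }

    /** Groups rows (represented here by their bucket key) by the writer that owns their bucket. */
    Map<Integer, List<String>> route(List<String> bucketKeys) {
        Map<Integer, List<String>> perWriter = new HashMap<>();
        for (String key : bucketKeys) {
            int writer = writerOf(bucketOf(key));
            perWriter.computeIfAbsent(writer, w -> new ArrayList<>()).add(key);
        }
        return perWriter;
    }

    public static void main(String[] args) {
        BucketRouter router = new BucketRouter(4, 2);
        List<String> keys = Arrays.asList("order-1", "order-2", "order-3", "order-1");
        // Rows with the same key always end up with the same writer.
        router.route(keys).forEach((writer, rows) ->
                System.out.println("writer " + writer + " <- " + rows));
    }
}
```

In a real job this grouping would have to happen before the rows reach the sink writers, which is what I mean by handling it at an upstream level of the flow.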
