[GitHub] [hudi] danny0405 commented on pull request #9182: [HUDI-6588] Fix duplicate fileId on TM partial-failover and recovery

2023-07-27 Thread via GitHub


danny0405 commented on PR #9182:
URL: https://github.com/apache/hudi/pull/9182#issuecomment-1654867431

   > How should these log files be cleaned up? Duplicate bucket id files cause 
tasks to fail to start all the time.
   
   The log files are expected to be cleaned when the instant is committed (we 
have a marker mechanism to ensure the retried files get cleaned). The issue 
here is why these partial files are visible to the `BucketStreamWriter`; that 
is the direction we should dig into.
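
   As a rough illustration of the marker idea mentioned above, here is a 
minimal sketch. It is a toy model, not Hudi code: the class 
`MarkerCleanupSketch`, its methods, and the file names are all hypothetical, 
and it assumes that every file is recorded as a marker before it is written 
and that the commit knows which files the successful attempts reported.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of marker-based cleanup. None of these names are Hudi APIs.
class MarkerCleanupSketch {
    // Assumption: each write attempt records a marker for every file it creates.
    private final Set<String> markers = new HashSet<>();

    void recordMarker(String filePath) {
        markers.add(filePath);
    }

    // At commit time, a marked file that no successful attempt reported is
    // treated as a leftover of a failed or retried attempt.
    Set<String> filesToDelete(List<String> committedFiles) {
        Set<String> leftovers = new HashSet<>(markers);
        leftovers.removeAll(committedFiles);
        return leftovers;
    }

    public static void main(String[] args) {
        MarkerCleanupSketch sketch = new MarkerCleanupSketch();
        sketch.recordMarker("bucket-0001_attempt-0.log"); // hypothetical failed attempt
        sketch.recordMarker("bucket-0001_attempt-1.log"); // hypothetical retried attempt
        // Only the retried attempt made it into the commit.
        System.out.println(sketch.filesToDelete(List.of("bucket-0001_attempt-1.log")));
    }
}
```

   In this model the leftover file exists on storage until the commit runs 
the cleanup, which is why the open question is about visibility rather than 
deletion.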


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] danny0405 commented on pull request #9182: [HUDI-6588] Fix duplicate fileId on TM partial-failover and recovery

2023-07-27 Thread via GitHub


danny0405 commented on PR #9182:
URL: https://github.com/apache/hudi/pull/9182#issuecomment-1654847760

   > StreamWriteOperatorCoordinator#handleBootstrapEvent() -> initInstant() -> 
startInstant() -> this.writeClient.startCommit() -> 
tableServiceClient.rollbackFailedWrites() -> rollbackFailedWrites() -> 
rollback(), so the TM needs to send a bootstrap event to roll back the old 
instant. Otherwise, two files with the same bucket id will be generated under 
the current partition.
   
   Only a global failover triggers the rollback; a partial failover does not.
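
   The distinction can be sketched as follows. This is a toy model that only 
mirrors the statement above; `FailoverSketch`, `FailoverKind`, and the instant 
names are hypothetical and do not correspond to the actual coordinator code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: which kind of failover rolls back a pending instant.
// All names here are illustrative, not Hudi or Flink APIs.
class FailoverSketch {
    enum FailoverKind { GLOBAL, PARTIAL }

    final List<String> pendingInstants = new ArrayList<>();
    final List<String> rolledBack = new ArrayList<>();

    void onFailover(FailoverKind kind) {
        if (kind == FailoverKind.GLOBAL) {
            // Assumption: a global failover re-initializes the instant,
            // so every pending instant is rolled back first.
            rolledBack.addAll(pendingInstants);
            pendingInstants.clear();
        }
        // PARTIAL: only the failed subtasks restart; the pending instant
        // (and whatever it has already written) stays in place.
    }

    public static void main(String[] args) {
        FailoverSketch sketch = new FailoverSketch();
        sketch.pendingInstants.add("instant-1"); // hypothetical instant name
        sketch.onFailover(FailoverKind.PARTIAL);
        System.out.println("after partial: rolledBack=" + sketch.rolledBack);
        sketch.onFailover(FailoverKind.GLOBAL);
        System.out.println("after global:  rolledBack=" + sketch.rolledBack);
    }
}
```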





[GitHub] [hudi] danny0405 commented on pull request #9182: [HUDI-6588] Fix duplicate fileId on TM partial-failover and recovery

2023-07-27 Thread via GitHub


danny0405 commented on PR #9182:
URL: https://github.com/apache/hudi/pull/9182#issuecomment-1652964890

   > send the bootstrap event to clean up these files and metadata
   
   The log files are actually cleaned by the cleaner/rollback procedure; 
sending the bootstrap event does not work as you expected. Before these log 
files are removed, they are invisible to the reader/view (otherwise there is a 
bug).





[GitHub] [hudi] danny0405 commented on pull request #9182: [HUDI-6588] Fix duplicate fileId on TM partial-failover and recovery

2023-07-26 Thread via GitHub


danny0405 commented on PR #9182:
URL: https://github.com/apache/hudi/pull/9182#issuecomment-1651449174

   > How does this affect metadata cleaning?
   
   It removes the preceding partial metadata, if there is any.





[GitHub] [hudi] danny0405 commented on pull request #9182: [HUDI-6588] Fix duplicate fileId on TM partial-failover and recovery

2023-07-25 Thread via GitHub


danny0405 commented on PR #9182:
URL: https://github.com/apache/hudi/pull/9182#issuecomment-1650862935

   Each failed attempt of a subtask triggers an invocation of 
`StreamWriteOperatorCoordinator#subtaskFailed`, so the original write metadata 
gets cleaned.
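
   A minimal sketch of that behavior, assuming the coordinator keeps one 
write-metadata slot per subtask. The class `WriteMetadataBufferSketch` and its 
methods are hypothetical and only model the effect described above, not the 
real implementation.

```java
// Toy model: per-subtask write metadata that is dropped when the subtask fails.
// The names are illustrative; this is not the Hudi coordinator.
class WriteMetadataBufferSketch {
    // One slot per subtask; null means "no metadata received for this attempt".
    private final String[] buffer;

    WriteMetadataBufferSketch(int parallelism) {
        this.buffer = new String[parallelism];
    }

    void onWriteMetadata(int subtask, String metadata) {
        buffer[subtask] = metadata;
    }

    // Called once for each failed attempt of a subtask.
    void subtaskFailed(int subtask) {
        buffer[subtask] = null; // discard what the failed attempt reported
    }

    String metadataOf(int subtask) {
        return buffer[subtask];
    }

    public static void main(String[] args) {
        WriteMetadataBufferSketch sketch = new WriteMetadataBufferSketch(2);
        sketch.onWriteMetadata(0, "files written by attempt 0");
        sketch.subtaskFailed(0);
        System.out.println(sketch.metadataOf(0));
    }
}
```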

