[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 13: Wow, ASAN fails again. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 13 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 10 Jan 2019 08:40:13 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#13). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Step1: wrapping up containers in scoped_refptr. [V] Step2: supporting dead containers removal. [V] Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 236 insertions(+), 125 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/13 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 13 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 12: (6 comments) http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager-test.cc@311 PS12, Line 311: 0, 0, 1, 0, 0, 0, 0 > nit: can you add a comment to note what each metric means. e.g., /*METRIC_l Sorry, that's not a good solution i think. My motivation for the changes here is to eliminate and simplify redundant code, so i would rather rollback the modification than add comments. What do you think? :p http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager-test.cc@1696 PS12, Line 1696: TestDeleteDeadContainersByDeletionTransaction > Can you also add a test for two log blocks refer to one log block container Done http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@372 PS12, Line 372: ~LogBlockContainer(); > Add a comment to describe what happens in this destructor. Done http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@760 PS12, Line 760: bool metadata_missing = false; : bool data_missing = false; > What was the motivation for the changes here? Doesn't seem like the behavio The motivation here is to support deleting incomplete files directly, otherwise opening this type of container will fail and need manual intervention. And i believe this will happen occasionally when rebooting cluster. As was said in last comments, it's not friendly for kudu newbee. Well, i rollback this modification. :( http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@1342 PS12, Line 1342: // It's possible for multiple deletion transactions to end up here. For : // example, if one transaction deletes the second to last block in a : // container and the second deletes its last block, the destructors of : // both transactions could run at a time that container's internal : // bookkeeping reflects the deadness of the container. > Can you add one more sentence to explain this situation is guarded by TrySe Done http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@2366 PS12, Line 2366: // TODO(adar): this should be reported as an inconsistency once dead : // container deletion is also done in real time. Until then, it would be : // confusing to report it as such since it'll be a natural event at startup. > Should this be updated? Done -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 12 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 10 Jan 2019 07:29:57 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 12: (6 comments) http://gerrit.cloudera.org:8080/#/c/12075/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12075/12//COMMIT_MSG@9 PS12, Line 9: LBM supports deleting Can you rephrase the sentence to say this patch intends to support deleting dead containers? This sounds like it is already supported today. http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager-test.cc@311 PS12, Line 311: 0, 0, 1, 0, 0, 0, 0 nit: can you add a comment to note what each metric means. e.g., /*METRIC_log_block_manager_containers = */ 1. Same below. http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager-test.cc@1696 PS12, Line 1696: TestDeleteDeadContainersByDeletionTransaction Can you also add a test for two log blocks refer to one log block container and concurrently delete both and verify the metric looks right? http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@372 PS12, Line 372: ~LogBlockContainer(); Add a comment to describe what happens in this destructor. http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@1342 PS12, Line 1342: // It's possible for multiple deletion transactions to end up here. For : // example, if one transaction deletes the second to last block in a : // container and the second deletes its last block, the destructors of : // both transactions could run at a time that container's internal : // bookkeeping reflects the deadness of the container. Can you add one more sentence to explain this situation is guarded by TrySetDead()? http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@2366 PS12, Line 2366: // TODO(adar): this should be reported as an inconsistency once dead : // container deletion is also done in real time. Until then, it would be : // confusing to report it as such since it'll be a natural event at startup. Should this be updated? -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 12 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 09 Jan 2019 23:18:10 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/12/src/kudu/fs/log_block_manager.cc@760 PS12, Line 760: bool metadata_missing = false; : bool data_missing = false; What was the motivation for the changes here? Doesn't seem like the behavior has changed, except now there's more state to track (i.e. these 'missing' booleans). -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 12 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 09 Jan 2019 17:22:08 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#12). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Step1: wrapping up containers in scoped_refptr. [V] Step2: supporting dead containers removal. [V] Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 224 insertions(+), 125 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/12 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 12 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 11: All right, calling 'SyncDir()' has been removed. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 11 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 09 Jan 2019 10:37:07 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 11: (3 comments) > And here are another three comments, may i fix them? Go ahead. I agree with your first comment, disagree with the second, and don't care either way on the third. http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc@679 PS11, Line 679: CONTAINER_DISK_FAILURE(block_manager_->env_->SyncDir(data_dir_->dir()), > Still, i think it wastes IO to call 'SyncDir' after file deletion. The rogu Now that we're calling SyncDir for each deleted container instead of just once after Repair(), I agree. http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc@2452 PS11, Line 2452: string data_filename = StrCat(container->ToString(), kContainerDataFileSuffix); : uint64_t reported_size; : s = env_->GetFileSizeOnDisk(data_filename, &reported_size); : if (!s.ok()) { : HANDLE_DISK_FAILURE(s, error_manager_->RunErrorNotificationCb( : ErrorHandlerType::DISK_ERROR, dir)); : *result_status = s.CloneAndPrepend(Substitute( : "Could not get on-disk file size of container $0", container->ToString())); : return; : } : int64_t cleanup_threshold_size = container->live_bytes_aligned() * : (1 + FLAGS_log_container_excess_space_before_cleanup_fraction); : if (reported_size > cleanup_threshold_size) { : local_report.full_container_space_check->entries.emplace_back( : container->ToString(), reported_size - container->live_bytes_aligned()); : : // If the container is to be deleted outright, don't bother repunching : // its blocks. The report entry remains, however, so it's clear that : // there was a space discrepancy. : if (container->live_blocks()) { : need_repunching.insert(need_repunching.end(), : dead_blocks.begin(), dead_blocks.end()); : } : } > It's not necessary to clean up threshold size for a dead container, though True, although the FsReport we generate tries to account for all inconsistencies even if they're cleaned up by deleting containers. Plus GetFileSizeOnDisk isn't particularly expensive. http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc@2514 PS11, Line 2514: AddNewContainerUnlocked(container); : MakeContainerAvailableUnlocked(std::move(container)); > It's a little bit strange to add the dead containers here and then remove t Yeah but it's simpler in that all containers are treated the same way. If you like, you could change the plumbing and avoid adding dead containers; when I updated your patch it felt a little bit out of scope for me so I didn't do it. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 11 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 09 Jan 2019 05:32:39 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 11: (3 comments) Thank you very much :) And here are another three comments, may i fix them? http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc@679 PS11, Line 679: CONTAINER_DISK_FAILURE(block_manager_->env_->SyncDir(data_dir_->dir()), Still, i think it wastes IO to call 'SyncDir' after file deletion. The rogue admin should be responsible for his operations. Besides, for the kudu newbee, it will bring barriers while rebooting the kudu cluster. Anyway, that's only my opinion. http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc@2452 PS11, Line 2452: string data_filename = StrCat(container->ToString(), kContainerDataFileSuffix); : uint64_t reported_size; : s = env_->GetFileSizeOnDisk(data_filename, &reported_size); : if (!s.ok()) { : HANDLE_DISK_FAILURE(s, error_manager_->RunErrorNotificationCb( : ErrorHandlerType::DISK_ERROR, dir)); : *result_status = s.CloneAndPrepend(Substitute( : "Could not get on-disk file size of container $0", container->ToString())); : return; : } : int64_t cleanup_threshold_size = container->live_bytes_aligned() * : (1 + FLAGS_log_container_excess_space_before_cleanup_fraction); : if (reported_size > cleanup_threshold_size) { : local_report.full_container_space_check->entries.emplace_back( : container->ToString(), reported_size - container->live_bytes_aligned()); : : // If the container is to be deleted outright, don't bother repunching : // its blocks. The report entry remains, however, so it's clear that : // there was a space discrepancy. : if (container->live_blocks()) { : need_repunching.insert(need_repunching.end(), : dead_blocks.begin(), dead_blocks.end()); : } : } It's not necessary to clean up threshold size for a dead container, though the number of the dead containers is very limited(we know that the dead containers will be deleted automatically at the runtime). http://gerrit.cloudera.org:8080/#/c/12075/11/src/kudu/fs/log_block_manager.cc@2514 PS11, Line 2514: AddNewContainerUnlocked(container); : MakeContainerAvailableUnlocked(std::move(container)); It's a little bit strange to add the dead containers here and then remove them in the 'Repair()'. Maybe we could do something at Line2386. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 11 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 09 Jan 2019 03:22:22 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 11: I pushed a new revision with a few cleanups: - Properly rebased on top of the "wrapped up containers in scoped_refptr" change. - The few remaining DOS line endings converted to UNIX line endings. - A few formatting/style fixes. - The only real change was to get read_only handling right: in that case, we should not delete dead containers at startup. To do this I moved the call to TrySetDead into Repair(), and changed the container vector plumbing from OpenDataDir() into Repair() to pass containers by scoped_refptr instead of by name. Please take a look and let me know if anything seems wrong. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 11 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 08 Jan 2019 20:36:23 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has uploaded a new patch set (#11) to the change originally created by helifu. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Step1: wrapping up containers in scoped_refptr. [V] Step2: supporting dead containers removal. [V] Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 235 insertions(+), 121 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/11 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 11 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 10: > Something's not quite right with this patch; when I look at the full diff, I > see both the "wrapping up containers in scoped_refptr" patch as well as the > "deletion of dead/full containers" patch mixed together. This is still the case as of PS10. Here's a workflow I like to use to resolve situations like these: 1) 'git checkout' a new branch based on master. 2) 'git cherry-pick' the scoped_refptr commit into this new branch. 3) 'git cherry-pick' this commit onto the new branch. 4) Step 3 will yield a lot of conflicts; manually resolve them. When you're done, you should have a new branch with both patches in their desired state, and you can push the entirety of the new branch to gerrit. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 10 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Mon, 07 Jan 2019 22:31:31 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#10). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Step1: wrapping up containers in scoped_refptr. [V] Step2: supporting dead containers removal. [V] Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 344 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/10 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 10 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 8: Something's not quite right with this patch; when I look at the full diff, I see both the "wrapping up containers in scoped_refptr" patch as well as the "deletion of dead/full containers" patch mixed together. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 8 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Sun, 06 Jan 2019 05:06:26 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#8). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Step1: wrapping up containers in scoped_refptr. [V] Step2: supporting dead containers removal. [V] Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 344 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/8 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 8 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12075/7//COMMIT_MSG@15 PS7, Line 15: Step1: wrapping up containers in scoped_refptr. [V] : Step2: supporting dead containers removal. [X] > Could you do step 1 in a separate gerrit change rather than overwriting thi Thanks. Now that you've created https://gerrit.cloudera.org/c/12121, you should rebase this change on top of that one. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 7 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 02 Jan 2019 23:19:14 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12121 Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Step1: wrapping up containers in scoped_refptr. [V] Step2: supporting dead containers removal. [X] Change-Id: I3c5c620014782b3d57dcbe047d0df73c949590c7 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 131 insertions(+), 130 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/12121/1 -- To view, visit http://gerrit.cloudera.org:8080/12121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3c5c620014782b3d57dcbe047d0df73c949590c7 Gerrit-Change-Number: 12121 Gerrit-PatchSet: 1 Gerrit-Owner: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12075/7//COMMIT_MSG@15 PS7, Line 15: Step1: wrapping up containers in scoped_refptr. [V] : Step2: supporting dead containers removal. [X] Could you do step 1 in a separate gerrit change rather than overwriting this one? That way this change will continue to reflect the work for KUDU-2636 and the interdiff will be relatively clean and easy to review. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 7 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 21 Dec 2018 17:39:48 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#7). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Step1: wrapping up containers in scoped_refptr. [V] Step2: supporting dead containers removal. [X] Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 131 insertions(+), 130 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/7 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 7 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 6: > maybe that is not enough. please take a look at this case: two remaining > LogBlocks are deleted in different LogBlockDeletionTransaction concurrently, > then one will trigger hole punching and the other will trigger dead container > removal. but it is not safe to access the container for hole punching when > removing dead container is executed first. > i think the key point is that the container point in > 'all_containers_by_name_' is a raw point which will never be deleted after > startup. in other words, the container point is not safe. > i am going to use 'std::shared_ptr' and 'std::weak_ptr' to wrap it. > what do you think? That's a very good point. It's hard to come up with a robust solution that doesn't involve shared ownership for containers via std::shared_ptr. I don't think you need weak_ptr though. You could take mimic the FileCache behavior and perform dead container deletion in the container's destructor. So the sequence of events could be something like this: 1. A bunch of deletion transactions run in different threads, all affecting the same container. 2. All but one of them trigger hole punching on the container. The hole punch closures capture a shared ref to the container. 3. One triggers container deletion, which actually just marks the container as deleted and removes it from the global map. 4. The hole punch closures run, and punch holes (unnecessary, but safe). 5. When all the closures are done, the container's ref count is now 0. Its destructor runs, and the container is safely deleted from disk. Since wrapping up containers in std::shared_ptr may be an extensive change, would you mind creating a separate change for it, and then rebasing this change on top of it? -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 6 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 20 Dec 2018 02:06:39 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 6: maybe that is not enough. please take a look at this case: two remaining LogBlocks are deleted in different LogBlockDeletionTransaction concurrently, then one will trigger hole punching and the other will trigger dead container removal. but it is not safe to access the container for hole punching when removing dead container is executed first. i think the key point is that the container point in 'all_containers_by_name_' is a raw point which will never be deleted after startup. in other words, the container point is not safe. i am going to use 'std::shared_ptr' and 'std::weak_ptr' to wrap it. what do you think? -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 6 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 20 Dec 2018 01:24:06 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 6: > There are more than one LogBlock which will reference the same > LogBlockContainer. If these LogBlocks are deleted in different > LogBlockDeletionTransaction concurrently, then the LogBlockContainer will be > deleted repeatedly. > I am trying to fix it. I see, this has to do with the non-atomicity of decrementing container->live_blocks_, then later fetching container->live_blocks() to compare it with 0. Yeah that is frustrating. One idea is to modify BlockDeleted() to return whether or not the container become dead when this block was deleted, and then plumb that information up the various delete paths to use it within CommitDeletedBlocks() to determine whether the container should be deleted (rather than looking at live_blocks()). -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 6 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 19 Dec 2018 19:20:18 + Gerrit-HasComments: No
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 6: (11 comments) There are more than one LogBlock which will reference the same LogBlockContainer. If these LogBlocks are deleted in different LogBlockDeletionTransaction concurrently, then the LogBlockContainer will be deleted repeatedly. I am trying to fix it. http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.h@276 PS6, Line 276: // Removes dead containers from disk. : // : // Must call RemoveFullContainersUnlocked() first. > This is a good place to expand about the interaction with FileCache. You co Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1237 PS6, Line 1237: // Due to the shared ownership of LogBlockDeletionTransaction, the destructor is : // responsible for destroying all registered blocks. This includes: : // 1. Punching holes in deleted blocks, and : // 2. Deleting dead containers outright. > The formatting issue persists. Take a look at the diff in gerrit; you'll se done. a new line character '\r\n' in MS-DOS causes this problem. http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1256 PS6, Line 1256: std::unordered_map> deleted_interval_map_; > Nit: while you're here, could you drop the unnecessary std:: prefixes here Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1260 PS6, Line 1260: unordered_map> dead_containers_by_dir_; > The value would probably be more efficient as an unordered_set rather than Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1631 PS6, Line 1631: // Access to container_ must be before calling log_block_.reset(). : // : // Because log_block_.reset() may cause the LogBlock() to lose its last reference : // and be destroyed. This may cause a registered LogBlockDeletionTransaction to lose : // its last reference and be destroyed, which will trigger the asynchronous code that : // will eventually destroy the container which is dead. Thus, the ordering of operations : // here is very important, otherwise it is unsafe. > Nit: reword a bit: Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1965 PS6, Line 1965: // The Descriptor() in the FileCache() will be destroyed : // along with LogBlockContainer's destructor. > Nit: I would drop this comment. It's not really necessary to improve the un Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1980 PS6, Line 1980: DataDir* dir, const std::vector& names) { > Don't need std:: prefixes. Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1982 PS6, Line 1982: for (const auto& name : names) { > There are some extra line formatting issues here. Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@2008 PS6, Line 2008: // Remove from the memory. > Nit: "Remove the containers from memory." Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@2014 PS6, Line 2014: // Remove from the disk. > Nit: "Remove the containers from disk." Done http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@2580 PS6, Line 2580: // After the deletions, the data directory is sync'ed to reduce the chance : // of a data file existing without its corresponding metadata file (or vice : // versa) in the event of a crash. The block manager would treat such a case : // as corruption and require manual intervention. : // : // TODO(adar) the above is not fool-proof; a crash could manifest in between : // any pair of deletions. That said, the odds of it happening are incredibly : // rare, and manual resolution isn't hard (just delete the existing file). > Move this into the body of RemoveDeadContainersFromDisk, just above the cal Done -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 6 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerri
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 6: (11 comments) The dense_node-itest failure is interesting. Here's some useful bits from the log: I1219 04:37:17.664212 817 log_block_manager.cc:2002] Deleted 1 dead containers (1256 metadata bytes) ... I1219 04:37:19.328910 817 log_block_manager.cc:2002] Deleted 2 dead containers (2513 metadata bytes) F1219 04:37:19.336031 817 log_block_manager.cc:1969] Check failed: to_delete *** Check failure stack trace: *** @ 0x7fc8c8bd162d google::LogMessage::Fail() at ??:0 @ 0x7fc8c8bd364c google::LogMessage::SendToLog() at ??:0 @ 0x7fc8c8bd1189 google::LogMessage::Flush() at ??:0 @ 0x7fc8c8bd3fdf google::LogMessageFatal::~LogMessageFatal() at ??:0 @ 0x7fc8c68bf7eb kudu::fs::LogBlockManager::RemoveFullContainersUnlocked() at ??:0 @ 0x7fc8c68bf8d3 kudu::fs::LogBlockManager::RemoveDeadContainers() at ??:0 @ 0x7fc8c938065e kudu::ThreadPool::DispatchThread() at ??:0 @ 0x7fc8c937626e kudu::Thread::SuperviseThread() at ??:0 @ 0x7fc8c6d95184 start_thread at ??:0 @ 0x7fc8c81dbffd clone at ??:0 FYI, the test runs with only one data directory. But it seems like it's trying to delete the same set of containers maybe? Some more logging in log_block_manager.cc may help. You may also be able to reproduce the failure locally. http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.h@276 PS6, Line 276: // Removes dead containers from disk. : // : // Must call RemoveFullContainersUnlocked() first. This is a good place to expand about the interaction with FileCache. You could say: // Removes dead containers from disk. // // Must first call RemoveFullContainersUnlocked() to remove the containers from // memory. This guarantees that the containers' FileCache descriptors are expired. // Thus, the FileCache deletions done here will delete the files straight away, // which allows the syncing of the data dir to actually have some effect. http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1237 PS6, Line 1237: // Due to the shared ownership of LogBlockDeletionTransaction, the destructor is : // responsible for destroying all registered blocks. This includes: : // 1. Punching holes in deleted blocks, and : // 2. Deleting dead containers outright. The formatting issue persists. Take a look at the diff in gerrit; you'll see that each line is offset from the next with an extra blank line. http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1256 PS6, Line 1256: std::unordered_map> deleted_interval_map_; Nit: while you're here, could you drop the unnecessary std:: prefixes here and on L1264? http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1260 PS6, Line 1260: unordered_map> dead_containers_by_dir_; The value would probably be more efficient as an unordered_set rather than a set. http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1631 PS6, Line 1631: // Access to container_ must be before calling log_block_.reset(). : // : // Because log_block_.reset() may cause the LogBlock() to lose its last reference : // and be destroyed. This may cause a registered LogBlockDeletionTransaction to lose : // its last reference and be destroyed, which will trigger the asynchronous code that : // will eventually destroy the container which is dead. Thus, the ordering of operations : // here is very important, otherwise it is unsafe. Nit: reword a bit: // We may not safely access container_ after calling log_block_.reset(). // // log_block_.reset() may cause the LogBlock() to lose its last reference and be destroyed. // This may cause a registered LogBlockDeletionTransaction to lose its last reference and // be destroyed, which will trigger the asynchronous code in its destructor that will // eventually destroy the (now dead) container. http://gerrit.cloudera.org:8080/#/c/12075/6/src/kudu/fs/log_block_manager.cc@1965 PS6, Line 1965: // The Descriptor() in the FileCache() will be destroyed : // along with LogBlockContainer's destructor. Nit: I would drop this comment. It's not really necessary to improve the understanding of RemoveFullContainersUnlocked; it's more useful for RemoveDeadC
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#6). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 211 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/6 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 6 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1979 PS4, Line 1979: for (auto& name : names) { > Maybe define a helper function like this? done. this comment was previously blocked by the following question, but now it is ok. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1980 PS4, Line 1980: string data_file_name = StrCat(name, kContainerDataFileSuffix); > Looks like you missed this. Done http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1983 PS4, Line 1983: file_cache_.Invalidate(metadata_file_name); : WARN_NOT_OK_LBM_DISK_FAILURE(env_->DeleteFi > That's true, but the call to RemoveFullContainersUnlocked on L1975 will hav ah,i can catch up with you at last. yes, you are correct! i made a mistake earlier.the key point is std::weak_ptr. the file_cache_.DeleteFile is safe here. but the logic is a little complicated indeed :( http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc@1237 PS5, Line 1237: // Due to the shared ownership of LogBlockDeletionTransaction, the : // destructor is responsible for destroying all registered blocks. : // This includes: : // 1. Punching holes in deleted blocks, and : // 2. Deleting dead containers outright. > There's something wrong here with the formatting; looks like there's an ext Done http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc@1632 PS5, Line 1632: log_block_.reset(); : if (container_->metrics()) { : container_->metrics()->generic_metrics.blocks_open_reading->Decrement(); : } > I think this may be the source of the TSAN data race. Great job! Done. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 5 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 19 Dec 2018 04:15:40 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc@1632 PS5, Line 1632: log_block_.reset(); : if (container_->metrics()) { : container_->metrics()->generic_metrics.blocks_open_reading->Decrement(); : } I think this may be the source of the TSAN data race. Given the ordering of these operations, log_block_.reset() may cause the LogBlock() to lose its last reference and be destroyed. This may cause a registered LogBlockDeletionTransaction to lose its last reference and be destroyed, which will trigger the asynchronous code that will eventually destroy the container. Thus, this access to container_ is unsafe. It should be fixable by just swapping the order of the two operations here. Please also leave a comment with this example that explains how, once log_block_.reset() is called, it's no longer safe to access container_. In fact we may want to null out container_ to further emphasize this (if someone makes this mistake in the future, better to get a SIGSEGV on a 0x0 memory access than a data race). -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 5 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 18 Dec 2018 22:13:49 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 5: (4 comments) Looks like you have a TSAN data race to figure out too. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1979 PS4, Line 1979: for (auto& name : names) { > i have no idea :( Maybe define a helper function like this? void LogBlockManager::DeleteDeadContainersFromDisk(DataDir* dir, const vector& names) { int64_t deleted_metadata_bytes = 0; for (const auto& n : names) { string data_file_name = StrCat(n, kContainerDataFileSuffix); string metadata_file_name = StrCat(n, kContainerMetadataFileSuffix); uint64_t metadata_size; Status s = env_->GetFileSize(metadata_file_name, &metadata_size); if (s.ok()) { deleted_metadata_bytes += metadata_size; } else { WARN_NOT_OK_LBM_DISK_FAILURE(s, "Could not get size of dead container metadata file " + metadata_file_name); } WARN_NOT_OK_LBM_DISK_FAILURE(file_cache_.DeleteFile(data_file_name), "Could not delete dead container data file " + data_file_name); WARN_NOT_OK_LBM_DISK_FAILURE(file_cache_.DeleteFile(metadata_file_name), "Could not delete dead container metadata file " + metadata_file_name); } if (!names.empty()) { WARN_NOT_OK_LBM_DISK_FAILURE(env_->SyncDir(dir->dir()), "Could not sync data directory"); LOG(INFO) << Substitute("Deleted $0 dead containers ($1 metadata bytes)", dead_containers.size(), deleted_metadata_bytes); } } You could call it here on L1979 and it could replace the code on L2553-2586. Make sure to preserve the useful comments though. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1980 PS4, Line 1980: string data_file_name = StrCat(name, kContainerDataFileSuffix); > done. Looks like you missed this. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1983 PS4, Line 1983: file_cache_.Invalidate(metadata_file_name); : WARN_NOT_OK_LBM_DISK_FAILURE(env_->DeleteFi > The function 'FileCache::DeleteFile' does not delete file directl That's true, but the call to RemoveFullContainersUnlocked on L1975 will have deleted the LogBlockContainer instances from memory, which means the associated descriptors from the FileCache will have been destroyed. So if we were to call FileCache::DeleteFile here to delete the container's files, I think we'll end up in the "There is no outstanding descriptor" case, evict the fd from the cache, and delete the file. Does that sound right to you? It's been a while since I've looked at the FileCache code so maybe I missed something. Anyway, if you agree, we don't need the Invalidate calls and file_cache_.DeleteFile should be safe. http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc@1237 PS5, Line 1237: // Due to the shared ownership of LogBlockDeletionTransaction, the : // destructor is responsible for destroying all registered blocks. : // This includes: : // 1. Punching holes in deleted blocks, and : // 2. Deleting dead containers outright. There's something wrong here with the formatting; looks like there's an extra empty line for each line of text. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 5 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 18 Dec 2018 22:02:49 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#5). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk pressure, save tserver startup time and etc. It uses the container size and live blocks to determine whether the container is dead or live. Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 180 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/5 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 5 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/12075/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12075/4//COMMIT_MSG@10 PS4, Line 10: presure > pressure done http://gerrit.cloudera.org:8080/#/c/12075/4//COMMIT_MSG@12 PS4, Line 12: it > Nit: "the container" done. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 4 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 18 Dec 2018 11:02:55 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager-test.cc@1771 PS4, Line 1771: CheckMetrics({0, 0, 0, 0, 0, 0, 0}); > I think you need to wrap calls to CheckMetrics in NO_FATALS also. done. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 4 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 18 Dec 2018 10:59:04 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1237 PS4, Line 1237: // Given the shared ownership of LogBlockDeletionTransaction, at this point : // all registered blocks should be destructed. Thus, 1)coalescing deletions : // for blocks that are contiguous in a container and schedules hole punching; : // 2)schedules containers removal by data dir. > How about: done. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1288 PS4, Line 1288: DataDir* dir = entry.first; : vector names(entry.second.begin(), entry.second.end()); > Nit: you can inline this into the Bind() call, the way it was done on L1280 done. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1979 PS4, Line 1979: // Invalidate the cache and remove from the disk. > Could you consolidate this with the equivalent code in Repair()? i have no idea :( http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1980 PS4, Line 1980: for (auto& name : names) { > 'name' can be a cref. done. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1983 PS4, Line 1983: file_cache_.Invalidate(data_file_name); : file_cache_.Invalidate(metadata_file_name); > Why do we need to call Invalidate in this code path? We're not renaming any The function 'FileCache::DeleteFile' does not delete file directly, because there is a case to mark the 'flags_' of class BaseDescriptor to be deleted only. Then the 'descriptor_expiry_thread_' of class FileCache erases the descriptor and the file is deleted in the destructor of class BaseDescriptor at last. The default of 'file_cache_expiry_period_ms' is 60 seconds. Thus the 'env_->SyncDir()' later will be meaningless. So, i think it is better to use 'file_cache_.Invalidate'+'env_->DeleteFile'. What do you think? http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1985 PS4, Line 1985: WARN_NOT_OK_LBM_DISK_FAILURE(env_->DeleteFile(data_file_name), : "Could not delete dead container data file " + data_file_name); : WARN_NOT_OK_LBM_DISK_FAILURE(env_->DeleteFile(metadata_file_name), : "Could not delete dead container metadata file " + metadata_file_name); > Shouldn't we delete through the file cache? I.e. file_cache_.DeleteFile? the answer is listed above. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 4 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 18 Dec 2018 01:27:01 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 4: (9 comments) http://gerrit.cloudera.org:8080/#/c/12075/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12075/4//COMMIT_MSG@10 PS4, Line 10: presure pressure http://gerrit.cloudera.org:8080/#/c/12075/4//COMMIT_MSG@12 PS4, Line 12: it Nit: "the container" http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager-test.cc@1771 PS4, Line 1771: CheckMetrics({0, 0, 0, 0, 0, 0, 0}); I think you need to wrap calls to CheckMetrics in NO_FATALS also. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1237 PS4, Line 1237: // Given the shared ownership of LogBlockDeletionTransaction, at this point : // all registered blocks should be destructed. Thus, 1)coalescing deletions : // for blocks that are contiguous in a container and schedules hole punching; : // 2)schedules containers removal by data dir. How about: // Due to the shared ownership of LogBlockDeletionTransaction, the destructor is // responsible for destroying all registered blocks. This includes: // 1. Punching holes in deleted blocks, and // 2. Deleting dead containers outright. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1288 PS4, Line 1288: DataDir* dir = entry.first; : vector names(entry.second.begin(), entry.second.end()); Nit: you can inline this into the Bind() call, the way it was done on L1280. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1979 PS4, Line 1979: // Invalidate the cache and remove from the disk. Could you consolidate this with the equivalent code in Repair()? http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1980 PS4, Line 1980: for (auto& name : names) { 'name' can be a cref. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1983 PS4, Line 1983: file_cache_.Invalidate(data_file_name); : file_cache_.Invalidate(metadata_file_name); Why do we need to call Invalidate in this code path? We're not renaming anything. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1985 PS4, Line 1985: WARN_NOT_OK_LBM_DISK_FAILURE(env_->DeleteFile(data_file_name), : "Could not delete dead container data file " + data_file_name); : WARN_NOT_OK_LBM_DISK_FAILURE(env_->DeleteFile(metadata_file_name), : "Could not delete dead container metadata file " + metadata_file_name); Shouldn't we delete through the file cache? I.e. file_cache_.DeleteFile? -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 4 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Mon, 17 Dec 2018 19:21:37 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#4). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full containers which are dead before hole punching. It can help to ease disk presure, save tserver startup time and etc. It uses the container size and live blocks to determine whether it is dead or live. Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 180 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/4 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 4 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 3: (11 comments) thanks for your detailed comments. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc@1754 PS3, Line 1754: TestDeleteDeadContainersAtPunchHole > Probably "TestDeleteDeadContainersByDeletionTransaction" is more appropriat done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc@1796 PS3, Line 1796: } > Can you add a variant of this test that opens the block for reading before done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc@1798 PS3, Line 1798: removing > Nit: removal done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@141 PS3, Line 141: "Number of Dead Block Containers", > "Number of Dead Block Containers Deleted" done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@143 PS3, Line 143: "Number of dead block containers that were deleted"); > "Number of full (but dead) block containers that were deleted" done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1237 PS3, Line 1237: // Given the shared ownership of LogBlockDeletionTransaction, at this point : // all registered blocks should be destructed. Thus, coalescing deletions : // for blocks that are contiguous in a container and schedules hole punching. > Can you update this comment to reflect the changes to the LogBlockDeletionT done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1269 PS3, Line 1269: std::unordered_map> containers_by_dir; > Don't need std:: prefixes for the new code; we're "using ..." that stuff at done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1273 PS3, Line 1273: // Filter out the containers that are full and dead. > FWIW, I think it'd be cleaner to identify full and dead containers in Commi can not agree any more ;) http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1946 PS3, Line 1946: void LogBlockManager::RemoveFullContainerUnlocked(const string& container_name) { > Can you convert this into RemoveFullContainersUnlocked? Looks like both cal done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1968 PS3, Line 1968: // Invalidate the cache and remove from the disk. > Please consolidate this with the equivalent code in Repair(). In particular done. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1976 PS3, Line 1976: if (metrics()) { : metrics()->dead_containers_deleted->Increment(); : } > You can pull this out of the loop and increment just once by names.size(). done. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 3 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Mon, 17 Dec 2018 12:20:53 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 3: (11 comments) Looks good! Hao did some recent work in the log block manager; Hao, could you also take a look? http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc@1754 PS3, Line 1754: TestDeleteDeadContainersAtPunchHole Probably "TestDeleteDeadContainersByDeletionTransaction" is more appropriate now. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc@1796 PS3, Line 1796: } Can you add a variant of this test that opens the block for reading before deleting it, tests that after the BlockDeletionTransaction goes out of scope, the block (and its container) is still alive, and only when the opened block goes out of scope is the block (and the container) deleted? You should be able to reuse pretty much all of this code with some parameterization. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager-test.cc@1798 PS3, Line 1798: removing Nit: removal http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@141 PS3, Line 141: "Number of Dead Block Containers", "Number of Dead Block Containers Deleted" http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@143 PS3, Line 143: "Number of dead block containers that were deleted"); "Number of full (but dead) block containers that were deleted" http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1237 PS3, Line 1237: // Given the shared ownership of LogBlockDeletionTransaction, at this point : // all registered blocks should be destructed. Thus, coalescing deletions : // for blocks that are contiguous in a container and schedules hole punching. Can you update this comment to reflect the changes to the LogBlockDeletionTransaction destructor? http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1269 PS3, Line 1269: std::unordered_map> containers_by_dir; Don't need std:: prefixes for the new code; we're "using ..." that stuff at the top of the file (and if we're not for a particular class, add a "using::" definition for it). http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1273 PS3, Line 1273: // Filter out the containers that are full and dead. FWIW, I think it'd be cleaner to identify full and dead containers in CommitDeletedBlocks (after the call to RemoveLogBlocks). Then, don't even call AddBlock() on them; just throw them into a new std::unordered_map of container names keyed by data dirs. Finally, in the destructor, you process the deleted_interval_map_ as before, then process the new unordered_map separately. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1946 PS3, Line 1946: void LogBlockManager::RemoveFullContainerUnlocked(const string& container_name) { Can you convert this into RemoveFullContainersUnlocked? Looks like both callers would benefit from that right now: it'd let you do one metrics decrement instead of a per-container decrement. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1968 PS3, Line 1968: // Invalidate the cache and remove from the disk. Please consolidate this with the equivalent code in Repair(). In particular: - We should WARN when a deletion fails. - We should trigger the disk failure handling path when a deletion fails. http://gerrit.cloudera.org:8080/#/c/12075/3/src/kudu/fs/log_block_manager.cc@1976 PS3, Line 1976: if (metrics()) { : metrics()->dead_containers_deleted->Increment(); : } You can pull this out of the loop and increment just once by names.size(). -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 3 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Sat, 15 Dec 2018 20:49:25 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#3). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full container which is dead after hole punching. It can help to save startup time. It uses the file's physical size to determine whether the container is dead or live. Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 145 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/3 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 3 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc@1179 PS1, Line 1179: : /// : // LogBlockCreationTransaction : > I did miss the question :( It's certainly possible to automatically delete one of a given file pair (xxx.data and xxx.metadata) if the other of the pair is missing. I'm not sure that this wise, though, given that we don't know at startup _why_ the files are missing. If a rogue admin accidentally deleted all of the .metadata files, do you want Kudu to delete the .data files as a result? Seems like failing to start and forcing the admin to investigate is the safer option. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 14 Dec 2018 01:09:59 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc@1179 PS1, Line 1179: : /// : // LogBlockCreationTransaction : > Point 2 is spelled out in more detail in Repair(). I'll copy the relevant c I did miss the question :( And, i have another idea. Is it possible to check the integrity of file pairs(xxx.data and xxx.metadata) at startup after a crash? In other words, the LBM needs to support the corruption, for example, delete the incomplete file pairs directly.What do you think? -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Fri, 14 Dec 2018 00:57:48 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc@1179 PS1, Line 1179: : /// : // LogBlockCreationTransaction : > Agree with point 1 and 3. Point 2 is spelled out in more detail in Repair(). I'll copy the relevant comment: // Delete all dead containers. // // After the deletions, the data directory is sync'ed to reduce the chance // of a data file existing without its corresponding metadata file (or vice // versa) in the event of a crash. The block manager would treat such a case // as corruption and require manual intervention. // // TODO(adar) the above is not fool-proof; a crash could manifest in between // any pair of deletions. That said, the odds of it happening are incredibly // rare, and manual resolution isn't hard (just delete the existing file). To go into more detail, the concern here is that once we start deleting files, a machine crash could lead to any subset of the deleted files coming back to life, regardless of the order in which they were deleted. Without a SyncDir() call, this vulnerability window is as long as the filesystem's journal commit period. On ext4 that's given by the 'commit' mount option, which defaults to 5s. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 13 Dec 2018 05:24:21 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Hello Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12075 to look at the new patch set (#2). Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full container which is dead after hole punching. It can help to save startup time. It uses the file's physical size to determine whether the container is dead or live. Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 136 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/2 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 1: (3 comments) Thanks for your comments.@adar http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc@1179 PS1, Line 1179: // Though the result of deleting a data block is not the same as : // before when there are empty data blocks in the same container : // which is FULL, especially in the unit test, this design is still : // acceptable. > I don't agree with this assessment. This approach suffers from the followin Agree with point 1 and 3. But to point 2, i have a question, is it necessary to call SyncDir() after deleting the container? In my opinion it's ok that let the file system flush dirty data to disk.Then, the 'batch' containers deletion is unnecessary. http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env.h File src/kudu/util/env.h: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env.h@648 PS1, Line 648: virtual Status SizeOnDisk(uint64_t* size) const = 0; > Could you add a brief unit test to env-test to cover this new method? rollback. http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env_posix.cc@1299 PS1, Line 1299: // From stat(2): : // : // The st_blocks field indicates the number of blocks allocated to : // the file, 512-byte units. (This may be smaller than st_size/512 : // when the file has holes.) : *size = sbuf.st_blocks * 512; > Can you reuse this block of code for RWFile::SizeOnDisk, using a decomposed rollback. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 1 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Thu, 13 Dec 2018 03:40:11 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 ) Change subject: KUDU-2636: LBM supports deleting dead and full containers .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc@1179 PS1, Line 1179: // Though the result of deleting a data block is not the same as : // before when there are empty data blocks in the same container : // which is FULL, especially in the unit test, this design is still : // acceptable. I don't agree with this assessment. This approach suffers from the following problems: 1. It goes to disk unnecessarily to figure out if a container is dead. If we can safely call full(), why can't we call live_blocks() and compare it to 0? 2. By chaining to ContainerDeletionAsync, we can't "batch" container deletion. LogBlockDeletionTransaction allows for the client to specify a group of blocks to be deleted together (i.e. deleting an entire tablet). The LBM uses this to optimize deletion where it can. In the case of deleting containers that are no longer necessary, we can delete all of the container files, then do one SyncDir() call per parent directories, instead of a SyncDir() per container. 3. Furthermore, if a LogBlockDeletionTransaction is going to delete all of the blocks in a container, it'd be cheaper to recognize that and, rather than punching any holes, delete the container files outright. http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/fs/log_block_manager.cc@1966 PS1, Line 1966: void LogBlockManager::RemoveDeadFullContainer(const std::string& container_name) { Would be good to reuse/consolidate the file deletion code here with the equivalent in Repair(). http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env.h File src/kudu/util/env.h: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env.h@648 PS1, Line 648: virtual Status SizeOnDisk(uint64_t* size) const = 0; Could you add a brief unit test to env-test to cover this new method? http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: http://gerrit.cloudera.org:8080/#/c/12075/1/src/kudu/util/env_posix.cc@1299 PS1, Line 1299: // From stat(2): : // : // The st_blocks field indicates the number of blocks allocated to : // the file, 512-byte units. (This may be smaller than st_size/512 : // when the file has holes.) : *size = sbuf.st_blocks * 512; Can you reuse this block of code for RWFile::SizeOnDisk, using a decomposed static helper? The motivation here is understandability: it'd be good for readers to see that comment. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 1 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Wed, 12 Dec 2018 19:09:53 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2636: LBM supports deleting dead and full containers
helifu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12075 Change subject: KUDU-2636: LBM supports deleting dead and full containers .. KUDU-2636: LBM supports deleting dead and full containers LBM supports deleting the full container which is dead after hole punching. It can help to save startup time. It uses the file's physical size to determine whether the container is dead or live. Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/util/env.h M src/kudu/util/env_posix.cc M src/kudu/util/file_cache.cc 6 files changed, 170 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/12075/1 -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 1 Gerrit-Owner: helifu