David Ribeiro Alves has posted comments on this change. Change subject: log block manager: detect and repair unpunched holes ......................................................................
Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager-test-util.cc File src/kudu/fs/log_block_manager-test-util.cc: PS5, Line 156: initial_data_size does it make sense to add assertions on size? PS5, Line 460: int64_t* old_data_file_size maybe set a nullptr default arg so that you don't have to pass it if you don't care (twice in this file iirc) http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: Line 50: extra line, likely also clean the other one below http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS5, Line 79: 0.10 this is what's gating the heavy startup penalty, right? How easy is it to reach 10% wasted space? afraid that if it's too easy then we might as well not gate it at all PS5, Line 79: log_container_extra_space_heuristic_slop_fraction this flag name is not very informative. maybe: log_container_wasted_space_before_cleanup_fraction or somesuch PS5, Line 80: Additional fraction of a log container's calculated size that " : "must be consumed on disk before the container considered to be " : "inconsistent and subject to excess space cleanup. explain when/how the cleanup happens (at startup or with a tool, right?) PS5, Line 1833: actual_size you're basically saying in the comments above that this number is unstrustworthy, or did I get that wrong? maybe "reported_size" or "size_reported_by_os" would be a better name PS5, Line 1834: s = env_->GetFileSizeOnDisk(data_filename, &actual_size); : if (!s.ok()) { : *result_status = s.CloneAndPrepend(Substitute( : "Could not get on-disk file size of container $0", container->ToString())); : return; : } use RETURN_NOT_OK_PREPEND PS5, Line 1840: measured_size _this_ is the actual size, right? PS5, Line 2054: TODO(adar): can be more efficient (less fs work and more space reclamation : // in case of misaligned blocks) via hole coalescing first, but this is easy. file a jira? seems like we might have to revisit this if it turns out that hole puching is easily triggered and too heavy http://gerrit.cloudera.org:8080/#/c/6717/5/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: PS5, Line 116: certain which ones? can it be any operation? -- To view, visit http://gerrit.cloudera.org:8080/6717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I016ea401380f4c8c7b1fd907ff67cb595f377dd1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
