Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18569 )
Change subject: KUDU-3371 [fs] Use RocksDB to store LBM metadata ...................................................................... Patch Set 71: (74 comments) http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@12 PS71, Line 12: long time bootstrap consumption nit: long bootstrap times http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@24 PS71, Line 24: different nit: difference http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@25 PS71, Line 25: is nit: is that http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@27 PS71, Line 27: the nit: The http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@26 PS71, Line 26: a : native nit: in a native http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@27 PS71, Line 27: , nit: end the sentence here, it's long enough already http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@28 PS71, Line 28: clase class http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@29 PS71, Line 29: container nit: a container http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@35 PS71, Line 35: container nit: a container http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@38 PS71, Line 38: load containers when bootstrap nit: loading containers during the bootstrap phase http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@39 PS71, Line 39: container nit: a container http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@43 PS71, Line 43: container nit: a container http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@56 PS71, Line 56: use nit: uses http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@57 PS71, Line 57: , it is specified by flag : '--block_manager'. The new block manager is enabled by setting --block_manager=logr http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@66 PS71, Line 66: it : shows that reopen staged reduced upto 90% time cost. nit: it shows that the time spent to re-open tablet server's metadata when 99.99% of all the records removed reduced about 9.5 times when using LogBlockContainerRdbMeta instead of LogBlockContainerNativeMeta. http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/client/CMakeLists.txt File src/kudu/client/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/client/CMakeLists.txt@284 PS71, Line 284: rocksdb This looks a bit odd: how does it happen that Kudu C++ client tests need rocksdb as well? Is rocksdb library really needed here? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/CMakeLists.txt File src/kudu/fs/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/CMakeLists.txt@44 PS71, Line 44: kudu_test_util This looks wrong: a non-test libkudu_fs library is now dependent on libkudu_test_util test-only library. http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1205 PS71, Line 1205: larger nit: greater http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1243 PS71, Line 1243: to into http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1243 PS71, Line 1243: is not take effect on does not affect http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1243 PS71, Line 1243: so the metadata updating of "logr" : // block_manager works well updating the metadata stored in the logr-based block manager succeeds without any errors http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h File src/kudu/fs/dir_manager.h: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h@121 PS71, Line 121: perpare preparatory http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h@202 PS71, Line 202: public style nit: add a space before the 'public' label, similar to the other visibility labels in this file http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h@220 PS71, Line 220: private ditto for the 'private' label http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc File src/kudu/fs/dir_manager.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@207 PS71, Line 207: shared_ptr<rocksdb::Cache> RdbDir::s_block_cache_; Why do you need it here? Isn't it enough to have it initialized using std::once() in the code below? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@229 PS71, Line 229: The A http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@229 PS71, Line 229: will be is http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@229 PS71, Line 229: is not does not http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@233 PS71, Line 233: opts.error_if_exists Should at least this option be enabled to avoid overriding existing metadata due to a mishap? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@281 PS71, Line 281: wait flush jobs to finish is enough it's enough to wait for the flush jobs to finish http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@282 PS71, Line 282: may consume longer time which cause longer time to shut down server. nit: ... may take more time, which results in longer times to shut down a server. http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@295 PS71, Line 295: const Either remove 'const' or add 'const' to the result pointer. Otherwise, it's not a const-correct construct. http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_util.cc File src/kudu/fs/dir_util.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_util.cc@169 PS71, Line 169: dir_type_ == "log" || dir_type_ == "logr" This pattern is seen in multiple places. Does it make sense to introduce an utility function IsLogTypeDir() to check for this condition? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/file_block_manager.h File src/kudu/fs/file_block_manager.h: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/file_block_manager.h@74 PS71, Line 74: kName This should have been 'name'; 'kSomething' is for variables, not method/function names. http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/file_block_manager.h@74 PS71, Line 74: static Could this be 'constexpr' as well? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_manager-test.cc File src/kudu/fs/fs_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_manager-test.cc@224 PS71, Line 224: done nit: addressed http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_report.h@189 PS71, Line 189: std::string container; : std::string rocksdb_key;; Could these be 'const'? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_report.h@190 PS71, Line 190: ; nit: remove the extra semi-colon http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc File src/kudu/fs/log_block_manager-test-util.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@671 PS71, Line 671: slice_key nit: use rocksdb::Slice(key) in-place here as well? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@678 PS71, Line 678: DCHECK(dir); I'm not sure this makes sense -- it 'dir' was nullptr, it would crash one line above http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@697 PS71, Line 697: Kinds nit: Types http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@700 PS71, Line 700: auto r = rand_.Uniform(1); : DCHECK_EQ(r, 0); What is this for? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@734 PS71, Line 734: rocksdb::WriteOptions() nit: could this be replaced with {} ? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@734 PS71, Line 734: slice_key nit: use rocksdb::Slice(key) here as well as for the value? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@741 PS71, Line 741: slice_key nit: use rocksdb::Slice(key) in-place here? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc@1629 PS71, Line 1629: CHECK_EQ(FLAGS_block_manager, "logr"); nit: is this necessary given of the condition of the enclosing if() clause? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc@2612 PS71, Line 2612: CHECK_OK Is it possible to use ASSERT_OK() here as well? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc@2773 PS71, Line 2773: // TODO(yingchun): need to clear up dead metadata Is this still relevant? If yes, please add corresponding comment into the corresponding method of the LogBlockManagerRdbMeta implementation as well. http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@546 PS71, Line 546: kName style nit: this should be 'name'; 'kSomething' is for variables http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@546 PS71, Line 546: static const char* kName() { return "log"; } Could this be constexpr? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@592 PS71, Line 592: The metadata part of a container All the container's metadata http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@592 PS71, Line 592: to an RocksDB into a RocksDB http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@594 PS71, Line 594: will be are http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@594 PS71, Line 594: in of http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@595 PS71, Line 595: are removed are being removed http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@595 PS71, Line 595: block : // manager the block manager http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@596 PS71, Line 596: in RocksDB will be compacted ... in the RocksDB instance is compacted ... http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@599 PS71, Line 599: Compare Comparing http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@601 PS71, Line 601: to scan all records to scan through all the records http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@601 PS71, Line 601: may : // cause lower performance may adversely affect the performance http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@603 PS71, Line 603: a lot of options many configuration options http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@604 PS71, Line 604: flexibly. nit: drop this part http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@666 PS71, Line 666: static std::string ConstructRocksDBKey(const std::string& container_id, Could this be 'constexpr' as well? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc@123 PS71, Line 123: useful effective http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc@4033 PS71, Line 4033: string cid nit: use 'const auto&' to avoid copying? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc@4050 PS71, Line 4050: rocksdb::WriteOptions del_opt; Remove this variable and use {} as the first parameter for the Delete() call below? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc@4080 PS71, Line 4080: // The 'keys' is used to keep the lifetime of the data referenced by Slices in 'batch'. : vector<string> keys; Why to have this 'keys' array when in the while() cycle below only the current key is used further in the code? Isn't Delete() working in a synchronous way, i.e. once it returns, it doesn't refer anymore to the parameter it was given? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tools/kudu-tool-test.cc@2117 PS71, Line 2117: only logr block manager is not supported logr block manager is not yet supported http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tools/kudu-tool-test.cc@4574 PS71, Line 4574: if (FLAGS_block_manager == "logr") { : // Exclude the RocksDB data size. : uint64_t size_of_rdb; : ASSERT_OK(env_->GetFileSizeOnDiskRecursively(JoinPathSegments(data_dir, "rdb"), &size_of_rdb)); : size_before_delete -= size_of_rdb; : } nit: this block and another one at lines 4615 look very similar. Does it make sense to separate the code into a function and use it both places (and maybe in other future tests)? http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tserver/tablet_server-test.cc File src/kudu/tserver/tablet_server-test.cc: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tserver/tablet_server-test.cc@761 PS71, Line 761: to reset 'dd_manager' to release : // the last reference of dd_manager() to release the RocksDB LOCK file, otherwise ... to reset 'dd_manager', releasing the last 'dd_manager' reference. That's to allow for releasing the RocksDB LOCK file. Otherwise ... http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tserver/tablet_server-test.cc@763 PS71, Line 763: require acquire http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/util/CMakeLists.txt File src/kudu/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/util/CMakeLists.txt@a453 PS71, Line 453: Why to remove this? This doesn't look to be a rocksdb-related update. http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/util/CMakeLists.txt@658 PS71, Line 658: rocksdb) Just curious: why lz4 isn't needed here, but is needed for jwt-util-test below? http://gerrit.cloudera.org:8080/#/c/18569/71/thirdparty/build-definitions.sh File thirdparty/build-definitions.sh: http://gerrit.cloudera.org:8080/#/c/18569/71/thirdparty/build-definitions.sh@1192 PS71, Line 1192: -DWITH_LZ4=ON With current settings, does this use LZ4 from Kudu's thirdparty or it somehow searches and picks up whatever is available on a build node? Did you try to build this on a clean node where LZ4 library isn't available from anywhere but from the Kudu's thirdparty 'installed' directory? -- To view, visit http://gerrit.cloudera.org:8080/18569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f Gerrit-Change-Number: 18569 Gerrit-PatchSet: 71 Gerrit-Owner: Yingchun Lai <laiyingc...@apache.org> Gerrit-Reviewer: Abhishek Chennaka <achenn...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <ale...@apache.org> Gerrit-Reviewer: KeDeng <kdeng...@gmail.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com> Gerrit-Reviewer: Yingchun Lai <laiyingc...@apache.org> Gerrit-Reviewer: Yuqi Du <shenxingwuy...@gmail.com> Gerrit-Comment-Date: Thu, 18 Jan 2024 03:52:46 +0000 Gerrit-HasComments: Yes